OptiGov Project

Public administrations operate within intricate and evolving regulatory ecosystems, which makes it difficult to ensure that their process executions and documents meet all applicable legal requirements. The OptiGov Project uses AI, specifically Process Mining and Large Language Models, to automatically check these processes against their corresponding legal rules and align them, supporting compliance and transparency.

Topics of the project

AI, Process Mining (PM), Automated Reasoning, Compliance Checking, Formal Methods

News

(12/11/2025) "IGFEJ promotes a presentation session of the OPTIGOV and INFRAGOV projects" - In Portuguese: https://igfej.justica.gov.pt/Noticias-do-IGFEJ/IGFEJ-promove-sessao-de-apresentacao-dos-projetos-OPTIGOV-E-INFRAGOV

(28/01/2026) The OptiGov team participated, by invitation, in the event "AI in the Public Sector: Review and Future", an initiative by APDC with NOVA University Lisbon - In Portuguese: https://www.apdc.pt/iniciativas/agenda-apdc/conferencia-ia-no-setor-publico-balanco-e-futuro

(29/01/2026) All source code and datasets for the LLM-based component (ARCCS) are available at the following link: https://github.com/geofila/ARCCS

(30/01/2026) A video of the first web demo of the LLM-based component (ARCCS) is now available online! Link Video

(31/01/2026) The technical report on the LLM-based component (ARCCS) is now available online! PDF

Project Description

Public administration processes are highly complex due to multiple constraints, frequent deviations from legislators’ “happy paths,” and extensive compliance requirements. This project leverages Data Science and Artificial Intelligence (AI) — notably Process Mining (PM) and Large Language Models (LLMs) — to manage and optimize such complexity. Process Mining integrates business process management and data science, using event logs to analyze performance and detect bottlenecks. LLMs, trained on vast text corpora, can interpret and structure unstructured documents, making them ideal for extracting knowledge from legal and technical texts.
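As a minimal illustration of how an event log can expose a bottleneck, the sketch below computes the mean waiting time between consecutive activities within each case and reports the slowest transition. The log, the activity names and the timestamps are invented for illustration and are not project data; real Process Mining tools offer far richer analyses.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, ISO timestamp).
EVENT_LOG = [
    ("c1", "submit", "2025-01-01T09:00"),
    ("c1", "review", "2025-01-01T10:00"),
    ("c1", "approve", "2025-01-03T10:00"),
    ("c2", "submit", "2025-01-02T09:00"),
    ("c2", "review", "2025-01-02T11:00"),
    ("c2", "approve", "2025-01-03T11:00"),
]


def mean_transition_hours(log):
    """Return the mean elapsed hours for each directly-follows activity pair."""
    by_case = defaultdict(list)
    for case, activity, timestamp in log:
        by_case[case].append((datetime.fromisoformat(timestamp), activity))

    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum of hours, count]
    for events in by_case.values():
        events.sort()  # order each case chronologically
        for (t1, a1), (t2, a2) in zip(events, events[1:]):
            entry = totals[(a1, a2)]
            entry[0] += (t2 - t1).total_seconds() / 3600
            entry[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


def bottleneck(log):
    """Return the activity pair with the largest mean waiting time."""
    means = mean_transition_hours(log)
    return max(means, key=means.get)


print(bottleneck(EVENT_LOG))
```

In this toy log the transition from "review" to "approve" dominates the waiting time, so it is the one reported.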

Public administration faces major compliance challenges when aligning process specification documents (e.g., enterprise architecture files) with legal guidelines often written in natural language. Traditional PM focuses mainly on process activities, neglecting data interactions. Recent multi-perspective PM research captures these aspects but still struggles with unstructured data.

This project proposes a general-purpose, log-based compliance method, which consists of aligning enriched specification documents with normative guidelines, inspired by symbolic techniques in data-aware declarative PM. LLMs are used to preprocess textual documents, converting them into structured symbolic representations, in order to enable symbolic alignment. The approach involves (i) defining a symbolic formalism for expressing process constraints, (ii) extracting and structuring legal and specification data using LLMs, and (iii) computing alignment scores that quantify compliance.
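Step (iii) can be pictured with a deliberately simplified sketch: a guideline is modelled as a list of named checks over a structured specification record, and the score is the fraction of checks that hold. The `Constraint` class, the rule names and the record fields are all hypothetical, and this ratio is only one possible scoring choice; the project's actual formalism and alignment computation are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Spec = Dict[str, object]  # structured record, e.g. produced by an extraction step


@dataclass(frozen=True)
class Constraint:
    name: str
    check: Callable[[Spec], bool]


# Hypothetical guideline: these rules are illustrative, not taken from any law.
GUIDELINE = [
    Constraint("has_budget", lambda s: "budget_eur" in s),
    Constraint("names_responsible_entity", lambda s: bool(s.get("responsible_entity"))),
    Constraint("uses_open_format", lambda s: s.get("data_format") in {"ODF", "PDF/A"}),
]


def alignment_score(spec: Spec, guideline: List[Constraint]) -> Tuple[float, List[str]]:
    """Return (fraction of satisfied constraints, names of violated constraints)."""
    if not guideline:
        return 1.0, []  # nothing to violate
    violated = [c.name for c in guideline if not c.check(spec)]
    return 1 - len(violated) / len(guideline), violated


# Hypothetical specification record with an empty responsible entity.
SPEC = {"budget_eur": 50000, "responsible_entity": "", "data_format": "ODF"}
score, violated = alignment_score(SPEC, GUIDELINE)
print(f"score={score:.2f}, violated={violated}")
```

Returning the violated constraint names alongside the score keeps the result explainable: a reader can see which rule lowered the score rather than only the number.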

A pilot on ICT governance in Portuguese public entities will assess the alignment between enterprise architecture specifications and national ICT guidelines (Decree-Law 107/2012). Carried out in collaboration with the public entities ARTE and IGFEJ, this initiative aims to enhance compliance, improve efficiency, and support better public service delivery.

General Goal: To align contractual specification documents of PA processes (e.g., documents defining technological requirements) with legal guidelines.

Methodology

The project is structured in 4 phases:

Symbolic Formalization of Guidelines and Specifications.
- design a symbolic schema for both contractual specification documents and guidelines.
- identify logical formalisms (temporal reasoning + conditions over data): fragments of First-Order Linear Temporal Logic (FO-LTL).
Information Extraction via LLMs and RAG.
- develop an LLM-based knowledge extraction pipeline that uses Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) prompting.
- transform diverse textual inputs into the structured symbolic schema.
- use RAG to dynamically access relevant context at inference time.
- use CoT (with uncertainty and factuality-checking techniques) to address semantic fidelity.
- integrate extracted information from event logs.
Symbolic Alignment via Formal Methods.
- develop an alignment algorithm that checks structured documents against regulatory schemas.
- use symbolic reasoning to compute a compliance score quantifying the degree of alignment.
- identify specific discrepancies and their types.
Validation and Evaluation.
- test and validate the methodology using real data from partner PA entities (AMA and IGFEJ).
- goal: (i) measure the precision and recall of LLM-based extraction methods; (ii) evaluate the interpretability and reliability of the alignment scores; (iii) test the system’s ability to handle complex, heterogeneous document types.
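To make the combination of temporal reasoning and conditions over data more concrete, the sketch below checks one data-aware "response" pattern on a finite trace: every event satisfying an activation condition must be followed, strictly later, by an event satisfying a target condition. This roughly corresponds to G(activation -> X F target) under finite-trace semantics. The function name, the trace, the activity names and the threshold are hypothetical, and the code is not the project's ACTL tool.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Trace = List[Event]


def response_violations(
    trace: Trace,
    activation: Callable[[Event], bool],
    target: Callable[[Event], bool],
) -> List[int]:
    """Return positions of activating events never followed by a target event."""
    violations = []
    for i, event in enumerate(trace):
        if activation(event) and not any(target(e) for e in trace[i + 1:]):
            violations.append(i)
    return violations


# Hypothetical trace: purchase requests above a threshold need a later approval.
TRACE = [
    {"activity": "purchase_request", "amount": 20000},
    {"activity": "approval"},
    {"activity": "purchase_request", "amount": 15000},
]


def is_large_request(event: Event) -> bool:
    # Data condition: only requests above 10000 activate the constraint.
    return event["activity"] == "purchase_request" and event["amount"] > 10000


def is_approval(event: Event) -> bool:
    return event["activity"] == "approval"


print(response_violations(TRACE, is_large_request, is_approval))
```

Here the request at position 0 is followed by an approval, while the one at position 2 is not, so only position 2 is reported. Returning positions rather than a single boolean is one way to point at specific discrepancies instead of only flagging that a violation exists.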

Results

Beyond the publications listed below, we report here the links to the two main technical reports of the project:

Report1 (PDF): this technical report presents ARCCS, the LLM-based component of the OptiGov tool chain. All source code and datasets are available at: https://github.com/geofila/ARCCS
Report2 (PDF): this technical report presents the ACTL framework and tool, applied to a Portuguese public administration process. The source code is available at: https://github.com/AlessandroGianola/optigov-actl-check

The project also resulted in the following 5 MSc theses and 1 PIC2 document:

José João Alves dos Santos Ferreira. Specifying and Testing Distributed Protocols with Action Temporal Logic. Master's Thesis, Instituto Superior Técnico. November 2025 [MSc1]
Lucas Fortunato Das Neves. Learning to Disambiguate Queries with Respect to Business Processes. Master's Thesis, Instituto Superior Técnico. November 2025 [MSc2]
Rodolfo Jorge Antunes Amorim. TokenCP – Non-Exchangeable Conformal Prediction for Uncertainty Quantification. Master's Thesis, Instituto Superior Técnico. January 2026 [MSc3]
David Afonso Prazeres da Cruz Valente. Enhancing Large Language Models with Dynamic Thinking Tokens. Master's Thesis, Instituto Superior Técnico. November 2025 [MSc4]
Andreia Silva Azevedo. Applying the Business Process Management Lifecycle to Public Sector Digital Platforms: A Case Study approach in the Portuguese public sector. Master's Thesis, Instituto Superior Técnico. October 2025 [MSc5]
José Menezes. Process-Aware Retrieval-Augmented Generation for Auditable Compliance Reasoning. PIC2 document, Instituto Superior Técnico. January 2026 [MSc6]

The following video presents a demo of the ARCCS web interface:

People

Alessandro Gianola

Principal Investigator - Local Coordinator at INESC-ID. Professor Auxiliar, INESC-ID/Instituto Superior Técnico, Universidade de Lisboa

email · Google Scholar · Website

Chrysoula Zerva

Local Coordinator at Instituto de Telecomunicações. Professor Auxiliar, Instituto de Telecomunicações/Instituto Superior Técnico, Universidade de Lisboa

email · Google Scholar · Website

INESC-ID

Faculty members

André Vasconcelos

Professor Associado, INESC-ID/Instituto Superior Técnico, Universidade de Lisboa

email · Website

José Fragoso Santos

Professor Auxiliar, INESC-ID/Instituto Superior Técnico, Universidade de Lisboa

email · Website

João F. Ferreira

Professor Associado, INESC-ID/FEUP, Universidade do Porto

email · Website

José Borbinha

Professor Catedrático, INESC-ID/Instituto Superior Técnico, Universidade de Lisboa

email · Website