DFA learning using SAT solvers

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/107909

Title:	DFA learning using SAT solvers
Authors:	Formosa, Logan (2022)
Keywords:	Sequential machine theory; Heuristic algorithms; Algebra, Boolean
Issue Date:	2022
Citation:	Formosa, L. (2022). DFA learning using SAT solvers (Bachelor's dissertation).
Abstract:	Regular inference is the task of inferring a deterministic finite-state automaton (DFA) from a training set of positive and negative strings which, respectively, belong and do not belong to a regular language. The task is usually formulated as finding the minimum-state DFA that is consistent with the training data; this problem is known to be NP-complete and is one of the more heavily studied areas in the broader field of grammatical inference [1]. One of the most successful approaches is the family of so-called state-merging algorithms, in which a highly specific hypothesis called a prefix tree acceptor (PTA) is created from the training data, and pairs of states are iteratively selected and merged to compact and generalise the hypothesis. Another interesting approach reduces this problem to Boolean satisfiability (SAT): DFA learning is first translated to graph colouring and then to SAT, allowing a SAT solver to infer a hypothesis from the training set. The number of clauses generated for large problems proves to be too much for certain SAT solvers to handle, so the augmented prefix tree acceptor (APTA) is first pre-processed using an existing state-merging algorithm such as EDSM to obtain a partially identified DFA and reduce the size of the problem. In this final-year project (FYP), we study the DFASAT algorithm and perform a comparative analysis with current state-of-the-art state-merging algorithms such as EDSM, windowed-EDSM, and blue-fringe. Different experiments with 512 problem instances of Abbadingo- and StaMinA-style DFAs were set up. Results show that DFASAT outperforms the other algorithms for 16-state problems with a binary alphabet and can infer the target DFA at a higher rate. DFASAT is also able to infer multiple DFAs from the same clauses through different truth assignments. It was also found to be very reliant on the pre-processing performed. Two new approaches for possible improvement are proposed. The first makes use of DFASAT’s ability to find multiple non-isomorphic DFAs and combines them into a single ensemble that accepts and rejects strings under a voting scheme. The second identifies other algorithms that have been shown to work very well for sparse training data, with the aim of producing high-quality initial merges. These can also be combined with the first approach.
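The pipeline described in the abstract can be illustrated with a small sketch. The Python below is not taken from the dissertation. It is a minimal illustration under stated assumptions: states are integers with 0 as the start state, a DFA is a hypothetical pair (transitions, accepting_states), a missing transition rejects, and ties in the vote reject. It builds an APTA from labelled strings, derives the conflict edges that a graph-colouring formulation would need to respect (two APTA states that cannot be merged must receive different colours), and applies a simple majority vote over several DFAs. The actual SAT encoding and the EDSM pre-processing are omitted, and all function names are illustrative.

```python
from itertools import combinations


def build_apta(positive, negative):
    """Build an augmented prefix tree acceptor (APTA) from labelled strings.

    Returns (transitions, labels, n_states). State 0 is the root, labels maps
    a state to True (accepting) or False (rejecting); other states are unlabelled.
    """
    transitions, labels, n_states = {}, {}, 1
    for strings, label in ((positive, True), (negative, False)):
        for s in strings:
            state = 0
            for symbol in s:
                if (state, symbol) not in transitions:
                    transitions[(state, symbol)] = n_states
                    n_states += 1
                state = transitions[(state, symbol)]
            if labels.setdefault(state, label) != label:
                raise ValueError(f"{s!r} is both positive and negative")
    return transitions, labels, n_states


def conflicts(transitions, labels, a, b):
    """True if merging APTA states a and b would force an accepting
    state and a rejecting state into the same merged state."""
    if a in labels and b in labels and labels[a] != labels[b]:
        return True
    for (state, symbol), child in transitions.items():
        if state == a and (b, symbol) in transitions:
            if conflicts(transitions, labels, child, transitions[(b, symbol)]):
                return True
    return False


def conflict_edges(transitions, labels, n_states):
    """Edges of the graph to colour: conflicting states need different colours."""
    return [(a, b) for a, b in combinations(range(n_states), 2)
            if conflicts(transitions, labels, a, b)]


def dfa_accepts(dfa, s):
    """Run s through a DFA given as (transitions, accepting_states) (assumed format)."""
    transitions, accepting = dfa
    state = 0
    for symbol in s:
        if (state, symbol) not in transitions:
            return False  # assumption: a missing transition rejects
        state = transitions[(state, symbol)]
    return state in accepting


def majority_accepts(dfas, s):
    """Accept s if a strict majority of the DFAs accept it (assumed voting rule)."""
    votes = sum(dfa_accepts(dfa, s) for dfa in dfas)
    return 2 * votes > len(dfas)
```

In this sketch, a proper colouring of the conflict graph with k colours is only a necessary condition for a k-state consistent DFA; a full encoding would also have to enforce deterministic transitions between colours.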
Description:	B.Sc. IT (Hons)(Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/107909
Appears in Collections:	Dissertations - FacICT - 2022; Dissertations - FacICTAI - 2022

Files in This Item:

File	Description	Size	Format
2208ICTICT390905069315_1.PDF Restricted Access		1.14 MB	Adobe PDF
