Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/107909
Title: | DFA learning using SAT solvers |
Authors: | Formosa, Logan (2022) |
Keywords: | Sequential machine theory Heuristic algorithms Algebra, Boolean |
Issue Date: | 2022 |
Citation: | Formosa, L. (2022). DFA learning using SAT solvers (Bachelor's dissertation). |
Abstract: | Regular inference is the task of inferring a deterministic finite-state automaton (DFA) from a training set of positive and negative strings which, respectively, belong and do not belong to a regular language. Additionally, the regular inference task is usually formulated as finding the minimumstate DFA that is consistent with the training data; this problem is known to be NP-complete and is one of the more heavily studied areas in the broader field of grammatical inference [1]. One of the most successful approaches are so called state merging algorithms where a highly specific hypothesis called a prefix tree acceptor (PTA) is created from the training data and pairs of states are iteratively selected and merged to compact and generalise the hypothesis. Another interesting approach reduces this problem to Boolean satisfiability (SAT). DFA learning is first translated to graph colouring and then to SAT, allowing a SAT solver to infer a hypothesis from a training set of positive and negative strings. The number of clauses written for large problems prove to be too much for certain SAT solvers to handle, and therefore the APTA is first pre-processed using a pre-existing state merging algorithm such as EDSM to first obtain a partially identified DFA and reduce the size of the problem. In this FYP, we study the DFASAT algorithm and perform a comparative analysis with current state-of-the-art state-merging algorithms such as EDSM, windowed-EDSM, and blue-fringe. Different experiments of 512 problem instances of Abbadingo and StaMinA-style DFAs were set up. Results show that DFASAT outperforms other algorithms for 16-state problems with a binary alphabet and can infer the target DFA at a higher rate. DFASAT is also able to infer multiple DFAs with the same clauses through different truth assignments. It was also found to be very reliant on the pre-processing performed. We propose two new approaches for possible improvement were proposed. The first approach makes use of DFASAT’s ability to find multiple non-isomorphic DFAs and combine them into a single ensemble that accepts and rejects strings under a voting scheme. The second proposed approach identifies other algorithms which have been shown to work very well for sparse training data, and aim at produce high quality initial merges. These can also be extended with the previous approach. |
Description: | B.Sc. IT (Hons)(Melit.) |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/107909 |
Appears in Collections: | Dissertations - FacICT - 2022 Dissertations - FacICTAI - 2022 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2208ICTICT390905069315_1.PDF Restricted Access | 1.14 MB | Adobe PDF | View/Open Request a copy |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.