Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/39499
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.date.accessioned | 2019-02-05T10:09:17Z | - |
dc.date.available | 2019-02-05T10:09:17Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Bugeja, M. (2018). Automatic analysis of handwritten text (Master's dissertation). | en_GB |
dc.identifier.uri | https://www.um.edu.mt/library/oar//handle/123456789/39499 | - |
dc.description | M.SC.ARTIFICIAL INTELLIGENCE | en_GB |
dc.description.abstract | Analysing handwritten documents is a challenging task. General solutions are not always possible in this area, because most handwritten datasets have unique characteristics that reflect how the document was written, including differing handwriting styles. These are mostly attributed to multiple scribes contributing to the transcription of the text and to degradation of the script. In this study, a unique dataset is presented which up to now has never been read or analysed. The aim is to develop an adaptive system that tackles two challenges on this dataset: document text segmentation and conversion to machine-readable text. The study goes through the process of converting a document image into a set of segmented components at the lowest level needed to transform the document into ASCII characters. The novel approach used in this dissertation is able to convert the document image without any prior knowledge of the text. In fact, the training set used in this study is a synthetic dataset built on the Google Fonts database. The approach segments the document into lines, words and finally characters using a number of approaches adapted from the literature. Notably, the line segmentation and the character segmentation yielded positive results, with the line segmentation achieving an overall segmentation accuracy of 92.81%. The final text recognition process is built on machine learning models and deep neural networks using a multilayer perceptron architecture. A unique training set was created to try to classify handwritten text without using a subset of manually labelled characters from the final testing dataset. | en_GB |
dc.language.iso | en | en_GB |
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB |
dc.subject | ASCII (Character set) | en_GB |
dc.subject | Perceptrons | en_GB |
dc.subject | Neural networks (Computer science) | en_GB |
dc.title | Automatic analysis of handwritten text | en_GB |
dc.type | masterThesis | en_GB |
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB |
dc.publisher.institution | University of Malta | en_GB |
dc.publisher.department | Faculty of Information and Communication Technology. Department of Artificial Intelligence | en_GB |
dc.description.reviewed | N/A | en_GB |
dc.contributor.creator | Bugeja, Mark | - |
Appears in Collections: | Dissertations - FacICT - 2018; Dissertations - FacICTAI - 2018 |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
18MAIPT04.pdf (Restricted Access) | - | 3.51 MB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.