Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/115257
Title: Combining visual and temporal information in graph neural networks to solve visual SLAM
Authors: Suban, Celine (2023)
Keywords: Neural networks (Computer science)
Mappings (Mathematics)
Deep learning (Machine learning)
Mobile robots
Issue Date: 2023
Citation: Suban, C. (2023). Combining visual and temporal information in graph neural networks to solve visual SLAM (Bachelor's dissertation).
Abstract: Visual Place Recognition (VPR) is the problem of a mobile robot using input images to recognise whether its current location has been visited previously. This is a crucial aspect of visual Simultaneous Localization And Mapping (vSLAM), as it is needed to create an accurate, globally consistent map of the environment the robot is navigating. Historically, VPR was done with hand-crafted features, but these were not robust against changing environmental conditions, which is one of the main challenges of VPR. More recently, Convolutional Neural Networks (CNNs) and other Deep Learning methods have shown better results and increased robustness. Another trend in vSLAM and VPR is to fuse other sources of information with the visual information obtained from the image. In this dissertation, temporal information is combined with the images' Regional Maximum Activation of Convolutions (R-MAC) features, obtained from a pre-trained CNN, in order to construct a graph. This graph is then used as input to various Graph Neural Networks (GNNs), such as the Graph Convolutional Network (GCN) and the Graph Attention Network (GAT), in order to determine whether a pair of images represents the same location despite appearance and/or viewpoint changes. The implemented GNNs are trained and tested on popular publicly available datasets, namely the New College, City Centre and Gardens Point datasets. Experiments are then conducted on the trained GNNs in order to better evaluate their performance. These include changing the number of layers in the GNN and changing how many images are passed in sequence to the GNN, which controls how much temporal information is added.
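The pipeline outlined in the abstract (per-image descriptors, a graph built from temporal order, a GNN over that graph, then a pairwise comparison) can be illustrated with a minimal NumPy sketch. This is not the dissertation's implementation: the random vectors standing in for R-MAC descriptors, the rule that links frames within `window` steps of each other, the single untrained GCN layer, and the cosine-similarity comparison are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def build_temporal_graph(num_frames, window=1):
    """Adjacency matrix linking each frame to frames within `window` steps.

    Assumption: temporal information is encoded purely as edges between
    frames that are close together in the image sequence.
    """
    adj = np.zeros((num_frames, num_frames))
    for i in range(num_frames):
        lo = max(0, i - window)
        hi = min(num_frames, i + window + 1)
        for j in range(lo, hi):
            if i != j:
                adj[i, j] = 1.0
    return adj


def gcn_layer(adj, x, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)


def cosine_similarity(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


rng = np.random.default_rng(0)
features = rng.normal(size=(5, 8))   # stand-in for R-MAC descriptors of 5 frames
adj = build_temporal_graph(features.shape[0], window=1)
weight = rng.normal(size=(8, 4))     # untrained weights, illustration only
embeddings = gcn_layer(adj, features, weight)

# Hypothetical decision rule: compare the embeddings of two frames.
score = cosine_similarity(embeddings[0], embeddings[4])
print(adj)
print(embeddings.shape, score)
```

In this sketch, increasing `window` or the number of frames mixes more of the sequence into each node, and stacking further calls to `gcn_layer` corresponds to adding GNN layers; these loosely mirror the two experimental variables named in the abstract.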
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/115257
Appears in Collections: Dissertations - FacICT - 2023
Dissertations - FacICTAI - 2023

Files in This Item:
File: 2308ICTICT390900013405_1.PDF (Restricted Access)
Size: 2.44 MB
Format: Adobe PDF

