Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/115258
Title: Autonomous drone control with reinforcement learning
Authors: Parnis, Kian
Keywords: Drone aircraft -- Automatic control
Algorithms
Reinforcement learning
Issue Date: 2023
Citation: Parnis, K. (2023). Autonomous drone control with reinforcement learning (Bachelor's dissertation).
Abstract: This project aims to develop a system for autonomous drone control, focusing on the problem of obstacle avoidance. The successful development of such a system is crucial for the safe and efficient deployment of drones across various industries, including search and rescue operations, package delivery, and infrastructure inspection. The specific problem this thesis addresses is developing a reinforcement learning (RL) based solution for unmanned aerial vehicles (UAVs) that enables drones to navigate safely through unmapped, cluttered environments containing both static and moving obstacles. To achieve this, AirSim was used to simulate drone physics, and the Unreal Engine was used to construct the simulated environment. As part of the RL approach, the project incorporated a depth sensor to capture environmental data about the drone's surroundings. This data was used as the state input of the reinforcement learning algorithm to learn about and make decisions in the surrounding environment. The agent was provided with various observations representing the state of the environment, the most significant being the depth imagery captured at every step by the drone's depth sensor. This imagery was processed by a Convolutional Neural Network (CNN), which extracted and learned relevant features from the images. In addition to the depth imagery, the agent also received its current velocity, its current distance from the goal, and a history of its previous actions. These additional inputs were passed through an Artificial Neural Network (ANN) before being flattened and combined with the processed imagery and fed to the agent. This framework was used to train four RL algorithms to navigate environments with static obstacles; the best two were then trained on environments with dynamic obstacles.
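The state construction described in the abstract (CNN-processed depth features combined with velocity, goal distance, and an action history) can be sketched roughly as follows. The dimensions, function name, and feature sizes are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def build_state(depth_features, velocity, goal_distance, action_history):
    """Concatenate processed depth-image features with the drone's scalar
    observations into one flat state vector (illustrative sketch only).

    depth_features : 1-D array, e.g. the CNN's output for the depth image
    velocity       : 1-D array (vx, vy, vz)
    goal_distance  : float, current distance to the goal
    action_history : 1-D array encoding the last k actions taken
    """
    return np.concatenate([
        np.ravel(depth_features),       # learned depth-image features
        np.ravel(velocity),             # current velocity
        [goal_distance],                # scalar distance to goal
        np.ravel(action_history),       # previous-action history
    ]).astype(np.float32)

# Example: 64 depth features, 3-D velocity, one distance, last 4 actions
state = build_state(np.zeros(64), np.array([1.0, 0.0, -0.5]), 12.3,
                    np.array([2, 2, 1, 3]))
print(state.shape)  # (72,)
```

In the dissertation's actual architecture the scalar inputs pass through an ANN before concatenation; here they are concatenated directly to keep the sketch minimal.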
The two discrete models were trained using the Deep Q-Network (DQN) and Double Deep Q-Network (DDQN) algorithms, while the two continuous models were trained using the Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO) algorithms. This ultimately resulted in successful policies that could avoid obstacles and reach their destination in complex environments. The best result was obtained with the Double Deep Q-Network algorithm, which reached its target goal 93% of the time in an environment with static obstacles and achieved an average success rate of 84.5% in an environment with dynamic obstacles.
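The difference between the DQN and Double DQN update targets mentioned above can be illustrated with a small sketch. The Q-values below are made-up toy numbers; the two arrays stand in for the online and target Q-networks' estimates for the next state.

```python
import numpy as np

def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: the target network both selects and
    evaluates the next action (max over its own estimates)."""
    return reward + gamma * (1.0 - done) * np.max(q_target_next)

def ddqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the action and the
    target network evaluates it, which reduces overestimation bias."""
    a_star = int(np.argmax(q_online_next))
    return reward + gamma * (1.0 - done) * q_target_next[a_star]

# Toy Q-value estimates for the next state (illustrative numbers)
q_online_next = np.array([0.2, 0.9, 0.5])   # online net would pick action 1
q_target_next = np.array([0.4, 0.3, 0.8])   # target net's own estimates

print(dqn_target(1.0, 0.99, q_target_next, done=0.0))             # evaluates max(q_target_next)
print(ddqn_target(1.0, 0.99, q_online_next, q_target_next, 0.0))  # evaluates q_target_next[1]
```

With these numbers DQN bootstraps from the target network's own maximum (0.8), while Double DQN bootstraps from the target network's value for the online network's chosen action (0.3), showing how DDQN tempers optimistic value estimates.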
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/115258
Appears in Collections: Dissertations - FacICT - 2023
Dissertations - FacICTAI - 2023

Files in This Item:
File: 2308ICTICT390900013342_1.PDF
Description: Restricted Access
Size: 5.2 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.