Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/91859
Title: Video inpainting using GANs
Authors: Mallia, Daniel (2021)
Keywords: Image reconstruction
Neural networks (Computer science)
Inpainting
Machine learning
Issue Date: 2021
Citation: Mallia, D. (2021). Video inpainting using GANs (Master’s dissertation).
Abstract: Video editing encompasses the manipulation of video frames, and is essential in film making for adjusting features and manipulating objects in a scene. Inpainting, a branch of video editing, is the process of filling in missing regions in a video. The task is non-trivial for videos because spatio-temporal coherence must be maintained across frames as new information is added: spatial coherence means the inpainted region is consistent with the rest of its frame, while temporal coherence means each inpainted frame is congruous with the preceding and following inpainted frames. This dissertation proposes a solution for video inpainting utilising Generative Adversarial Networks (GANs), splitting the task into image inpainting and optical flow generation. The image inpainting component, a GAN trained on the ImageNet dataset, can be used on its own to inpaint single images, or applied to masked video frames to obtain inpainted frame results. The generated optical flow is then merged with these inpainted frames: pixels around, as well as inside, the inpainted regions of the previous and next frames are warped with the optical flow, which determines whether they contribute to the inpainted region of the current frame. Any part of the region left unfilled by this step is filled from the previously inpainted frame. The image inpainting and video inpainting results were evaluated using quantitative metrics, including saliency- and non-saliency-based metrics, as well as a user study for qualitative assessment. The average PSNR obtained for the object removal video inpainting results is 71.06 dB, while the average SSIM is 0.9996; both indicate low noise and high structural similarity.
Additionally, the user study results were slightly above average. Based on these evaluations, the results are promising, and future work could further improve performance.
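The temporal propagation step described above, warping pixels from neighbouring frames into the inpainted region, and the reported PSNR metric can be illustrated with a short sketch. This is a minimal illustration, not the dissertation's actual implementation: the function names, the nearest-neighbour sampling, and the backward-warping convention are assumptions for the sake of a self-contained example.

```python
import numpy as np

def warp_with_flow(frame, flow):
    """Backward-warp a frame with a dense optical flow field.

    frame: (H, W) or (H, W, C) array; flow: (H, W, 2) array of (dx, dy)
    displacements. Each output pixel (y, x) samples the frame at
    (y + dy, x + dx), rounded to the nearest pixel; the dissertation's
    exact interpolation scheme is not specified here.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def propagate_into_hole(inpainted, mask, prev_frame, flow_to_prev):
    """Fill masked pixels by warping a neighbouring frame along the flow;
    in the full algorithm, pixels the warp cannot fill fall back to the
    previously inpainted frame or the GAN output."""
    out = inpainted.copy()
    warped = warp_with_flow(prev_frame, flow_to_prev)
    out[mask] = warped[mask]
    return out

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

For example, with a flow field of (dx, dy) = (1, 0), each hole pixel is filled from the pixel one column to its right in the neighbouring frame; identical images give an infinite PSNR, and larger errors drive it down towards 0 dB.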
Description: M.Sc.(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/91859
Appears in Collections:Dissertations - FacICT - 2021
Dissertations - FacICTAI - 2021

Files in This Item:
File: 21MAIPT015.pdf (Restricted Access)
Size: 8 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.