Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/100877
Title: Multi-view video-plus-depth coding using depth information
Authors: Micallef, Brian W. (2013)
Keywords: Geometry
3-D video (Three-dimensional imaging)
Video compression
Issue Date: 2013
Citation: Micallef, B. W. (2013). Multi-view video-plus-depth coding using depth information (Doctoral dissertation).
Abstract: Multi-view 3D videos are required to provide an enhanced immersive user experience, with the ability to perceive the 3D depth effect and arbitrarily select the desired navigation viewpoint. For an efficient representation, the texture multi-view video data also require the transmission of their aligned depth map multi-view counterpart to help reconstruct arbitrary intermediate virtual viewpoints. Encoding these components within these 3D videos requires the Multi-view Video Coding (MVC) standard, which is a very computationally intensive task. This process drastically increases the 3D video encoding durations compared to 2D video encoding, which hinders its use for transmission, such as for real-time or broadcast applications, which respectively need the Low-latency and the Hierarchical bi-Prediction MVC structures. Nevertheless, this new aligned depth map data supplies a lot of useful geometrical information about the scene, which, together with appropriate geometrical properties, can be exploited to achieve multi-view video coding within a shorter coding latency and with higher coding efficiencies. Experimental results demonstrate that this geometrical information can be efficiently utilised to calculate more accurate geometric predictors for the disparity and the motion estimations. As these are geometrically assisted, they become more accurate than the median ones adopted by the standard; thus, they allow a reduction in the estimations' search area and, as a result, in their encoding durations. Such aids obtain an overall MVC speed-up gain of up to 4.2 times when applied to the Low-latency MVC, and of 3.2 times when applied to the Hierarchical bi-Prediction MVC. Exploiting additional inter-view geometric relationships within the multi-view video reduces the disparity estimation's search range further, and improves its final speed-up gain to up to 20.8 times.
Moreover, the motion and the disparity estimations can also be modified to exploit this depth map data to limit the modes actually tested during rate-distortion optimisation. This is because this geometry helps to better identify equivalent positions in the encoded viewpoints, which are then exploited to determine the potential modes to use for motion estimation. Additionally, the macroblock's most likely best division for disparity estimation can also be determined and exploited. The former improves the MVC's speed-up gain to about 6 times, while the latter reduces the disparity estimation's time by 26.0 times. All of these fast techniques are then combined together to form a joint computational reduction in MVC, which improved its coding speed up to 6.7 times for low-latency applications, and up to 3.7 times for broadcast applications. Being more accurate, the geometric predictors allow for smaller residual vector encoding, providing around 8 % bit-rate reduction. Additionally, the SKIP mode can also be extended to select automatically the SKIP mode's compensation vector and direction, from either a temporal or a new viewpoint reference frame. The latter two techniques provide a joint bit-rate reduction of about 14 % for Low-latency MVC, and of about 13 % for Hierarchical bi-Prediction MVC, which is equivalent to an encoded inter-view quality improvement of about 0.6 dB. By using these two techniques together, a trade-off between a fast and an efficient encoding technique can be achieved. These improvements in coding time and efficiencies were obtained with a negligible to acceptably small degradation in the decoded video quality, for both the texture and the depth map multi-view videos, while the rendering capability and quality of the compressed 3D videos are almost sustained.
Hence, exploiting these improvements allows 3D video coding to adhere better to the stringent low-latency requirements needed for real-time and live broadcast scenarios, while making the encoded bit-stream more adequate for limited bandwidth channels.
Description: PH.D.
URI: https://www.um.edu.mt/library/oar/handle/123456789/100877
Appears in Collections: Dissertations - FacICT - 2013
Dissertations - FacICTCCE - 1999-2013

Files in This Item:
File: PH.D._Micallef Brian W._2013.pdf
Description: Restricted Access
Size: 39.67 MB
Format: Adobe PDF
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.