Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/100465
Title: Graphics processing unit implementation and optimisation of a flexible maximum a-posteriori decoder for synchronisation correction
Authors: Briffa, Johann A.
Keywords: Graphics processing units; Synchronization; Error-correcting codes (Information theory)
Issue Date: 2014
Publisher: IET
Citation: Briffa, J. A. (2014). Graphics processing unit implementation and optimisation of a flexible maximum a‐posteriori decoder for synchronisation correction. The Journal of Engineering, 2014(6), 284-296.
Abstract: The problem of correcting synchronisation errors has recently seen an increase in interest [1]. We believe this is because of two factors: recent applications for such codes, where traditional techniques for synchronisation cannot be applied, and the feasibility of decoding because of improvements in computing resources. One recent application is bit-patterned media [2, 3], where written-in errors can be modelled as synchronisation errors. Bit-patterned media is of great interest to the magnetic recording industry because of the potential increase in writing density. Another example is robust digital watermarking, where a message is embedded into a media file and an attacker seeks to make the message unreadable. An effective attack is to cause loss of synchronisation; synchronisation-correcting codes have been successfully applied to resist these attacks in speech [4] and image [5] watermarking. Most practical decoders for synchronisation correction work by extending the state space of the underlying code to account for the state of the channel (which represents the synchronisation error). This increases the decoding complexity significantly, particularly under poor channel conditions where the state space is necessarily larger. Although optimal decoding is achievable, the complexity involved remains a barrier to wider adoption. The problem is even more pronounced when these codes are part of an iteratively decoded construction. A key practical synchronisation-correcting scheme is the concatenated construction by Davey and MacKay [6], where the inner code tracks synchronisation on an unbounded random insertion and deletion channel. We presented a maximum a-posteriori (MAP) decoder for a generalised construction of the inner code in [7] and improved encodings in [8]. In [9], we presented a parallel implementation of our MAP decoder on a graphics processing unit (GPU) using NVIDIA’s Compute Unified Device Architecture (CUDA) [10]. This resulted in a decoding speedup of up to two orders of magnitude, depending on code parameters and channel conditions. Since that work, we have also presented a number of additional improvements to the MAP decoder algorithm [11], resulting in a speedup of over an order of magnitude in a serial implementation, as we shall show. Unfortunately, these algorithmic improvements change the proportion of time spent computing the various equations, so that a straightforward application of the algorithm improvements to our earlier GPU implementation does not yield the expected speedup. A more careful parallelisation strategy is required, which we discuss in this paper.
Appears in Collections:Scholarly Works - FacICTCCE


