Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/91176
Title: Distributed and highly available key value stores : analysis and variations
Authors: Mangion, Andy (2011)
Keywords: Computer storage devices
Virtual computer systems
Computer software -- Verification
Issue Date: 2011
Citation: Mangion, A. (2011). Distributed and highly available key value stores: analysis and variations
Abstract: The amount of data stored digitally by 2007 was estimated at 255 exabytes [42]; by early 2010 this figure rose to 8 million petabytes and it was expected to go beyond 1.2 zettabytes before the end of 2010 [12]. This phenomenal growth in data has made traditional data stores unsuitable for applications dealing with large amounts of information. Consequently, there is huge demand for storage systems that can handle larger amounts of data more efficiently and reliably. The low cost of high-speed networks and commodity hardware motivated work on distributing traditional data structures to achieve better scalability and performance while maintaining consistency and reliability. Hash tables have been around for many years and are popular because of their efficiency in data retrieval and their wide applicability. Many studies have proposed strategies for implementing distributed hash tables. In this dissertation we analyse various designs, starting with the family of Scalable and Distributed Data Structures (SDDS) such as LH* [38], which we also implemented and tested. We also examine Cassandra [30], a highly available distributed storage system, and discuss variations to enhance its performance. One of the limitations of Cassandra is that its lookup overheads are linear in the number of nodes. We therefore take the approach used in Pastry [49], a peer-to-peer system, to give the cluster a hierarchical structure while still maintaining replication and consistency guarantees. In our evaluation, our modified system performed similarly to Cassandra in clusters with a small number of nodes. This is important because it shows that the added scalability is achieved with no significant drop in performance.
Description: B.SC.(HONS)IT
URI: https://www.um.edu.mt/library/oar/handle/123456789/91176
Appears in Collections: Dissertations - FacICT - 2011

Files in This Item:
File: B.SC.(HONS)ICT_Mangion_Andy_2011.PDF
Description: Restricted Access
Size: 4.87 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.