Distributed and highly available key value stores : analysis and variations

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/91176

Title:	Distributed and highly available key value stores : analysis and variations
Authors:	Mangion, Andy (2011)
Keywords:	Computer storage devices; Virtual computer systems; Computer software -- Verification
Issue Date:	2011
Citation:	Mangion, A. (2011). Distributed and highly available key value stores: analysis and variations
Abstract:	The amount of data stored digitally by 2007 was estimated at 255 exabytes [42]; by early 2010 this figure had risen to 8 million petabytes, and it was expected to exceed 1.2 zettabytes before the end of 2010 [12]. This phenomenal growth has made traditional data stores unsuitable for applications dealing with large amounts of information. Consequently, there is a huge demand for storage systems that can handle larger amounts of data more efficiently and reliably. The low cost of high-speed networks and commodity hardware has motivated work on distributing traditional data structures to achieve better scalability and performance while maintaining consistency and reliability. Hash tables have been around for many years and are popular because of their efficiency in data retrieval and their wide applicability. Many studies have proposed strategies for implementing distributed hash tables. In this dissertation we analyse various designs, starting with the family of Scalable and Distributed Data Structures (SDDS), such as LH* [38], which was also implemented and tested. We also examine Cassandra [30], a highly available distributed storage system, and discuss variations to enhance its performance. One limitation of Cassandra is that its lookup overhead is linear in the number of nodes. We therefore adopt the approach used in Pastry [49], a peer-to-peer system, to give the cluster a hierarchical structure while still maintaining replication and consistency guarantees. In our evaluation, our modifications performed similarly to Cassandra in clusters with a small number of nodes. This is an important advantage over Cassandra because it shows that the added scalability is achieved with no significant drop in performance.
Description:	B.SC.(HONS)IT
URI:	https://www.um.edu.mt/library/oar/handle/123456789/91176
Appears in Collections:	Dissertations - FacICT - 2011

Files in This Item:

File	Description	Size	Format
B.SC.(HONS)ICT_Mangion_Andy_2011.PDF Restricted Access		4.87 MB	Adobe PDF
