Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/92814
Title: Fast PageRank approximation using CUDA
Authors: Ritchie, Shawn (2010)
Keywords: Iterative methods (Mathematics)
Search engines
Vector analysis
Issue Date: 2010
Citation: Ritchie, S. (2010). Fast PageRank approximation using CUDA (Bachelor's dissertation).
Abstract: The internet is one of the fastest growing technologies of this century, but the internet success couldn't be possible without the use of search engine which try and structure the anarchy found on the world wide web, making searches possible. In this thesis we will be discussing an individual module of a search engine, the PageRank algorithm which is an unbiased score given to webpages based on the web's hyperlink structure. Our main focus will be on speeding up the PageRank process using the Power Method, a renowned algorithm used for solving the Eigen vector problem. In order to speed up the computation involved in the Power method we will be making use of GPU computing on the Nvidia architecture using CUDA. A detailed view of the CUDA architecture will also be given in this dissertation report. The main problems tackled in this project are; - The mathematics behind the PageRank algorithm and a detailed explanation on the Power Method and the Random surfer model. - Sparse Matrix Representation - A sparse matrix representation was required in order to use CUDA resources efficiently the different method discussed where ELLPACK and TJDS. A hybrid format between these 2 method was formed. - Sparse Matrix Multiplication - methods, where developed, working on different architecture. - Single threaded PageRank implementation working on the CPU architecture. - A CUDA PageRank implementation working on a single GPU - A PageRank algorithm was implemented which works on a single GPU making use of the Sparse matrix multiplication. - BlockRank Algorithm - To further speedup the Eigen vector problem an aggregation method was used, developed by (T. H. Sepandar D. Kamvar n.d.) which makes use of multiple Nvidia devices in parallel. We will then give a detailed evaluation of the different implementation. We will first off starts discussing the different speedups achieved by tweaking the a-factor and the error threshold in the Power method. Once the optimal variable choices, have been found we will move on to comparing the different PageRank algorithms developed running on different data sources. This research should aid search engines apply the PageRank algorithm on a more frequent basis, giving more accurate results to their search queries. Another area where this research might be considered handy, is the development of Personalized PageRank algorithm which basically biases the PageRank vector to ones likes, but in order to do so, the Personalized PageRank algorithm needs to processed several times using different Personalization Vectors. Seeing that the PageRank algorithm is a computational intensive process speeding up this algorithm, might end up making it a viable ranking system in a search engines.
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/92814
Appears in Collections:Dissertations - FacICT - 2010
Dissertations - FacICTCS - 2010-2015

Files in This Item:
File Description SizeFormat 
B.SC.IT(HONS)_Ritchie_Shawn_2010.PDF
  Restricted Access
34.89 MBAdobe PDFView/Open Request a copy


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.