Credit card fraud detection with oversampling

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/76771

Title:	Credit card fraud detection with oversampling
Authors:	Demicoli, Julian (2020)
Keywords:	Credit card fraud; Machine learning
Issue Date:	2020
Citation:	Demicoli, J. (2020). Credit card fraud detection with oversampling (Bachelor's dissertation).
Abstract:	Card-based payments are increasingly becoming the standard payment method for consumers. Between 2017 and 2018, global expenditure attributed to card payments grew by 17.7% to $40.582 trillion. Many industries, e-commerce among them, rely heavily on card-based payments as an efficient means of collecting money from customers. As the number of annual transactions made through the various types of payment cards increases, losses due to fraud are also on the rise, and are forecast to amount to $35.67 billion globally by 2023 [1]. Given the quantity of transactions that financial institutions process daily, and the substantial losses incurred due to fraud, these institutions must put in place fraud detection systems that are cost-effective, automated, and highly accurate, and that require minimal human intervention. This problem has not gone unnoticed by researchers, and strides have been made in early fraud detection systems that use machine learning techniques. However, the imbalanced distribution between fraudulent (minority) and non-fraudulent (majority) transactions is challenging for many traditional learning algorithms, which are ill-suited to handle such large class imbalances [2]. Models trained on these datasets tend to be biased towards the majority class, yet still achieve high accuracy scores because the minority class is only a small percentage of the dataset. In this final year project, a data-level approach is used to overcome class imbalance by incorporating oversampling techniques that use synthetic data. Within this domain, the Synthetic Minority Over-Sampling Technique (SMOTE) has been used extensively; however, SMOTE has spawned many variants which have not been examined as thoroughly.
This project evaluates other popular variants of the original technique in conjunction with machine learning algorithms that, according to the literature, perform well in this domain: XGBoost, Random Forest, and Gaussian Naive Bayes. Finally, the project highlights which algorithm is most suitable for card fraud detection.
Description:	B.Sc. IT (Hons)(Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/76771
Appears in Collections:	Dissertations - FacICT - 2020; Dissertations - FacICTCIS - 2020

Files in This Item:

File	Description	Size	Format
20BITCB003.pdf Restricted Access		3.02 MB	Adobe PDF
