Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/91891
Full metadata record
DC Field | Value | Language
dc.date.accessioned | 2022-03-21T11:36:07Z | -
dc.date.available | 2022-03-21T11:36:07Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Mifsud, P. (2021). Churn prediction of telecommunications prepaid customers (Master’s dissertation). | en_GB
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/91891 | -
dc.description | M.Sc.(Melit.) | en_GB
dc.description.abstract | The telecom domain contains large volumes of data that can be used to implement machine-learning solutions for the benefit of the company. Customer churn occurs when a client stops subscribing to and using a company’s services, which can cause a loss in revenue. Telecom companies tend to have more churning customers than other industries due to high competition and the ease of changing service providers. This research studies churn prediction, using data exploration and extraction to build a churn prediction system that can produce good predictive results. The anonymised data was provided by a major telecoms company based in Malta and was compiled from different sources over the course of one year. It was handled and processed using distributed processing with Spark, which offered the ability to ingest large volumes of data and implement a feature engineering stage, thus creating additional informative features. Data cleansing stages were used to handle the inconsistencies and issues found in telecom data. Feature selection methods such as Pearson Correlation, Chi-Square test, and tree-based models were used to extract the most informative features. Sampling methods dealt with the high class imbalance inherent in churn prediction data, with Random Under Sampling producing the best results. Several models were implemented, and the best performing system was found by conducting a number of experiments. The use of temporal features with pre-processing stages and state-of-the-art models such as LightGBM and XGBoost outperformed the other models, with an AUC score of 0.9922 and AUC-PR of 0.9315 for the first model, and an AUC of 0.9894 and AUC-PR of 0.9219 for the second model. Distributed computing allowed the needed scalability and processing power to build this system and execute the planned experiments. Investment in good pre-processing stages and data-mining techniques provided a solution to deal with the prediction of high-risk customers, enabling the company to decrease the churn prediction rate by up to 2%. | en_GB
dc.language.iso | en | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Data mining -- Malta | en_GB
dc.subject | Telecommunication -- Malta | en_GB
dc.subject | Telecommunication -- Marketing | en_GB
dc.subject | Customer services -- Management | en_GB
dc.subject | Machine learning | en_GB
dc.subject | Big data -- Malta | en_GB
dc.title | Churn prediction of telecommunications prepaid customers | en_GB
dc.type | masterThesis | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.publisher.institution | University of Malta | en_GB
dc.publisher.department | Faculty of ICT. Department of Artificial Intelligence | en_GB
dc.description.reviewed | N/A | en_GB
dc.contributor.creator | Mifsud, Philip (2021) | -
Appears in Collections: Dissertations - FacICT - 2021
Dissertations - FacICTAI - 2021

Files in This Item:
File | Description | Size | Format
21MAIPT017.pdf (Restricted Access) | - | 2.3 MB | Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.