Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/108245
| Title: | Exploring the automatic detection of bias and stance in media houses on controversial topics |
| Authors: | Balzan, Janice (2022) |
| Keywords: | News Web sites; Mass media -- Objectivity; Natural language processing (Computer science) |
| Issue Date: | 2022 |
| Citation: | Balzan, J. (2022). Exploring the automatic detection of bias and stance in media houses on controversial topics (Master's dissertation). |
| Abstract: | Bias detection is becoming increasingly important in Natural Language Processing. It plays a major role in improving the performance of systems such as recommendation systems and language generation systems. Moreover, detecting bias in text that will be made available to the public, and finding effective ways to remove it, could help reduce the bias present in society. Stance, on the other hand, is better understood as an opinion towards a particular topic. It is widely believed that stance should be exposed so that readers are informed that the information conveyed in the text might be skewed towards one side of a discussion. The reader might then decide to analyse the stance of other articles, with the aim of finding one with either the opposite stance or a neutral one. This would enable them to get a more holistic view of the topic under discussion. The first aim of this study is to produce a dataset for analysing bias and stance in news articles about controversial topics. Secondly, this study proposes two methods which could effectively expose bias within long text, in this case the news articles collected. These methods are Topic Neighbours Extraction and Masked Word Prediction. A qualitative analysis of the words output by these methods yielded a number of interesting outcomes, which indicate that bias was present in the pre-trained models and, in turn, in the text used to pre-train them. Additionally, a further method is proposed that uses a dataset of news articles about a particular topic, together with a readily available dataset of tweets about the same topic labelled with their stance, to build a model that classifies the stance of news articles. The highest performance scores achieved for this implementation were an accuracy of 0.48 and a macro F1-score of 0.26. This was done using a novel method that assigns a stance to each sentence in the article using pre-trained and fine-tuned models and then applies an aggregation method, called the Weighted Summation Method, to obtain the overall stance of the article. With the creation of this dataset and the implementation of these methods, more information is gathered about these linguistic features and their possible detection and mitigation. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/108245 |
| Appears in Collections: | Dissertations - FacICT - 2022; Dissertations - FacICTAI - 2022 |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| 2219ICTICS520000007202_1.PDF | Restricted Access | 2.69 MB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
