The myth of reproducibility : a review of event tracking evaluations on Twitter

Mamo, Nicholas; Azzopardi, Joel; Layfield, Colin

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/113505

Full metadata record

DC Field	Value	Language
dc.contributor.author	Mamo, Nicholas	-
dc.contributor.author	Azzopardi, Joel	-
dc.contributor.author	Layfield, Colin	-
dc.date.accessioned	2023-10-04T13:11:12Z	-
dc.date.available	2023-10-04T13:11:12Z	-
dc.date.issued	2023-04-05	-
dc.identifier.citation	Mamo, N., Azzopardi, J. & Layfield, C. (2023). The myth of reproducibility: A review of event tracking evaluations on Twitter. Frontiers in Big Data, 6, 1067335.	en_GB
dc.identifier.uri	https://www.um.edu.mt/library/oar/handle/123456789/113505	-
dc.description.abstract	Event tracking literature based on Twitter does not have a state-of-the-art. What it does have is a plethora of manual evaluation methodologies and inventive automatic alternatives: incomparable and irreproducible studies incongruous with the idea of a state-of-the-art. Many researchers blame Twitter's data sharing policy for the lack of common datasets and a universal ground truth–for the lack of reproducibility–but many other issues stem from the conscious decisions of those same researchers. In this paper, we present the most comprehensive review yet on event tracking literature's evaluations on Twitter. We explore the challenges of manual experiments, the insufficiencies of automatic analyses and the misguided notions on reproducibility. Crucially, we discredit the widely-held belief that reusing tweet datasets could induce reproducibility. We reveal how tweet datasets self-sanitize over time; how spam and noise become unavailable at much higher rates than legitimate content, rendering downloaded datasets incomparable with the original. Nevertheless, we argue that Twitter's policy can be a hindrance without being an insurmountable barrier, and propose how the research community can make its evaluations more reproducible. A state-of-the-art remains attainable for event tracking research.	en_GB
dc.language.iso	en	en_GB
dc.publisher	Frontiers Research Foundation	en_GB
dc.rights	info:eu-repo/semantics/openAccess	en_GB
dc.subject	Artificial intelligence	en_GB
dc.subject	Twitterbots	en_GB
dc.subject	Twitter	en_GB
dc.subject	Machine learning	en_GB
dc.subject	Data collection platforms	en_GB
dc.subject	Information retrieval	en_GB
dc.title	The myth of reproducibility : a review of event tracking evaluations on Twitter	en_GB
dc.type	article	en_GB
dc.rights.holder	The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder.	en_GB
dc.description.reviewed	peer-reviewed	en_GB
dc.identifier.doi	10.3389/fdata.2023.1067335	-
dc.publication.title	Frontiers in Big Data	en_GB
Appears in Collections:	Scholarly Works - FacICTAI

Files in This Item:

File	Description	Size	Format
Frontiers_Mamo2023.pdf		360.94 kB	Adobe PDF
