Experiments with document retrieval from small text collections using latent semantic analysis or term similarity with query coordination and automatic relevance feedback

Layfield, Colin; Azzopardi, Joel; Staff, Chris

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/103266

Full metadata record

DC Field	Value	Language
dc.contributor.author	Layfield, Colin	-
dc.contributor.author	Azzopardi, Joel	-
dc.contributor.author	Staff, Chris	-
dc.date.accessioned	2022-10-31T16:42:26Z	-
dc.date.available	2022-10-31T16:42:26Z	-
dc.date.issued	2017	-
dc.identifier.citation	Layfield, C., Azzopardi, J., & Staff, C. (2017). Experiments with document retrieval from small text collections using latent semantic analysis or term similarity with query coordination and automatic relevance feedback. In A. Calì, D. Gorgan, & M. Ugarte (Eds.), Semantic Keyword-Based Search on Structured Data Sources. IKC 2016. Lecture Notes in Computer Science, vol 10151. (pp. 25-36). Cham: Springer.	en_GB
dc.identifier.uri	https://www.um.edu.mt/library/oar/handle/123456789/103266	-
dc.description.abstract	Users face the Vocabulary Gap problem when attempting to retrieve relevant textual documents from small databases, especially when there are only a small number of relevant documents, as it is likely that different terms are used in queries and relevant documents to describe the same concept. To enable comparison of results of different approaches to semantic search in small textual databases, the PIKES team constructed an annotated test collection and Gold Standard comprising 35 search queries and 331 articles. We present two different possible solutions. In one, we index an unannotated version of the PIKES collection using Latent Semantic Analysis (LSA) retrieving relevant documents using a combination of query coordination and automatic relevance feedback. Although we outperform prior work, this approach is dependent on the underlying collection, and is not necessarily scalable. In the second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). Queries are automatically expanded with related terms from the TSM and are submitted to a term-by-document matrix Vector Space Model of the PIKES collection. Coupled with a combination of query coordination and automatic relevance feedback we also outperform prior work with this approach. The advantage of the second approach is that it is independent of the underlying document collection.	en_GB
dc.language.iso	en	en_GB
dc.publisher	Springer International Publishing AG	en_GB
dc.rights	info:eu-repo/semantics/restrictedAccess	en_GB
dc.subject	Log-linear models -- Computer programs	en_GB
dc.subject	Semantics	en_GB
dc.subject	Information retrieval	en_GB
dc.subject	Latent semantic indexing	en_GB
dc.title	Experiments with document retrieval from small text collections using latent semantic analysis or term similarity with query coordination and automatic relevance feedback	en_GB
dc.title.alternative	Semantic keyword-based search on structured data sources. IKC 2016. Lecture notes in computer science	en_GB
dc.type	bookPart	en_GB
dc.rights.holder	The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder.	en_GB
dc.description.reviewed	peer-reviewed	en_GB
dc.identifier.doi	10.1007/978-3-319-53640-8_3	-
Appears in Collections:	Scholarly Works - FacICTAI

Files in This Item:

File	Description	Size	Format
Experiments_with_document_retrieval_from_small_text_collections_using_latent_semantic_analysis_or_term_similarity_with_query_coordination_and_automatic_relevance_feedback_2017.pdf Restricted Access		194.12 kB	Adobe PDF