dc.contributor.author
Maisonnave, Mariano  
dc.contributor.author
Delbianco, Fernando Andrés  
dc.contributor.author
Tohmé, Fernando Abel  
dc.contributor.author
Maguitman, Ana Gabriela  
dc.date.available
2021-07-01T21:42:52Z  
dc.date.issued
2021-05  
dc.identifier.citation
Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Maguitman, Ana Gabriela; Assessing the behavior and performance of a supervised term-weighting technique for topic-based retrieval; Elsevier Science; Information Processing & Management; 58; 3; 5-2021; 1-17; 102483  
dc.identifier.issn
0306-4573  
dc.identifier.uri
http://hdl.handle.net/11336/135329  
dc.description.abstract
Topic-based retrieval is the task of seeking and retrieving material related to a topic of interest. This task involves two subtasks: selecting query terms and ranking the retrieved results. Supervised approaches to assessing the importance of a term in a topic or class have proven effective for guiding the query-term selection subtask. This article analyzes and evaluates FDD, a supervised term-weighting scheme that can be applied for query-term selection in topic-based retrieval. FDD weights terms based on two factors representing the descriptive and discriminating power of the terms with respect to the given topic. It then combines these two factors through an adjustable parameter that makes it possible to favor different aspects of retrieval, such as precision, recall, or a balance between the two. Previous preliminary studies have demonstrated the potential of FDD to identify useful query terms. However, those studies limited the analysis to a single domain represented by a single data set with binary categories and did not compare FDD to other recently formulated term-weighting techniques. The contributions of this article are the following: (1) it presents an extensive analysis of the behavior of FDD as a function of its adjustable parameter; (2) it compares FDD against eighteen traditional and state-of-the-art weighting schemes; (3) it evaluates the performance of disjunctive queries built by combining terms selected using the analyzed methods; (4) it makes a full data set and the full code publicly available to replicate the reported analysis and foster future research in the area. The analysis and evaluations are performed on three data sets: two well-known text data sets, namely 20 Newsgroups and Reuters-21578, and the newly released data set. 
It can be concluded that, despite its simplicity, FDD is competitive with state-of-the-art methods and has the important advantage of offering flexibility when adapting to specific task goals. The results also demonstrate that FDD offers a useful mechanism to explore different approaches to building complex queries.  
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Elsevier Science  
dc.rights
info:eu-repo/semantics/restrictedAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/  
dc.subject
TERM WEIGHTING  
dc.subject
VARIABLE EXTRACTION  
dc.subject
INFORMATION RETRIEVAL  
dc.subject
QUERY-TERM SELECTION  
dc.subject
TOPIC-BASED RETRIEVAL  
dc.subject.classification
Computer Science  
dc.subject.classification
Computer and Information Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
Assessing the behavior and performance of a supervised term-weighting technique for topic-based retrieval  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2021-06-10T19:26:08Z  
dc.journal.volume
58  
dc.journal.number
3  
dc.journal.pagination
1-17; 102483  
dc.journal.pais
Netherlands  
dc.journal.ciudad
Amsterdam  
dc.description.fil
Fil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación; Argentina  
dc.description.fil
Fil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina  
dc.description.fil
Fil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina  
dc.description.fil
Fil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación; Argentina  
dc.journal.title
Information Processing & Management  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/abs/pii/S0306457320309729  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1016/j.ipm.2020.102483  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://arxiv.org/abs/2007.06616