Article
Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces
Publication date:
01/2013
Publisher:
Elsevier
Journal:
Journal of Discrete Algorithms
ISSN:
1570-8667
Language:
English
Resource type:
Published article
Abstract
Novelty-based diversification provides a way to tackle ambiguous queries by re-ranking a set of retrieved documents. Current approaches are typically greedy, requiring O(n^2) document–document comparisons in order to diversify a ranking of n documents. In this article, we introduce a new approach for novelty-based search result diversification to reduce the overhead incurred by document–document comparisons. To this end, we model novelty promotion as a similarity search in a metric space, exploiting the properties of this space to efficiently identify novel documents. We investigate three different approaches: pivoting-based, clustering-based, and permutation-based. In the first two, a novel document is one that lies outside the range of a pivot or outside a cluster. In the third, a novel document is one that has a different signature (i.e., the document's relative distance to a distinguished set of fixed objects called permutants) compared to previously selected documents. Thorough experiments using two TREC test collections for diversity evaluation, as well as a large sample of the query stream of a commercial search engine, show that our approaches perform at least as effectively as well-known novelty-based diversification approaches in the literature, while dramatically improving their efficiency.
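The permutation-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the toy 2-D document vectors, the function names `signature` and `diversify`, and the rule that a signature counts as "different" only when it is not exactly equal to one already seen are all assumptions made for illustration. Under these assumptions each document is compared against the m permutants only, so a ranking of n documents needs n × m distance computations instead of O(n^2) document–document comparisons.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def signature(doc, permutants, dist=euclidean):
    """Return the permutant indices ordered by increasing distance from doc."""
    return tuple(sorted(range(len(permutants)),
                        key=lambda i: dist(doc, permutants[i])))


def diversify(ranked_docs, permutants, k):
    """Scan the ranking once and keep a document only if its signature is new.

    Hypothetical novelty rule: two documents are treated as redundant when
    their signatures are identical.
    """
    seen = set()
    selected = []
    for idx, doc in enumerate(ranked_docs):
        sig = signature(doc, permutants)
        if sig not in seen:
            seen.add(sig)
            selected.append(idx)
            if len(selected) == k:
                break
    return selected


# Toy data: documents 0, 1 and 3 lie close together, 2 and 4 lie elsewhere.
docs = [(0.0, 0.1), (0.1, 0.0), (4.0, 5.0), (0.2, 0.2), (9.0, 0.0)]
permutants = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
result = diversify(docs, permutants, k=3)
print(result)  # [0, 2, 4]
```

In this toy run, documents 1 and 3 share the signature of document 0 and are skipped, so the selection covers three distinct regions of the space. A softer rule, such as a threshold on a distance between permutations, would be a natural variation, but whether the article uses one is not stated in the abstract.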
Keywords:
Similarity Search, Diversification
Collections
Articles (CCT - SAN LUIS)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - SAN LUIS
Citation
Gil Costa, Graciela Verónica; Santos, Rodrygo L. T.; Macdonald, Craig; Ounis, Iadh; Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces; Elsevier; Journal of Discrete Algorithms; 18; 1-2013; 75-88