dc.contributor.author
Arroyuelo, Diego  
dc.contributor.author
Gil Costa, Graciela Verónica  
dc.contributor.author
González, Senén  
dc.contributor.author
Marín, Mauricio  
dc.contributor.author
Oyarzún, Mauricio  
dc.date.available
2023-05-11T14:04:14Z  
dc.date.issued
2012-03  
dc.identifier.citation
Arroyuelo, Diego; Gil Costa, Graciela Verónica; González, Senén; Marín, Mauricio; Oyarzún, Mauricio; Distributed search based on self-indexed compressed text; Pergamon-Elsevier Science Ltd; Information Processing & Management; 48; 5; 3-2012; 819-827  
dc.identifier.issn
0306-4573  
dc.identifier.uri
http://hdl.handle.net/11336/197197  
dc.description.abstract
Query response times within a fraction of a second in Web search engines are feasible due to the use of indexing and caching techniques, which are devised for large text collections partitioned and replicated into a set of distributed-memory processors. This paper proposes an alternative query processing method for this setting, which is based on a combination of self-indexed compressed text and posting list caching. We show that a text self-index (i.e., an index that compresses the text and is able to extract arbitrary parts of it) can be competitive with an inverted index if we consider the whole query process, which includes index decompression, ranking and snippet extraction time. The advantage is that within the space of the compressed document collection, one can carry out the posting list generation, document ranking and snippet extraction. This significantly reduces the total number of processors involved in the solution of queries. Alternatively, for the same amount of hardware, the performance of the proposed strategy is better than that of the classical approach based on treating inverted indexes and corresponding documents as two separate entities in terms of processors and memory space.  
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Pergamon-Elsevier Science Ltd  
dc.rights
info:eu-repo/semantics/openAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/  
dc.subject
QUERY PROCESSING  
dc.subject
SELF-INDEXED COMPRESSED TEXT  
dc.subject
SNIPPET EXTRACTION  
dc.subject
WAVELET TREES  
dc.subject
WEB SEARCH ENGINES  
dc.subject.classification
Computer Science  
dc.subject.classification
Computer and Information Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
Distributed search based on self-indexed compressed text  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2023-04-20T12:33:05Z  
dc.journal.volume
48  
dc.journal.number
5  
dc.journal.pagination
819-827  
dc.journal.pais
Netherlands  
dc.journal.ciudad
Amsterdam  
dc.description.fil
Fil: Arroyuelo, Diego. Yahoo! Research Latin America; Chile  
dc.description.fil
Fil: Gil Costa, Graciela Verónica. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - San Luis; Argentina. Universidad Nacional de San Luis; Argentina. Yahoo! Research Latin America; Chile  
dc.description.fil
Fil: González, Senén. Yahoo! Research Latin America; Chile  
dc.description.fil
Fil: Marín, Mauricio. Universidad de Santiago de Chile; Chile. Yahoo! Research Latin America; Chile  
dc.description.fil
Fil: Oyarzún, Mauricio. Universidad de Santiago de Chile; Chile  
dc.journal.title
Information Processing & Management  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S0306457311000094  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1016/j.ipm.2011.01.008