Repositorio Institucional
CONICET Digital
Article

Topic relevance and diversity in information retrieval from large datasets: A multi-objective evolutionary algorithm approach

Cecchini, Rocío Luján; Lorenzetti, Carlos Martin; Maguitman, Ana Gabriela; Ponzoni, Ignacio
Publication date: 08/2018
Publisher: Elsevier Science
Journal: Applied Soft Computing
ISSN: 1568-4946
Language: English
Resource type: Published article
Subject classification:
Computer Science

Abstract

Enabling effective information search is an increasing problem, as technology enhances the ability to publish information rapidly, and large quantities of information are instantly available for retrieval. In this scenario, topical search is the process of searching for material that is relevant to a given topic. Multi-objective Evolutionary Algorithms have demonstrated great potential for addressing the topical search problem in very large datasets. In an evolutionary approach to topical search, a population of queries is automatically generated from a given topic, and the population of queries then evolves towards successively better candidate queries. Despite the promise of this approach, previous studies have revealed a common genotypic phenomenon: throughout evolution, the population tends to converge to almost identical sets of terms. This situation reduces the solution set to a few queries and leads to the exploration of a very limited region of the search space, which constitutes a limitation when users require different options from a topical search tool. This paper proposes and evaluates strategies to favor diversity in evolutionary topical search. These strategies rely on novel fitness functions, different parameterization for the crossover and mutation rates, and the use of multiple populations to favor diversity preservation. Experimental results conducted using these strategies in combination with the NSGA-II algorithm on a dataset consisting of more than 350,000 labeled web pages indicate that the proposed strategies show great promise for searching very large datasets, by helping to achieve query and search result diversity without giving up precision.
Keywords: DIVERSITY PRESERVATION, INFORMATION RETRIEVAL, QUERY REFORMULATION, TOPIC MODELING
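
The convergence problem described in the abstract (a query population collapsing to almost identical sets of terms) can be illustrated with a minimal sketch. It is not the paper's method: the fitness functions proposed by the authors are not given in this record, so the Jaccard-based diversity measure, the function names, and the sample term sets below are all assumptions made for illustration. The sketch also shows a plain non-dominated filter, the Pareto idea that NSGA-II builds on, applied to hypothetical (precision, diversity) score pairs.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two term sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def population_diversity(queries):
    """Return 1 minus the mean pairwise Jaccard similarity of the queries.

    0.0 means every query uses the same terms; 1.0 means no two queries
    share a term. This measure is an illustrative assumption.
    """
    pairs = list(combinations(queries, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def pareto_front(scores):
    """Indices of non-dominated points when every objective is maximised."""
    def dominates(p, q):
        return (all(x >= y for x, y in zip(p, q))
                and any(x > y for x, y in zip(p, q)))

    return [i for i, p in enumerate(scores)
            if not any(dominates(q, p)
                       for j, q in enumerate(scores) if j != i)]


# Hypothetical populations: one fully converged, one with disjoint queries.
converged = [{"evolutionary", "algorithm", "search"}] * 3
varied = [{"evolutionary", "algorithm"},
          {"query", "reformulation"},
          {"topic", "search"}]

# Hypothetical (precision, diversity) scores for four candidate queries.
scores = [(0.9, 0.1), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9)]

print(population_diversity(converged))
print(population_diversity(varied))
print(pareto_front(scores))
```

A converged population scores zero on this measure while a population of disjoint queries scores one, which is the kind of signal a diversity-aware objective could reward alongside precision.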
Associated files
Size: 8.159Mb
Format: PDF
License
info:eu-repo/semantics/openAccess Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Identifiers
URI: http://hdl.handle.net/11336/89021
URL: https://www.sciencedirect.com/science/article/pii/S1568494617306798
DOI: http://dx.doi.org/10.1016/j.asoc.2017.11.016
Collections
Articles (CCT - BAHIA BLANCA)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - BAHIA BLANCA
Citation
Cecchini, Rocío Luján; Lorenzetti, Carlos Martin; Maguitman, Ana Gabriela; Ponzoni, Ignacio; Topic relevance and diversity in information retrieval from large datasets: A multi-objective evolutionary algorithm approach; Elsevier Science; Applied Soft Computing; 69; 8-2018; 749-770