Repositorio Institucional
CONICET Digital
 
Article

Digital surveillance in Latin American diseases outbreaks: information extraction from a novel Spanish corpus

Dellanzo, Antonella; Cotik, Viviana Erica; Lozano Barriga, Daniel Yunior; Mollapaza Apaza, Jonathan Jimmy; Palomino, Daniel; Schiaffino, Fernando; Yanque Aliaga, Alexander; Ochoa Luna, José
Publication date: 12/2022
Publisher: BioMed Central
Journal: BMC Bioinformatics
ISSN: 1471-2105
Language: English
Resource type: Published article
Subject classification:
Other Computer and Information Sciences

Abstract

Background: To detect threats to public health and to be well prepared for endemic and pandemic illness outbreaks, countries usually rely on event-based surveillance (EBS) and indicator-based surveillance systems. Event-based surveillance systems are key components of early warning systems and focus on capturing data quickly to detect threat signals through channels other than traditional surveillance. In this study, we develop Natural Language Processing tools that can be used within EBS systems. In particular, we focus on information extraction techniques that enable digital surveillance to monitor Internet data and social media.

Results: We created an annotated Spanish corpus from ProMED-mail health reports regarding disease outbreaks in Latin America. The corpus has been used to train algorithms for two information extraction tasks: named entity recognition and relation extraction. The algorithms, based on deep learning and rules, have been applied to recognize diseases, hosts, and geographical locations where a disease is occurring, among other entities and relations. In addition, an in-depth analysis of micro-average F1 metrics shows the suitability of our approaches for both tasks.

Conclusions: The annotated corpus and algorithms presented could support the development of automated tools for extracting information from news and health reports written in Spanish. Moreover, this framework could be useful within EBS systems to support the early detection of Latin American disease outbreaks.
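The abstract reports micro-average F1 for both tasks, but this record does not describe the evaluation protocol. As an illustration only, the following minimal sketch computes a micro-averaged F1 score under the assumption that an entity counts as correct when its label and character span match the gold annotation exactly. The function name `micro_f1`, the label names, and the sample annotations are hypothetical and are not taken from the paper.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over documents.

    gold, pred: lists (one item per document) of sets of
    (label, start, end) tuples. Exact-match scoring is an assumption.
    """
    tp = fp = fn = 0
    # Pool the counts over all documents and labels before computing the scores.
    for g, p in zip(gold, pred):
        tp += len(g & p)   # predicted and present in gold
        fp += len(p - g)   # predicted but absent from gold
        fn += len(g - p)   # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations for two documents.
gold = [{("DISEASE", 0, 6), ("LOCATION", 10, 14)}, {("HOST", 3, 9)}]
pred = [{("DISEASE", 0, 6), ("LOCATION", 11, 14)}, {("HOST", 3, 9)}]
score = micro_f1(gold, pred)
```

Because the counts are pooled before precision and recall are computed, frequent entity types weigh more heavily in the result than rare ones, which is the main difference from a macro average.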
Keywords: DIGITAL SURVEILLANCE, DISEASES OUTBREAKS, EVENT-BASED SURVEILLANCE, NAMED ENTITY RECOGNITION, PROMED-MAIL, RELATION EXTRACTION, SPANISH CORPUS

Associated files
Size: 1.823Mb
Format: PDF
License
info:eu-repo/semantics/openAccess Except where explicitly stated otherwise, this item is published under the following terms: Creative Commons Attribution 2.5 Unported (CC BY 2.5)
Identifiers
URI: http://hdl.handle.net/11336/217703
DOI: http://dx.doi.org/10.1186/s12859-022-05094-y
Collections
Articles (ICC)
Articles from INSTITUTO DE INVESTIGACION EN CIENCIAS DE LA COMPUTACION
Citation
Dellanzo, Antonella; Cotik, Viviana Erica; Lozano Barriga, Daniel Yunior; Mollapaza Apaza, Jonathan Jimmy; Palomino, Daniel; et al.; Digital surveillance in Latin American diseases outbreaks: information extraction from a novel Spanish corpus; BioMed Central; BMC Bioinformatics; 23; 1; 12-2022; 1-22

CONICET content is licensed under a Creative Commons Attribution 2.5 Argentina License

https://www.conicet.gov.ar/ - CONICET

Godoy Cruz 2290 (C1425FQB) CABA – República Argentina – Tel: +5411 4899-5400 repositorio@conicet.gov.ar