dc.contributor.author
Dellanzo, Antonella
dc.contributor.author
Cotik, Viviana Erica
dc.contributor.author
Lozano Barriga, Daniel Yunior
dc.contributor.author
Mollapaza Apaza, Jonathan Jimmy
dc.contributor.author
Palomino, Daniel
dc.contributor.author
Schiaffino, Fernando
dc.contributor.author
Yanque Aliaga, Alexander
dc.contributor.author
Ochoa Luna, José
dc.date.available
2023-11-10T12:35:41Z
dc.date.issued
2022-12
dc.identifier.citation
Dellanzo, Antonella; Cotik, Viviana Erica; Lozano Barriga, Daniel Yunior; Mollapaza Apaza, Jonathan Jimmy; Palomino, Daniel; et al.; Digital surveillance in Latin American diseases outbreaks: information extraction from a novel Spanish corpus; BioMed Central; BMC Bioinformatics; 23; 1; 12-2022; 1-22
dc.identifier.issn
1471-2105
dc.identifier.uri
http://hdl.handle.net/11336/217703
dc.description.abstract
Background: To detect threats to public health and to be well prepared for endemic and pandemic illness outbreaks, countries usually rely on event-based surveillance (EBS) and indicator-based surveillance systems. Event-based surveillance systems are key components of early warning systems and focus on rapidly capturing data to detect threat signals through channels other than traditional surveillance. In this study, we develop Natural Language Processing tools that can be used within EBS systems. In particular, we focus on information extraction techniques that enable digital surveillance to monitor Internet data and social media. Results: We created an annotated Spanish corpus from ProMED-mail health reports regarding disease outbreaks in Latin America. The corpus has been used to train algorithms for two information extraction tasks: named entity recognition and relation extraction. The algorithms, based on deep learning and rules, have been applied to recognize diseases, hosts, and geographical locations where a disease is occurring, among other entities and relations. In addition, an in-depth analysis of micro-average F1 metrics shows the suitability of our approaches for both tasks. Conclusions: The annotated corpus and algorithms presented could support the development of automated tools for extracting information from news and health reports written in Spanish. Moreover, this framework could be useful within EBS systems to support the early detection of Latin American disease outbreaks.
dc.format
application/pdf
dc.language.iso
eng
dc.publisher
BioMed Central
dc.rights
info:eu-repo/semantics/openAccess
dc.rights.uri
https://creativecommons.org/licenses/by/2.5/ar/
dc.subject
DIGITAL SURVEILLANCE
dc.subject
DISEASES OUTBREAKS
dc.subject
EVENT-BASED SURVEILLANCE
dc.subject
NAMED ENTITY RECOGNITION
dc.subject
PROMED-MAIL
dc.subject
RELATION EXTRACTION
dc.subject
SPANISH CORPUS
dc.subject.classification
Other Computer and Information Sciences
dc.subject.classification
Computer and Information Sciences
dc.subject.classification
NATURAL AND EXACT SCIENCES
dc.title
Digital surveillance in Latin American diseases outbreaks: information extraction from a novel Spanish corpus
dc.type
info:eu-repo/semantics/article
dc.type
info:ar-repo/semantics/artículo
dc.type
info:eu-repo/semantics/publishedVersion
dc.date.updated
2023-11-09T14:17:10Z
dc.journal.volume
23
dc.journal.number
1
dc.journal.pagination
1-22
dc.journal.pais
United Kingdom
dc.journal.ciudad
London
dc.description.fil
Fil: Dellanzo, Antonella. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina
dc.description.fil
Fil: Cotik, Viviana Erica. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
dc.description.fil
Fil: Lozano Barriga, Daniel Yunior. Universidad Católica San Pablo; Perú
dc.description.fil
Fil: Mollapaza Apaza, Jonathan Jimmy. Universidad Católica San Pablo; Perú
dc.description.fil
Fil: Palomino, Daniel. Universidad Católica San Pablo; Perú
dc.description.fil
Fil: Schiaffino, Fernando. Universidad de Buenos Aires. Facultad de Filosofía y Letras; Argentina
dc.description.fil
Fil: Yanque Aliaga, Alexander. Universidad Católica San Pablo; Perú
dc.description.fil
Fil: Ochoa Luna, José. Universidad Católica San Pablo; Perú
dc.journal.title
BMC Bioinformatics
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1186/s12859-022-05094-y