Institutional Repository
CONICET Digital
Article

A labeled medical records corpus for the timely detection of rare diseases using machine learning approaches

Rolando, Matias; Raggio, Victor; Naya, Hugo; Spangenberg, Lucia; Cagnina, Leticia Cecilia
Publication date: 02/2025
Publisher: Nature
Journal: Scientific Reports
ISSN: 2045-2322
Language: English
Resource type: Published article
Subject classification:
Information Sciences and Bioinformatics

Abstract

Rare diseases (RDs) are a group of pathologies that individually affect fewer than 1 in 2000 people but collectively impact around 7% of the world’s population. Most of them affect children, are chronic and progressive, and have no specific treatment. RD patients face diagnostic challenges, with an average diagnosis time of 5 years, multiple specialist visits, and invasive procedures. This ‘diagnostic odyssey’ can be detrimental to their health. Machine learning (ML) has the potential to improve healthcare by providing more personalized and accurate patient management, diagnoses, and in some cases, treatments. Leveraging the MIMIC-III database and additional medical notes from different sources such as in-house data, PubMed and ChatGPT, we propose a labeled dataset for early RD detection in hospital settings. Applying various supervised ML methods, including logistic regression, decision trees, support vector machines (SVM), deep learning methods (LSTM and CNN), and Transformers (BERT), we validated the use of the proposed resource, achieving a 92.7% F-measure and a 96% AUC using SVM. These findings highlight the potential of ML in redirecting RD patients towards more accurate diagnostic pathways and present a corpus that can be used for future development and refinements.
Keywords: RARE DISEASES, MACHINE LEARNING, ARTIFICIAL CORPUS
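
The abstract reports its results as an F-measure and an AUC. As a reminder of what those two figures measure, the following is a minimal sketch in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; none of them come from the paper, and the sketch does not reproduce its corpus, models, or results.

```python
def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = note labeled as rare disease, 0 = not.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # made-up classifier scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"F-measure: {f_measure(y_true, y_pred):.3f}")
print(f"AUC: {roc_auc(y_true, scores):.3f}")
```

The F-measure depends on the chosen threshold, whereas the AUC depends only on how the scores rank positives against negatives, which is why the two are commonly reported together.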
 
Associated files
Size: 1.460Mb
Format: PDF
License
info:eu-repo/semantics/openAccess Except where explicitly stated otherwise, this item is published under the following license: Attribution-NonCommercial-NoDerivs 2.5 Argentina (CC BY-NC-ND 2.5 AR)
Identifiers
URI: http://hdl.handle.net/11336/273229
URL: https://www.nature.com/articles/s41598-025-90450-0
DOI: http://dx.doi.org/10.1038/s41598-025-90450-0
Collections
Articles (CCT - SAN LUIS)
Articles from CTRO.CIENTIFICO TECNOL.CONICET - SAN LUIS
Citation
Rolando, Matias; Raggio, Victor; Naya, Hugo; Spangenberg, Lucia; Cagnina, Leticia Cecilia; A labeled medical records corpus for the timely detection of rare diseases using machine learning approaches; Nature; Scientific Reports; 15; 1; 2-2025; 1-10