Institutional Repository
CONICET Digital
Article

Feature Selection for Polymer Informatics: Evaluating Scalability and Robustness of the FS4RVDD Algorithm using Synthetic Polydisperse Datasets

Cravero, Fiorella; Schustik, Santiago; Martínez, María Jimena; Vázquez, Gustavo; Diaz, Monica Fatima; Ponzoni, Ignacio
Publication date: 24/02/2020
Publisher: American Chemical Society
Journal: Journal of Chemical Information and Modeling
ISSN: 1549-9596
Language: English
Resource type: Published article
Subject classification:
Composites; Other Computer and Information Sciences

Abstract

The feature selection (FS) process is a key step in the Quantitative Structure-Property Relationship (QSPR) modeling of physicochemical properties in Cheminformatics. In particular, the inference of QSPR models for polymeric material properties constitutes a complex problem because of the uncertainty introduced by the polydispersity of these materials. The main challenge is how to capture the polydispersity information from the molecular weight distribution (MWD) curve to achieve a more effective computational representation of polymeric materials. To date, most existing QSPR techniques represent each of these materials by a single molecule, so polydispersity is not considered. Consequently, the QSPR models obtained by these approaches are oversimplified. For this reason, in a previous work we introduced a new FS algorithm called Feature Selection for Random Variables with Discrete Distribution (FS4RVDD), which can handle polydisperse data. In the present paper, we evaluate both the scalability and the robustness of the FS4RVDD algorithm. To this end, we generated synthetic data by varying and combining different parameters: the size of the database, the cardinality of the selected feature subsets, the presence of noise in the data, and the type of correlation (linear and nonlinear). Moreover, the performance of FS4RVDD was compared with that of traditional FS techniques applied to different simplified representations of polymeric materials. The results show that the FS4RVDD algorithm outperformed the traditional FS methods in all proposed scenarios, which suggests the need for an algorithm such as FS4RVDD to deal with the uncertainty that polydispersity introduces in human-made polymers.
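
As a rough illustration of the kind of synthetic polydisperse data the abstract describes, the sketch below builds materials as discrete distributions over chains and derives a target property from a chosen feature subset. It is not the authors' generator: the function names (`make_material`, `target`, `make_dataset`, `mean_chain`), the Gaussian descriptors, the squared-sum nonlinearity, and all default values are assumptions made only for this example.

```python
import random

def make_material(rng, n_chains, n_features):
    """One synthetic material: chain frequencies (a discrete distribution)
    plus one descriptor vector per chain. Purely illustrative."""
    raw = [rng.random() for _ in range(n_chains)]
    total = sum(raw)
    weights = [r / total for r in raw]  # frequencies sum to 1
    chains = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
              for _ in range(n_chains)]
    return weights, chains

def target(weights, chains, relevant, nonlinear, noise_sd, rng):
    """Property = frequency-weighted mean over chains of a function of the
    relevant descriptors, plus optional Gaussian noise (assumed model)."""
    def f(x):
        s = sum(x[j] for j in relevant)
        return s * s if nonlinear else s
    value = sum(w * f(x) for w, x in zip(weights, chains))
    return value + rng.gauss(0.0, noise_sd)

def make_dataset(n_materials, n_chains=5, n_features=10, relevant=(0, 3),
                 nonlinear=False, noise_sd=0.0, seed=0):
    """Vary database size, relevant-subset cardinality, noise level,
    and correlation type through the arguments."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_materials):
        w, c = make_material(rng, n_chains, n_features)
        y = target(w, c, relevant, nonlinear, noise_sd, rng)
        data.append((w, c, y))
    return data

def mean_chain(weights, chains):
    """Simplified single-vector representation: the frequency-weighted
    average of each descriptor across the material's chains."""
    n_features = len(chains[0])
    return [sum(w * x[j] for w, x in zip(weights, chains))
            for j in range(n_features)]
```

Under these assumptions, a linear noise-free target is fully determined by the averaged representation returned by `mean_chain`, whereas with `nonlinear=True` the averaged representation no longer determines the target, which is one way a single-vector simplification can lose distribution information.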
Keywords: ALGORITHMS, MACHINE LEARNING, PHYSICAL AND CHEMICAL PROPERTIES, POLYMERS
Associated files
Size: 3.508Mb
Format: PDF
License
info:eu-repo/semantics/embargoedAccess. Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial 2.5 Unported (CC BY-NC 2.5)
Identifiers
URI: http://hdl.handle.net/11336/111958
URL: https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b00867
DOI: http://dx.doi.org/10.1021/acs.jcim.9b00867
Collections
Articles (PLAPIQUI)
Articles of PLANTA PILOTO DE INGENIERIA QUIMICA (I)
Citation
Cravero, Fiorella; Schustik, Santiago; Martínez, María Jimena; Vázquez, Gustavo; Diaz, Monica Fatima; et al.; Feature Selection for Polymer Informatics: Evaluating Scalability and Robustness of the FS4RVDD Algorithm using Synthetic Polydisperse Datasets; American Chemical Society; Journal of Chemical Information and Modeling; 60; 2; 24-2-2020; 592-603