dc.contributor.author
Perez Correa, Ignacio
dc.contributor.author
Giunta, Pablo Daniel
dc.contributor.author
Mariño, Fernando Javier
dc.contributor.author
Francesconi, Javier Andres
dc.date.available
2024-11-25T14:56:30Z
dc.date.issued
2023-12
dc.identifier.citation
Perez Correa, Ignacio; Giunta, Pablo Daniel; Mariño, Fernando Javier; Francesconi, Javier Andres; Transformer-Based Representation of Organic Molecules for Potential Modeling of Physicochemical Properties; American Chemical Society; Journal of Chemical Information and Modeling; 63; 24; 12-2023; 7676-7688
dc.identifier.uri
http://hdl.handle.net/11336/248577
dc.description.abstract
In this work, we study the use of three configurations of an autoencoder (AE) neural network to process organic substances with the aim of generating meaningful molecular descriptors that can be employed to develop property prediction models. A total of 18,322,500 compounds represented as SMILES strings were used to train the model, demonstrating that a latent space of 24 units is able to adequately reconstruct the data. After AE training, an analysis of the latent space properties in terms of compound similarity was carried out, indicating that this space possesses desirable properties for the potential development of models for forecasting physical properties of organic compounds. As a final step, a QSPR model was developed to predict the boiling point of chemical substances based on the AE descriptors. A total of 5276 substances were used for the regression task, and the predictive ability was compared with models available in the literature evaluated on the same database. The final AE model has an overall error of 1.40% (1.39% with augmented SMILES) in the prediction of the boiling temperature, while other models have errors between 2.0 and 3.2%. This shows that the SMILES representation is comparable to, and even outperforms, the state-of-the-art representations widely used in the literature.
dc.format
application/pdf
dc.language.iso
eng
dc.publisher
American Chemical Society
dc.rights
info:eu-repo/semantics/restrictedAccess
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.subject
Autoencoder neural network
dc.subject
SMILES
dc.subject
Organic compounds
dc.subject
QSPR model
dc.subject.classification
Other Chemical Engineering
dc.subject.classification
Chemical Engineering
dc.subject.classification
ENGINEERING AND TECHNOLOGY
dc.title
Transformer-Based Representation of Organic Molecules for Potential Modeling of Physicochemical Properties
dc.type
info:eu-repo/semantics/article
dc.type
info:ar-repo/semantics/artículo
dc.type
info:eu-repo/semantics/publishedVersion
dc.date.updated
2024-11-21T12:01:59Z
dc.identifier.eissn
1549-960X
dc.journal.volume
63
dc.journal.number
24
dc.journal.pagination
7676-7688
dc.journal.pais
United States
dc.journal.ciudad
Maryland
dc.description.fil
Fil: Perez Correa, Ignacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles; Argentina
dc.description.fil
Fil: Giunta, Pablo Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles; Argentina
dc.description.fil
Fil: Mariño, Fernando Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles; Argentina
dc.description.fil
Fil: Francesconi, Javier Andres. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles. Universidad de Buenos Aires. Facultad de Ingeniería. Instituto de Tecnologías del Hidrogeno y Energias Sostenibles; Argentina
dc.journal.title
Journal of Chemical Information and Modeling
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1021/acs.jcim.3c01548
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://pubs.acs.org/doi/10.1021/acs.jcim.3c01548