dc.contributor.author
Hernandez Berthet, Ayelén S.
dc.contributor.author
Aptekmann, Ariel
dc.contributor.author
Tejero, Jesús
dc.contributor.author
Sánchez Miguel, Ignacio Enrique
dc.contributor.author
Noguera, Martín Ezequiel
dc.contributor.author
Roman, Ernesto Andres
dc.date.available
2025-12-15T10:15:21Z
dc.date.issued
2023-10
dc.identifier.citation
Hernandez Berthet, Ayelén S.; Aptekmann, Ariel; Tejero, Jesús; Sánchez Miguel, Ignacio Enrique; Noguera, Martín Ezequiel; et al.; PhISCO: A simple method to infer phenotypes from protein sequences; Cold Spring Harbor Laboratory Press; bioRxiv; 10-2023; 1-43
dc.identifier.uri
http://hdl.handle.net/11336/277636
dc.description.abstract
Although protein sequences encode the information for folding and function, understanding how the two are linked is not an easy task. Unfortunately, predicting how specific amino acids contribute to these features remains difficult. Here, we developed PhISCO, Phenotype Inference from Sequence COmparisons, a simple algorithm that finds positions associated with any quantitative phenotype and predicts the phenotype values. Starting from a few hundred sequences from four different protein families, we performed multiple sequence alignments and calculated per-position pairwise differences for both the sequences and the observed phenotypes. We found that 3 to 10 positions, depending on the case studied, were enough to identify positions associated with the phenotypes and to make quantitative predictions of them. We show that these strong correlations can be found using individual positions, while an improvement is achieved when the most correlated positions are analyzed jointly. Notably, we performed phenotype predictions using a simple linear model that links per-position divergences to differences in observed phenotypes. We also show that, although extremely simple, the predictions are comparable to those of state-of-the-art methodologies, which in most cases are far more complex. All calculations come at a very low information cost, since the only input needed is a multiple sequence alignment of protein sequences with their associated quantitative phenotype. The diversity of the explored systems makes PhISCO a valuable tool to find sequence determinants of biological activity modulation and to predict various functional features for uncharacterized members of a protein family.
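The position-ranking idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the divergence measure (a 0/1 residue mismatch), the use of absolute phenotype differences, the function names `pearson` and `rank_positions`, and the toy data are all assumptions made for illustration, and the linear prediction step is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_positions(alignment, phenotypes):
    """Rank alignment columns by how well pairwise residue mismatches
    track pairwise phenotype differences (highest correlation first)."""
    pairs = list(combinations(range(len(alignment)), 2))
    pheno_diff = [abs(phenotypes[i] - phenotypes[j]) for i, j in pairs]
    scores = []
    for pos in range(len(alignment[0])):
        # Assumed divergence: 1.0 if the two residues differ, else 0.0.
        seq_diff = [float(alignment[i][pos] != alignment[j][pos])
                    for i, j in pairs]
        scores.append((pos, pearson(seq_diff, pheno_diff)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data invented for illustration: column 1 (K/R) tracks the phenotype.
toy_alignment = ["AKL", "AKV", "ARL", "ARV"]
toy_phenotypes = [1.0, 1.1, 5.0, 5.2]
ranking = rank_positions(toy_alignment, toy_phenotypes)
```

In this toy case the column that separates low from high phenotype values ranks first, the invariant column scores zero, and the unrelated column ranks last. A real analysis would likely replace the mismatch indicator with a graded residue dissimilarity.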
dc.format
application/pdf
dc.language.iso
eng
dc.publisher
Cold Spring Harbor Laboratory Press
dc.rights
info:eu-repo/semantics/openAccess
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.subject
Prediction
dc.subject
Phenotype
dc.subject
Evolution
dc.subject
Proteins
dc.subject.classification
Biophysics
dc.subject.classification
Biological Sciences
dc.subject.classification
NATURAL AND EXACT SCIENCES
dc.title
PhISCO: A simple method to infer phenotypes from protein sequences
dc.type
info:eu-repo/semantics/article
dc.type
info:ar-repo/semantics/artículo
dc.type
info:eu-repo/semantics/publishedVersion
dc.date.updated
2024-11-15T20:16:55Z
dc.identifier.eissn
2692-8205
dc.journal.pagination
1-43
dc.journal.pais
United States
dc.description.fil
Fil: Hernandez Berthet, Ayelén S.. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina
dc.description.fil
Fil: Aptekmann, Ariel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina. Rutgers University; United States
dc.description.fil
Fil: Tejero, Jesús. University of Pittsburgh at Johnstown; United States
dc.description.fil
Fil: Sánchez Miguel, Ignacio Enrique. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina
dc.description.fil
Fil: Noguera, Martín Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina
dc.description.fil
Fil: Roman, Ernesto Andres. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Química y Físico-Química Biológicas "Prof. Alejandro C. Paladini". Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica. Instituto de Química y Físico-Química Biológicas; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina
dc.journal.title
bioRxiv
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://www.biorxiv.org/content/10.1101/2022.10.23.511734v2.full
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/https://doi.org/10.1101/2022.10.23.511734