Article
PhISCO: A simple method to infer phenotypes from protein sequences
Hernandez Berthet, Ayelén S.; Aptekmann, Ariel; Tejero, Jesús; Sánchez Miguel, Ignacio Enrique; Noguera, Martín Ezequiel; Roman, Ernesto Andres
Publication date:
10/2023
Publisher:
Cold Spring Harbor Laboratory Press
Journal:
bioRxiv
e-ISSN:
2692-8205
Language:
English
Resource type:
Published article
Abstract
Although protein sequences encode the information for folding and function, understanding the link between sequence and these features is not an easy task. Unfortunately, predicting how specific amino acids contribute to folding and function remains considerably limited. Here, we developed PhISCO (Phenotype Inference from Sequence COmparisons), a simple algorithm that finds positions associated with any quantitative phenotype and predicts the phenotype's values. Starting from a few hundred sequences from four different protein families, we performed multiple sequence alignments and calculated per-position pairwise differences for both the sequences and the observed phenotypes. We found that 3 to 10 positions, depending on the case studied, were enough to capture the association with the phenotypes and to predict them quantitatively. We show that strong correlations can be found using individual positions, and that an improvement is achieved when the most correlated positions are analyzed jointly. Notably, we performed phenotype predictions using a simple linear model that links per-position divergences to differences in observed phenotypes. We also show that, although extremely simple, the predictions are comparable to those of state-of-the-art methodologies, which in most cases are far more complex. All calculations come at a very low information cost, since the only input needed is a multiple sequence alignment of protein sequences with their associated quantitative phenotype. The diversity of the explored systems makes PhISCO a valuable tool to find sequence determinants of biological activity modulation and to predict various functional features for uncharacterized members of a protein family.
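The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: divergence at a position is a 0/1 mismatch indicator, phenotype difference is the absolute difference, positions are ranked by Pearson correlation, and the linear model is fitted by ordinary least squares. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations

import numpy as np


def pairwise_differences(alignment, phenotypes):
    """Per-position divergences and phenotype differences for all sequence pairs.

    Assumption: divergence is a 0/1 mismatch indicator per aligned column.
    """
    seqs = np.array([list(s) for s in alignment])  # (n_sequences, n_positions)
    y = np.asarray(phenotypes, dtype=float)
    i, j = np.array(list(combinations(range(len(alignment)), 2))).T
    D = (seqs[i] != seqs[j]).astype(float)  # (n_pairs, n_positions)
    dy = np.abs(y[i] - y[j])                # (n_pairs,)
    return D, dy


def position_correlations(D, dy):
    """Pearson correlation of each column of D with dy (0 for constant columns)."""
    Dc = D - D.mean(axis=0)
    yc = dy - dy.mean()
    num = Dc.T @ yc
    den = np.sqrt((Dc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.zeros(D.shape[1])
    np.divide(num, den, out=r, where=den > 0)
    return r


def fit_linear_model(D, dy, positions):
    """Least-squares fit of dy on the divergences at the selected positions."""
    X = np.column_stack([np.ones(len(dy)), D[:, positions]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    return coef


def predict_differences(D, positions, coef):
    """Predicted phenotype differences from divergences at the selected positions."""
    X = np.column_stack([np.ones(D.shape[0]), D[:, positions]])
    return X @ coef


# Hypothetical toy data: four aligned sequences with one phenotype value each.
alignment = ["ACDA", "ACDG", "GCEA", "GCEG"]
phenotypes = [1.0, 1.1, 3.0, 3.1]

D, dy = pairwise_differences(alignment, phenotypes)
r = position_correlations(D, dy)
top = np.argsort(-r, kind="stable")[:2]  # the two most correlated positions
coef = fit_linear_model(D, dy, top)
pred = predict_differences(D, top, coef)
```

In this toy case the first and third columns vary together with the phenotype, so they rank highest, while the fully conserved second column carries no signal. A real analysis would start from a multiple sequence alignment of a few hundred sequences and would need held-out data to assess predictions.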
Keywords:
Prediction, Phenotype, Evolution, Proteins
Collections
Articles (IQUIBICEN)
Articles from INSTITUTO DE QUIMICA BIOLOGICA DE LA FACULTAD DE CS. EXACTAS Y NATURALES
Articles (IQUIFIB)
Articles from INST.DE QUIMICA Y FISICO-QUIMICA BIOLOGICAS "PROF. ALEJANDRO C. PALADINI"
Citation
Hernandez Berthet, Ayelén S.; Aptekmann, Ariel; Tejero, Jesús; Sánchez Miguel, Ignacio Enrique; Noguera, Martín Ezequiel; et al.; PhISCO: A simple method to infer phenotypes from protein sequences; Cold Spring Harbor Laboratory Press; bioRxiv; 10-2023; 1-43