
dc.contributor.author
Zermoglio, Paula Florencia  
dc.contributor.author
Guralnick, Robert P.  
dc.contributor.author
Wieczorek, John R.  
dc.date.available
2018-09-19T18:35:00Z  
dc.date.issued
2016-01  
dc.identifier.citation
Zermoglio, Paula Florencia; Guralnick, Robert P.; Wieczorek, John R.; A standardized reference data set for vertebrate taxon name resolution; Public Library of Science; Plos One; 11; 1; 1-2016; 1-20; e0146894  
dc.identifier.uri
http://hdl.handle.net/11336/60277  
dc.description.abstract
Taxonomic names associated with digitized biocollections labels have flooded into repositories such as GBIF, iDigBio and VertNet. The names on these labels are often misspelled, out of date, or present other problems, as they were often captured only once during accessioning of specimens, or have a history of label changes without clear provenance. Before records are reliably usable in research, it is critical that these issues be addressed. However, still missing is an assessment of the scope of the problem, the effort needed to solve it, and a way to improve effectiveness of tools developed to aid the process. We present a carefully human-vetted analysis of 1000 verbatim scientific names taken at random from those published via the data aggregator VertNet, providing the first rigorously reviewed, reference validation data set. In addition to characterizing formatting problems, human vetting focused on detecting misspelling, synonymy, and the incorrect use of Darwin Core. Our results reveal a sobering view of the challenge ahead, as less than 47% of name strings were found to be currently valid. More optimistically, nearly 97% of name combinations could be resolved to a currently valid name, suggesting that computer-aided approaches may provide feasible means to improve digitized content. Finally, we associated names back to biocollections records and fit logistic models to test potential drivers of issues. A set of candidate variables (geographic region, year collected, higher-level clade, and the institutional digitally accessible data volume) and their 2-way interactions all predict the probability of records having taxon name issues, based on model selection approaches. We strongly encourage further experiments to use this reference data set as a means to compare automated or computer-aided taxon name tools for their ability to resolve and improve the existing wealth of legacy data.  
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Public Library of Science  
dc.rights
info:eu-repo/semantics/openAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/  
dc.subject
Biocollections  
dc.subject
Data Curation  
dc.subject
Fitness for Use  
dc.subject
Gold Standard  
dc.subject
Taxon Names  
dc.subject
Validation  
dc.subject
Vertebrates  
dc.subject
VertNet  
dc.subject.classification
Other Biological Sciences  
dc.subject.classification
Biological Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
A standardized reference data set for vertebrate taxon name resolution  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2018-09-19T14:36:05Z  
dc.identifier.eissn
1932-6203  
dc.journal.volume
11  
dc.journal.number
1  
dc.journal.pagination
1-20; e0146894  
dc.journal.pais
United States  
dc.journal.ciudad
San Francisco  
dc.description.fil
Fil: Zermoglio, Paula Florencia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Ecología, Genética y Evolución de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Ecología, Genética y Evolución de Buenos Aires; Argentina. Université François Rabelais; France  
dc.description.fil
Fil: Guralnick, Robert P.. University of Florida; United States  
dc.description.fil
Fil: Wieczorek, John R.. University of California at Berkeley; United States  
dc.journal.title
PLOS ONE  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1371/journal.pone.0146894  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0146894