Show simple item record

dc.contributor.author
Räsänen, Okko  
dc.contributor.author
Seshadri, Shreyas  
dc.contributor.author
Karadayi, Julien  
dc.contributor.author
Riebling, Eric  
dc.contributor.author
Bunce, John  
dc.contributor.author
Cristia, Alejandrina  
dc.contributor.author
Metze, Florian  
dc.contributor.author
Casillas, Marisa  
dc.contributor.author
Rosemberg, Celia Renata  
dc.contributor.author
Bergelson, Elika  
dc.contributor.author
Soderstrom, Melanie  
dc.date.available
2020-06-24T19:53:53Z  
dc.date.issued
2019-10  
dc.identifier.citation
Räsänen, Okko; Seshadri, Shreyas; Karadayi, Julien; Riebling, Eric; Bunce, John; et al.; Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech; Elsevier; Speech Communication; 113; 10-2019; 63-80  
dc.identifier.issn
0167-6393  
dc.identifier.uri
http://hdl.handle.net/11336/108130  
dc.description.abstract
Automatic word count estimation (WCE) from audio recordings can be used to quantify the amount of verbal communication in a recording environment. One key application of WCE is to measure language input heard by infants and toddlers in their natural environments, as captured by daylong recordings from microphones worn by the infants. Although WCE is nearly trivial for high-quality signals in high-resource languages, daylong recordings are substantially more challenging due to the unconstrained acoustic environments and the presence of near- and far-field speech. Moreover, many use cases of interest involve languages for which reliable ASR systems or even well-defined lexicons are not available. A good WCE system should also perform similarly for low- and high-resource languages in order to enable unbiased comparisons across different cultures and environments. Unfortunately, the current state-of-the-art solution, the LENA system, is based on proprietary software and has only been optimized for American English, limiting its applicability. In this paper, we build on existing work on WCE and present the steps we have taken towards a freely available system for WCE that can be adapted to different languages or dialects with a limited amount of orthographically transcribed speech data. Our system is based on language-independent syllabification of speech, followed by a language-dependent mapping from syllable counts (and a number of other acoustic features) to the corresponding word count estimates. We evaluate our system on samples from daylong infant recordings from six different corpora consisting of several languages and socioeconomic environments, all manually annotated with the same protocol to allow direct comparison. We compare a number of alternative techniques for the two key components in our system: speech activity detection and automatic syllabification of speech. As a result, we show that our system can reach relatively consistent WCE accuracy across multiple corpora and languages (with some limitations). In addition, the system outperforms LENA on three of the four corpora consisting of different varieties of English. We also demonstrate how an automatic neural network-based syllabifier, when trained on multiple languages, generalizes well to novel languages beyond the training data, outperforming two previously proposed unsupervised syllabifiers as a feature extractor for WCE.  
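The two-stage design described in the abstract (language-independent syllabification of speech, followed by a language-dependent mapping from syllable counts and other acoustic features to word count estimates) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear least-squares mapping, the choice of a single extra feature, and all function names below are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical sketch of the language-dependent mapping stage of a WCE system:
# per-utterance syllable counts (from a language-independent syllabifier) plus
# simple acoustic features are regressed onto reference word counts obtained
# from a limited amount of orthographically transcribed speech.

def fit_word_count_mapper(syllable_counts, extra_features, ref_word_counts):
    """Fit a least-squares mapping from [syllable count, features, bias] to word counts."""
    X = np.column_stack([syllable_counts, extra_features, np.ones(len(syllable_counts))])
    weights, *_ = np.linalg.lstsq(X, ref_word_counts, rcond=None)
    return weights

def estimate_word_counts(weights, syllable_counts, extra_features):
    """Apply the fitted mapping to new, untranscribed utterances."""
    X = np.column_stack([syllable_counts, extra_features, np.ones(len(syllable_counts))])
    return np.clip(X @ weights, 0, None)  # word counts cannot be negative

# Toy usage with made-up numbers: 5 transcribed utterances,
# one extra feature (utterance duration in seconds).
syl = np.array([4, 9, 2, 12, 7], dtype=float)
dur = np.array([[1.1], [2.4], [0.6], [3.0], [1.9]])
words = np.array([3, 7, 1, 9, 5], dtype=float)

w = fit_word_count_mapper(syl, dur, words)
print(estimate_word_counts(w, np.array([6.0]), np.array([[1.5]])))
```

In the paper's setting, such a mapping would be calibrated separately per language or dialect from transcribed samples, while the upstream speech activity detection and syllabification remain language-independent.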
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Elsevier  
dc.rights
info:eu-repo/semantics/openAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/  
dc.subject
AUTOMATIC SYLLABIFICATION  
dc.subject
DAYLONG RECORDINGS  
dc.subject
LANGUAGE ACQUISITION  
dc.subject
NOISE ROBUSTNESS  
dc.subject
WORD COUNT ESTIMATION  
dc.subject.classification
Other Education Sciences  
dc.subject.classification
Education Sciences  
dc.subject.classification
SOCIAL SCIENCES  
dc.title
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech  
dc.type
info:eu-repo/semantics/article  
dc.type
info:ar-repo/semantics/artículo  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.date.updated
2020-04-24T17:46:37Z  
dc.journal.volume
113  
dc.journal.pagination
63-80  
dc.journal.pais
Netherlands  
dc.journal.ciudad
Amsterdam  
dc.description.fil
Fil: Räsänen, Okko. Tampere University; Finland  
dc.description.fil
Fil: Seshadri, Shreyas. Aalto University; Finland  
dc.description.fil
Fil: Karadayi, Julien. Université Paris Sciences et Lettres; France  
dc.description.fil
Fil: Riebling, Eric. Carnegie Mellon University; United States  
dc.description.fil
Fil: Bunce, John. University of Manitoba; Canada  
dc.description.fil
Fil: Cristia, Alejandrina. Université Paris Sciences et Lettres; France  
dc.description.fil
Fil: Metze, Florian. Carnegie Mellon University; United States  
dc.description.fil
Fil: Casillas, Marisa. Max Planck Institute for Psycholinguistics; Netherlands  
dc.description.fil
Fil: Rosemberg, Celia Renata. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Saavedra 15. Centro Interdisciplinario de Investigaciones en Psicología Matemática y Experimental Dr. Horacio J. A. Rimoldi; Argentina  
dc.description.fil
Fil: Bergelson, Elika. Duke University; United States  
dc.description.fil
Fil: Soderstrom, Melanie. University of Manitoba; Canada  
dc.journal.title
Speech Communication  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/http://dx.doi.org/10.1016/j.specom.2019.08.005  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0167639318304205