
dc.contributor.author
Mitra, Vikramjit  
dc.contributor.author
Franco, Horacio  
dc.contributor.author
Stern, Richard M.  
dc.contributor.author
Van Hout, Julien  
dc.contributor.author
Ferrer, Luciana  
dc.contributor.author
Graciarena, Martin  
dc.contributor.author
Wang, Wen  
dc.contributor.author
Vergyri, Dimitra  
dc.contributor.author
Alwan, Abeer  
dc.contributor.author
Hansen, John H. L.  
dc.contributor.other
Watanabe, Shinji  
dc.contributor.other
Delcroix, Marc  
dc.contributor.other
Metze, Florian  
dc.contributor.other
Hershey, John R.  
dc.date.available
2022-07-26T14:48:46Z  
dc.date.issued
2017  
dc.identifier.citation
Mitra, Vikramjit; Franco, Horacio; Stern, Richard M.; Van Hout, Julien; Ferrer, Luciana; et al.; Robust features in deep-learning-based speech recognition; Springer Nature Switzerland AG; 2017; 183-212  
dc.identifier.isbn
978-3-319-64679-4  
dc.identifier.uri
http://hdl.handle.net/11336/163169  
dc.description.abstract
Recent progress in deep learning has revolutionized speech recognition research, with Deep Neural Networks (DNNs) becoming the new state of the art for acoustic modeling. DNNs offer significantly lower speech recognition error rates than the previously used Gaussian Mixture Models (GMMs). Unfortunately, DNNs are data-sensitive, and unseen data conditions can deteriorate their performance. Acoustic distortions such as noise, reverberation, and channel differences add variation to the speech signal, which in turn impacts DNN acoustic model performance. A straightforward solution to this issue is training the DNN models with these types of variation, which typically provides quite impressive performance. However, anticipating such variation is not always possible; in these cases, DNN recognition performance can deteriorate quite sharply. To avoid subjecting acoustic models to such variation, robust features have traditionally been used to create an invariant representation of the acoustic space. Most commonly, robust feature-extraction strategies have explored three principal areas: (a) enhancing the speech signal, with the goal of improving the perceptual quality of speech; (b) reducing the distortion footprint, with signal-theoretic techniques used to learn the distortion characteristics and subsequently filter them out of the speech signal; and (c) leveraging knowledge from auditory neuroscience and psychoacoustics, by using robust features inspired by auditory perception. In this chapter, we present prominent robust feature-extraction strategies explored by the speech recognition research community, and we discuss their relevance to coping with data-mismatch problems in DNN-based acoustic modeling. We present results demonstrating the efficacy of robust features in the new paradigm of DNN acoustic models, and we discuss future directions in feature design for making speech recognition systems more robust to unseen acoustic conditions. Note that the approaches discussed in this chapter focus primarily on single-channel data.  
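As an illustration of strand (c) from the abstract, the following is a minimal numpy sketch of an auditory-inspired robust front end: the power spectrum is integrated in triangular mel bands and then compressed with a power-law nonlinearity. This is not the chapter's actual method; the frame sizes, filter count, and the 1/15 compression exponent are illustrative assumptions drawn from common auditory-feature practice (e.g., PNCC-style features).

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the perceptual mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l:
            fb[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb

def robust_features(signal, sr=16000, frame_len=400, hop=160,
                    n_fft=512, n_filters=40):
    # Frame the signal, apply a Hamming window, take the power spectrum.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Integrate energy in auditory-motivated mel bands.
    energies = power @ mel_filterbank(n_filters, n_fft, sr).T
    # Power-law compression (1/15 exponent, PNCC-style) is less sensitive
    # to low-energy noise than the log nonlinearity used in MFCCs.
    return np.maximum(energies, 1e-10) ** (1.0 / 15.0)

# Example: one second of a noisy 440 Hz tone at 16 kHz.
t = np.arange(16000) / 16000.0
x = np.sin(2 * np.pi * 440.0 * t) + 0.01 * np.random.randn(16000)
feats = robust_features(x)
print(feats.shape)  # (98, 40): 98 frames x 40 mel bands
```

The compressed band energies would typically be mean-normalized per utterance before being fed to a DNN acoustic model.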
dc.format
application/pdf  
dc.language.iso
eng  
dc.publisher
Springer Nature Switzerland AG  
dc.rights
info:eu-repo/semantics/restrictedAccess  
dc.rights.uri
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/  
dc.subject
SPEECH RECOGNITION  
dc.subject
ROBUST FEATURES  
dc.subject
DEEP LEARNING  
dc.subject.classification
Information Science and Bioinformatics  
dc.subject.classification
Computer and Information Sciences  
dc.subject.classification
NATURAL AND EXACT SCIENCES  
dc.title
Robust features in deep-learning-based speech recognition  
dc.type
info:eu-repo/semantics/publishedVersion  
dc.type
info:eu-repo/semantics/bookPart  
dc.type
info:ar-repo/semantics/parte de libro  
dc.date.updated
2022-07-25T15:38:32Z  
dc.journal.pagination
183-212  
dc.journal.pais
Switzerland  
dc.journal.ciudad
Cham  
dc.description.fil
Fil: Mitra, Vikramjit. SRI International; United States  
dc.description.fil
Fil: Franco, Horacio. SRI International; United States  
dc.description.fil
Fil: Stern, Richard M.. Carnegie Mellon University; United States  
dc.description.fil
Fil: Van Hout, Julien. SRI International; United States  
dc.description.fil
Fil: Ferrer, Luciana. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina  
dc.description.fil
Fil: Graciarena, Martin. SRI International; United States  
dc.description.fil
Fil: Wang, Wen. SRI International; United States  
dc.description.fil
Fil: Vergyri, Dimitra. SRI International; United States  
dc.description.fil
Fil: Alwan, Abeer. University of California at Los Angeles; United States  
dc.description.fil
Fil: Hansen, John H. L.. University of Texas; United States  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/url/https://link.springer.com/chapter/10.1007/978-3-319-64680-0_8  
dc.relation.alternativeid
info:eu-repo/semantics/altIdentifier/doi/https://doi.org/10.1007/978-3-319-64680-0_8  
dc.conicet.paginas
436  
dc.source.titulo
New Era for Robust Speech Recognition: Exploiting Deep Learning