Institutional Repository
CONICET Digital
Book Chapter

Robust features in deep-learning-based speech recognition

Book title: New era for robust speech recognition: exploiting deep learning

Mitra, Vikramjit; Franco, Horacio; Stern, Richard M.; Van Hout, Julien; Ferrer, Luciana; Graciarena, Martin; Wang, Wen; Vergyri, Dimitra; Alwan, Abeer; Hansen, John H. L.
Editors: Watanabe, Shinji; Delcroix, Marc; Metze, Florian; Hershey, John R.
Publication date: 2017
Publisher: Springer Nature Switzerland AG
ISBN: 978-3-319-64679-4
Language: English
Subject classification:
Information Science and Bioinformatics

Abstract

Recent progress in deep learning has revolutionized speech recognition research, with Deep Neural Networks (DNNs) becoming the new state of the art for acoustic modeling. DNNs offer significantly lower speech recognition error rates than the previously used Gaussian Mixture Models (GMMs). Unfortunately, DNNs are data sensitive, and unseen data conditions can deteriorate their performance. Acoustic distortions such as noise, reverberation, and channel differences add variation to the speech signal, which in turn degrades DNN acoustic-model performance. A straightforward solution is to train the DNN models with these types of variation, which typically yields quite impressive performance. However, anticipating such variation is not always possible, and in those cases DNN recognition performance can deteriorate quite sharply. To avoid subjecting acoustic models to such variation, robust features have traditionally been used to create an invariant representation of the acoustic space. Most commonly, robust feature-extraction strategies have explored three principal areas: (a) enhancing the speech signal, with the goal of improving the perceptual quality of speech; (b) reducing the distortion footprint, using signal-theoretic techniques to learn the distortion characteristics and subsequently filter them out of the speech signal; and (c) leveraging knowledge from auditory neuroscience and psychoacoustics, by using robust features inspired by auditory perception. In this chapter, we present prominent robust feature-extraction strategies explored by the speech recognition research community and discuss their relevance to coping with data-mismatch problems in DNN-based acoustic modeling. We present results demonstrating the efficacy of robust features in the new paradigm of DNN acoustic models, and we discuss future directions in feature design for making speech recognition systems more robust to unseen acoustic conditions. Note that the approaches discussed in this chapter focus primarily on single-channel data.
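Of the three strategies the abstract lists, the auditory-perception route is the easiest to illustrate concretely. The sketch below is not taken from the chapter; the function names and parameter values are illustrative assumptions. It computes log-mel filterbank features (a perceptually motivated front end) with per-utterance mean normalization, a basic robustness step: a stationary channel distortion appears as a constant offset in the log-spectral domain, so subtracting the utterance mean largely cancels it.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the perceptual mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, center, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, center):
            fb[i - 1, k] = (k - lo) / max(center - lo, 1)
        for k in range(center, hi):
            fb[i - 1, k] = (hi - k) / max(hi - center, 1)
    return fb

def robust_logmel(signal, sr=16000, n_fft=512, hop=160, n_filters=40):
    """Log-mel features with per-utterance mean normalization."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2  # per-frame power spectrum
    logmel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Mean normalization: a stationary channel is a constant offset in the
    # log domain, so subtracting the utterance mean largely removes it.
    return logmel - logmel.mean(axis=0, keepdims=True)
```

The signal-enhancement and distortion-filtering strategies surveyed in the chapter operate earlier in this pipeline, modifying the waveform or the power spectrum before the filterbank stage.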
Keywords: SPEECH RECOGNITION, ROBUST FEATURES, DEEP LEARNING
Associated files
Size: 284.3 KB
Format: PDF
License
info:eu-repo/semantics/restrictedAccess. Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Identifiers
URI: http://hdl.handle.net/11336/163169
URL: https://link.springer.com/chapter/10.1007/978-3-319-64680-0_8
DOI: https://doi.org/10.1007/978-3-319-64680-0_8
Collections
Book chapters (OCA CIUDAD UNIVERSITARIA)
Book chapters of OFICINA DE COORDINACION ADMINISTRATIVA CIUDAD UNIVERSITARIA
Citation
Mitra, Vikramjit; Franco, Horacio; Stern, Richard M.; Van Hout, Julien; Ferrer, Luciana; et al.; Robust features in deep-learning-based speech recognition; Springer Nature Switzerland AG; 2017; 183-212