Bioinspired sparse spectro-temporal representation of speech for robust classification

Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo

doi:10.1016/j.csl.2012.02.002

Article

Publication date: 10/2012

Publisher: Academic Press Ltd - Elsevier Science Ltd

Journal: Computer Speech And Language

ISSN: 0885-2308

Language: English

Resource type: Published article

Subject classification:

Other Electrical Engineering, Electronic Engineering and Information Engineering

Abstract

In this work, a first approach to a robust phoneme recognition task by means of a biologically inspired feature extraction method is presented. The proposed technique provides an approximation to the speech signal representation at the auditory cortical level. It is based on an optimal dictionary of atoms, estimated from auditory spectrograms, and the Matching Pursuit algorithm to approximate the cortical activations. This provides a sparse coding with intrinsic noise robustness, which can therefore be exploited when using the system in adverse environments. The recognition task consisted of classifying a set of 5 easily confused English phonemes, in both clean and noisy conditions. Multilayer perceptrons were trained as classifiers and the performance was compared to other classic and robust parameterizations: the auditory spectrogram, a probabilistic optimum filtering on Mel frequency cepstral coefficients, and the perceptual linear prediction coefficients. Results showed a significant improvement in the recognition rate of clean and noisy phonemes by the cortical representation over these other parameterizations.
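
The abstract names Matching Pursuit as the step that turns a dictionary of atoms into a sparse code. A minimal sketch of the generic, textbook form of that algorithm follows, for illustration only. It is not the authors' implementation: the 4-sample atoms and the test signal are hypothetical toy values, whereas the paper estimates its dictionary from auditory spectrograms, and the helper names `dot`, `normalize`, and `matching_pursuit` are assumptions introduced here.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation of `signal` over unit-norm atoms.

    Returns one coefficient per atom (mostly zeros) and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_atoms):
        # Correlate every atom with what is still unexplained.
        corrs = [dot(residual, atom) for atom in dictionary]
        # Select the atom with the largest absolute correlation.
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        if corrs[best] == 0.0:
            break  # nothing left that any atom can explain
        # Accumulate its coefficient and subtract its contribution.
        coeffs[best] += corrs[best]
        residual = [r - corrs[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Hypothetical toy dictionary (not the learned dictionary from the paper).
atoms = [normalize(a) for a in (
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
)]

# Toy signal built from two of the atoms, so a 2-atom code can describe it.
signal = [3.0 * a - 2.0 * b for a, b in zip(atoms[0], atoms[2])]

coeffs, residual = matching_pursuit(signal, atoms, n_atoms=2)
```

In this toy case the two selected atoms are the ones the signal was built from, so most coefficients stay at zero. In a setup like the one the abstract describes, the coefficient vector would play the role of the approximated cortical activations passed on to the classifier.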

Keywords: APPROXIMATED AUDITORY CORTICAL REPRESENTATION, ROBUST PHONEME RECOGNITION, SPARSE CODING

Associated files

Size: 1005 KB

Format: PDF

License

Except where explicitly stated otherwise, this item is published under the following license: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)

Identifiers

URI: http://hdl.handle.net/11336/96495

URL: http://www.sciencedirect.com/science/article/pii/S0885230812000125

DOI: http://dx.doi.org/10.1016/j.csl.2012.02.002

Collections

Articles (CCT - SANTA FE)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - SANTA FE

Citation

Martínez, César Ernesto; Goddard, J.; Milone, Diego Humberto; Rufiner, Hugo Leonardo; Bioinspired sparse spectro-temporal representation of speech for robust classification; Academic Press Ltd - Elsevier Science Ltd; Computer Speech And Language; 26; 5; 10-2012; 336-348
