Institutional Repository
CONICET Digital
Article

Emilia: A speech corpus for Argentine Spanish text to speech synthesis

Torres, Humberto Maximiliano; Gurlekian, Jorge Alberto; Evin, Diego Alexis; Cossio Mercado, Christian Gustavo
Publication date: 09/2019
Publisher: Springer
Journal: Language Resources and Evaluation
ISSN: 1574-020X
e-ISSN: 1574-0218
Language: English
Resource type: Published article
Subject classification:
Other Electrical Engineering, Electronic Engineering and Information Engineering

Abstract

This paper introduces Emilia, a speech corpus created to build a female voice in the Spanish spoken in Buenos Aires for the Aromo text-to-speech system. Aromo is a unit selection text-to-speech system that employs diphones as units of synthesis. The key requirement and design criterion for Emilia was to synthesize any Spanish text into high-quality speech with a minimum corpus size. The text corpus was designed to guarantee phonetic and prosodic coverage. A three-stage strategy was used: in the first stage, 741 sentences were designed to include all of the syllables of Spanish as spoken in Argentina, with and without stress, and in all positions within the word; in the second stage, 852 sentences were added to balance the distribution of the diphones; and in the third and final stage, after a perceptual evaluation of the quality of the synthesized speech, 625 sentences were added to achieve the specified unit coverage and to introduce sentences with more complex syntactic and prosodic structures. Issues from all three corpus building stages are reported. The paper also presents the results of the perceptual quality evaluations of the synthesized voice. Emilia has a duration of three hours and 15 minutes; the quality of its speech synthesized with the Aromo system is similar to that obtained with commercial systems, with a real-time ratio of less than one.
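The stage sizes quoted in the abstract can be tallied with a few lines of Python. This is only an illustrative sketch: the sentence counts and the total duration come from the abstract, while the names `STAGE_SENTENCES`, `TOTAL_DURATION_S` and `corpus_summary` are hypothetical, and the per-sentence average assumes that the reported three hours and 15 minutes cover the sentences of all three stages, which the abstract does not state explicitly.

```python
# Sentence counts per design stage, as reported in the abstract.
STAGE_SENTENCES = {
    "stage 1 (syllable coverage)": 741,
    "stage 2 (diphone balancing)": 852,
    "stage 3 (unit coverage, complex structures)": 625,
}

# Total duration reported in the abstract: 3 hours and 15 minutes, in seconds.
TOTAL_DURATION_S = 3 * 3600 + 15 * 60


def corpus_summary(stages, duration_s):
    """Return total sentence count, per-stage shares, and mean seconds per sentence.

    Assumption (not confirmed by the source): duration_s covers every sentence
    in `stages`.
    """
    total = sum(stages.values())
    shares = {name: count / total for name, count in stages.items()}
    avg_seconds = duration_s / total
    return total, shares, avg_seconds


total_sentences, stage_shares, avg_seconds = corpus_summary(
    STAGE_SENTENCES, TOTAL_DURATION_S
)

print(f"Total sentences: {total_sentences}")
for name, share in stage_shares.items():
    print(f"  {name}: {share:.1%}")
print(f"Mean duration per sentence: {avg_seconds:.2f} s")
```

Under that assumption the three stages add up to 2218 sentences, or roughly 5.3 seconds of speech per sentence on average.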
Keywords: ARGENTINE SPANISH, PHONETIC CORPUS, PHONETIC TRANSCRIPTION, SPEECH CORPUS DESIGN, TEXT-TO-SPEECH
Associated files
Size: 485.4 KB
Format: PDF
License
info:eu-repo/semantics/restrictedAccess. Except where explicitly stated otherwise, this item is published under the following terms: Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Unported (CC BY-NC-SA 2.5)
Identifiers
URI: http://hdl.handle.net/11336/112712
URL: http://link.springer.com/10.1007/s10579-019-09447-7
DOI: http://dx.doi.org/10.1007/s10579-019-09447-7
Collections
Articles (CCT - CORDOBA)
Articles of CTRO.CIENTIFICO TECNOL.CONICET - CORDOBA
Citation
Torres, Humberto Maximiliano; Gurlekian, Jorge Alberto; Evin, Diego Alexis; Cossio Mercado, Christian Gustavo; Emilia: A speech corpus for Argentine Spanish text to speech synthesis; Springer; Language Resources And Evaluation; 53; 3; 9-2019; 419-447