Please use this identifier to cite or link to this item: http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1894
Title : Speech recognition using deep neural networks trained with non-uniform frame-level cost functions
Author : 31249
Publication date : Nov-2017
Publisher : IEEE
Abstract : This paper presents two new variations of the frame-level cost function for training a deep neural network in order to achieve better word error rates in speech recognition. The minimization function of a neural network is a salient aspect of machine learning research, and its improvement is a process of constant evolution. In the first proposed method, the conventional cross-entropy function is mapped to a non-uniform loss function based on its corresponding extropy (a complementary dual function), enhancing the frames whose assignment to specific senones (tied-triphone states in a hidden Markov model) is ambiguous. The second proposal is a fusion of the proposed mapped cross-entropy and the boosted cross-entropy function, which emphasizes frames with low target posterior probability. The approaches were evaluated on a custom mid-vocabulary, speaker-independent voice corpus. This dataset is used for recognition of digit strings and personal name lists in Spanish from the north-central part of Mexico on a connected-words phone dialing task. Relative word error rate improvements of 12.3% and 10.7% are obtained with the two proposed approaches, respectively, over the conventional, well-established cross-entropy objective function.
URI : http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1894
ISSN : 2573-0770
Other identifiers : info:eu-repo/semantics/publishedVersion
Appears in collections: *Documentos Académicos*-- M. en Ciencias del Proc. de la Info.
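
The abstract above names three quantities: cross-entropy, extropy, and boosted cross-entropy. It does not give formulas, so the following is only a minimal sketch under stated assumptions, not the authors' cost functions. It uses the standard definitions of per-frame cross-entropy and of extropy, and assumes one common form of boosted cross-entropy (the cross-entropy scaled by one minus the target posterior). The function names and the example posteriors are hypothetical, and the paper's mapping from cross-entropy to an extropy-based loss is not reproduced here.

```python
import math


def cross_entropy(posteriors, target):
    """Per-frame cross-entropy against a one-hot senone target: -log p[target]."""
    return -math.log(posteriors[target])


def extropy(posteriors):
    """Extropy of a distribution: -sum((1 - p_i) * log(1 - p_i)).

    Terms with p_i == 1 contribute zero and are skipped to avoid log(0).
    """
    return -sum((1.0 - p) * math.log(1.0 - p) for p in posteriors if p < 1.0)


def boosted_cross_entropy(posteriors, target):
    """ASSUMED boosted form: cross-entropy scaled by (1 - p[target]).

    Frames with a low target posterior keep most of their loss, while
    confidently classified frames are down-weighted.
    """
    p_t = posteriors[target]
    return -(1.0 - p_t) * math.log(p_t)


# Hypothetical senone posteriors for two frames; the target senone is index 0.
confident = [0.90, 0.05, 0.05]
ambiguous = [0.40, 0.35, 0.25]

for name, frame in (("confident", confident), ("ambiguous", ambiguous)):
    print(
        name,
        round(cross_entropy(frame, 0), 4),
        round(extropy(frame), 4),
        round(boosted_cross_entropy(frame, 0), 4),
    )
```

In this sketch the ambiguous frame has both the higher extropy and the larger boosted loss, which matches the abstract's description of emphasizing frames with uncertain senone membership or low target posterior probability.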

Files in this item:
File | Description | Size | Format
72_Becerra_DelaRosa IEEEROPEC P1 2017.pdf | Becerra_DelaRosa IEEEROPEC P1 2017 | 373.94 kB | Adobe PDF

This item is subject to a Creative Commons license.