Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition

Becerra, Aldonso; De la Rosa Vargas, José Ismael; González Ramírez, Efrén; Pedroza, David; Escalante, N. Iracemi

dc.contributor	31249	es_ES
dc.contributor.other	0000-0002-7337-8974	es_ES
dc.contributor.other	https://orcid.org/0000-0002-7337-8974
dc.contributor.other	https://orcid.org/0000-0002-8060-6170
dc.coverage.spatial	Global	es_ES
dc.creator	Becerra, Aldonso
dc.creator	De la Rosa Vargas, José Ismael
dc.creator	González Ramírez, Efrén
dc.creator	Pedroza, David
dc.creator	Escalante, N. Iracemi
dc.date.accessioned	2020-04-16T19:13:20Z
dc.date.available	2020-04-16T19:13:20Z
dc.date.issued	2018-10
dc.identifier	info:eu-repo/semantics/publishedVersion	es_ES
dc.identifier.issn	1380-7501	es_ES
dc.identifier.issn	1573-7721	es_ES
dc.identifier.uri	http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/1714
dc.identifier.uri	https://doi.org/10.48779/mz95-hr57
dc.description.abstract	The aim of this paper is to present two new variations of the frame-level cost function for training a deep neural network in order to achieve better word error rates in speech recognition. Optimization methods and their minimization functions are underlying aspects to consider when working with neural networks, so improving them is one of the salient objectives of researchers, and this paper addresses part of that problem. The first proposed framework is based on the concept of extropy, the complementary dual function of an uncertainty measure. The conventional cross-entropy function can be mapped to a non-uniform loss function based on its corresponding extropy, emphasizing the frames whose assignment to specific senones is ambiguous. The second proposal fuses the mapped cross-entropy function with the idea of boosted cross-entropy, which emphasizes frames with low target posterior probability.	es_ES
dc.language.iso	eng	es_ES
dc.publisher	Springer	es_ES
dc.relation	https://doi.org/10.1007/s11042-018-5917-5	es_ES
dc.relation.uri	generalPublic	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.source	Multimedia Tools and Applications, Vol. 77, No. 20, pp. 27231-27267	es_ES
dc.subject.classification	ENGINEERING AND TECHNOLOGY [7]	es_ES
dc.subject.other	Speech recognition	es_ES
dc.subject.other	Neural networks	es_ES
dc.subject.other	Deep learning	es_ES
dc.subject.other	Cross-entropy	es_ES
dc.subject.other	Extropy	es_ES
dc.subject.other	Frame-level loss function	es_ES
dc.title	Training deep neural networks with non-uniform frame-level cost function for automatic speech recognition	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
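
The abstract describes extropy as the complementary dual of an uncertainty measure and speaks of emphasizing frames with ambiguous senone assignments or low target posterior probability. The sketch below is only an illustration of those ideas, not the loss functions proposed in the paper: it uses the standard definitions of Shannon entropy and extropy, and the function `weighted_cross_entropy` with its `(1 - p_target)` weight is a hypothetical stand-in chosen to show what a non-uniform frame-level weighting can look like. The posterior vectors are made-up toy values.

```python
import math


def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def extropy(p):
    """Extropy J(p) = -sum_i (1 - p_i) * log(1 - p_i), in nats."""
    return -sum((1.0 - pi) * math.log(1.0 - pi) for pi in p if pi < 1.0)


def cross_entropy(p, target):
    """Conventional frame-level cross-entropy for a one-hot target senone."""
    return -math.log(p[target])


def weighted_cross_entropy(p, target):
    """Hypothetical non-uniform loss: cross-entropy scaled by (1 - p_target).

    Illustrative assumption only; this is NOT the formula from the paper.
    Frames with a low target posterior receive a larger weight.
    """
    return (1.0 - p[target]) * cross_entropy(p, target)


# Toy senone posteriors for two frames (assumed values); target senone is index 0.
confident = [0.90, 0.05, 0.05]
ambiguous = [0.40, 0.35, 0.25]

for name, post in (("confident", confident), ("ambiguous", ambiguous)):
    print(
        name,
        round(entropy(post), 4),
        round(extropy(post), 4),
        round(cross_entropy(post, 0), 4),
        round(weighted_cross_entropy(post, 0), 4),
    )
```

With these toy values, the ambiguous frame has higher entropy and extropy than the confident one, and the hypothetical weight makes its contribution to the loss proportionally larger than under plain cross-entropy.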