Effects of Speech Duration on Preserving the Identity of Synthesized Voice

Supmee P.; Suwanmalai K.; Hanchoenkul N.; Sae-Bae N.; Khomkham B.

Please use this identifier to cite or link to this item: https://ir.swu.ac.th/jspui/handle/123456789/29395

Full metadata record

DC Field	Value	Language
dc.contributor.author	Supmee P.
dc.contributor.author	Suwanmalai K.
dc.contributor.author	Hanchoenkul N.
dc.contributor.author	Sae-Bae N.
dc.contributor.author	Khomkham B.
dc.contributor.other	Srinakharinwirot University
dc.date.accessioned	2023-11-15T02:08:32Z	-
dc.date.available	2023-11-15T02:08:32Z	-
dc.date.issued	2023
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85169289781&doi=10.1109%2fJCSSE58229.2023.10202157&partnerID=40&md5=9b99f0441915f4efbe245c2f5f87f506
dc.identifier.uri	https://ir.swu.ac.th/jspui/handle/123456789/29395	-
dc.description.abstract	This paper studies the identity-preserving performance of a speech synthesis model when the duration of Thai-language speech samples is varied. Two experiments were designed to investigate this property. The first experiment assesses the identity-preserving performance of the identity vector derived from the speech synthesis model. The results suggest that longer Thai speech signals yield better identity vectors, as shorter speech signals result in identity vectors that are more dispersed. The second experiment directly assesses the identity-preserving performance of the synthesized voice signal generated by the model in independent speaker recognition systems. The results similarly suggest that longer Thai speech signals yield better identity-preserving voice signals, as shorter speech signals result in synthesized voice signals with larger distances from the real voice signals. Therefore, the trade-off between usability and quality of synthesized voices must be carefully considered when developing applications from such models. In addition, the investigation framework used in this study could be used to evaluate newly developed identity-preserving speech synthesis models. © 2023 IEEE.
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.subject	Speaker recognition
dc.subject	Speech synthesis
dc.subject	Voice quality
dc.subject	Voice signal
dc.title	Effects of Speech Duration on Preserving the Identity of Synthesized Voice
dc.type	Conference paper
dc.rights.holder	Scopus
dc.identifier.bibliograpycitation	Proceedings of JCSSE 2023 - 20th International Joint Conference on Computer Science and Software Engineering. Vol , No. (2023), p.242-246
dc.identifier.doi	10.1109/JCSSE58229.2023.10202157
Appears in Collections:	Scopus 2023