Departmental Bulletin Paper Real-time Voice Adaptation with Abstract Normalization and Sound-indexed Based Search

MIDTLYNG, Mads Alexander

11, pp. 1-6, 2016-03-24, Graduate School of Computer and Information Sciences, Hosei University
This paper proposes a two-step system for real-time voice adaptation in the field of speech processing. The first step records and pre-processes speech to form a voice profile. The second step takes real-time voice input and adapts it to a target voice. Because the structure of individual voices varies habitually, this paper suggests a method for converting voices into a comparable format. The new method, called abstract normalization, cuts the voice data into smaller sounds. From these sounds, an abstracted, simplified version of the data is generated using a level of abstraction together with parameter fitting. The normalized data is used to generate a sound-index, a sequence hash that represents the current object in a simpler form. The indices are used to compare different sounds and voices for adaptation. This effectively turns the speech-related challenges into a search problem rather than a biometric one. To assess the approach, voice profile data are compared against each other as a way of verifying the sound-index. Finally, real-time voice input at alternating levels of abstraction is run against a voice profile created with Norwegian words. Adaptation success is measured as a percentage, and experimental results show that while accuracy is not yet excellent, the concept was validated.
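
To make the idea of a sound-index more concrete, the following is a minimal illustrative sketch and not the paper's implementation. The abstract does not specify how sounds are cut, how the level of abstraction is applied, or which hash is used, so everything here is an assumption: fixed-size chunks stand in for the "smaller sounds", quantizing the mean absolute amplitude into `level` buckets stands in for abstraction with parameter fitting, and SHA-1 over the resulting symbol sequence stands in for the sequence hash. All function names, the sample values, and the word label are hypothetical.

```python
import hashlib
from typing import Dict, List, Sequence


def cut_into_sounds(samples: Sequence[float], size: int) -> List[List[float]]:
    """Cut a signal into fixed-size chunks (assumed stand-in for 'smaller sounds')."""
    return [list(samples[i:i + size]) for i in range(0, len(samples), size)]


def abstract_symbols(samples: Sequence[float], size: int, level: int) -> List[int]:
    """Reduce each chunk to one integer in [0, level - 1].

    Assumption: amplitudes lie in [-1, 1]; the symbol is the chunk's mean
    absolute amplitude quantized into `level` buckets. A lower `level`
    gives a coarser, more forgiving representation.
    """
    symbols = []
    for sound in cut_into_sounds(samples, size):
        mean_abs = sum(abs(x) for x in sound) / len(sound)
        symbols.append(min(int(mean_abs * level), level - 1))
    return symbols


def sound_index(samples: Sequence[float], size: int, level: int) -> str:
    """Hash the symbol sequence into a single lookup key (assumed hash: SHA-1)."""
    key = ",".join(str(s) for s in abstract_symbols(samples, size, level))
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two hypothetical recordings of the same word by different speakers.
    profile_voice = [0.10, 0.14, 0.50, 0.54]
    input_voice = [0.16, 0.20, 0.56, 0.60]

    # Voice profile: sound-index -> label, so matching becomes a dictionary search.
    profile: Dict[str, str] = {sound_index(profile_voice, size=2, level=4): "hei"}

    # At a coarse level of abstraction the two voices share an index ...
    print(profile.get(sound_index(input_voice, size=2, level=4)))
    # ... while at a finer level they no longer match.
    print(profile.get(sound_index(input_voice, size=2, level=20)))
```

Under these assumptions, comparing two voices reduces to an exact-match lookup on the index, which is the sense in which the problem becomes a search problem. Trying several levels of abstraction in turn is one plausible reading of "alternating levels of abstraction": coarser levels tolerate more speaker variation at the cost of more collisions between different sounds.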
