Conference Paper Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features

浅見, 太一 (Asami, Taichi); 増村, 亮 (Masumura, Ryo); 青野, 裕司 (Aono, Yushi); 篠田, 浩一 (Shinoda, Koichi)

pp.1320 - 1324 , 2016-09 , ISCA
The repeated use of out-of-vocabulary (OOV) words in a spoken document seriously degrades a speech recognizer's performance. This paper provides a novel method for accurately detecting such recurrent OOV words. Standard OOV word detection methods classify each word segment into in-vocabulary (IV) or OOV. This word-by-word classification tends to be affected by sudden vocal irregularities in spontaneous speech, triggering false alarms. To avoid this sensitivity to the irregularities, our proposal focuses on consistency of the repeated occurrence of OOV words. The proposed method preliminarily detects recurrent segments, segments that contain the same word, in a spoken document by open vocabulary spoken term discovery using a phoneme recognizer. If the recurrent segments are OOV words, features for OOV detection in those segments should exhibit consistency. We capture this consistency by using the mean and variance (distribution) of features (DOF) derived from the recurrent segments, and use the DOF for IV/OOV classification. Experiments illustrate that the proposed method's use of the DOF significantly improves its performance in recurrent OOV word detection.

Index Terms: speech recognition, OOV word detection, recurrent OOV words, distribution of features
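
As a rough illustration of the distribution-of-features (DOF) idea described in the abstract, the sketch below computes the per-dimension mean and variance over the feature vectors of one group of recurrent segments and concatenates them into a single vector. The function name, the input layout, the example values, and the use of population variance are assumptions made for illustration; they are not details taken from the paper.

```python
from statistics import fmean, pvariance


def distribution_of_features(segment_features):
    """Return per-dimension means followed by per-dimension variances.

    segment_features: list of equal-length feature vectors, one per
    recurrent segment in a group (hypothetical input layout).
    """
    if not segment_features:
        raise ValueError("need at least one segment")
    # Transpose so that each tuple holds one feature dimension across segments.
    dims = list(zip(*segment_features))
    means = [fmean(d) for d in dims]
    # Population variance is an assumption; a low value indicates that the
    # feature is consistent across the repeated occurrences.
    variances = [pvariance(d) for d in dims]
    return means + variances


# Three recurrent segments, two made-up OOV-detection features each.
cluster = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.1]]
dof = distribution_of_features(cluster)
```

The resulting vector could then be passed to any IV/OOV classifier in place of, or alongside, the per-segment features.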
