The Role of Phonological Processes and Acoustic Confusability in Phone Errors in Children’s ASR
This paper examines the extent to which computer speech recognition errors for children’s speech can be attributed to common phonological effects associated with language acquisition.
September 6, 2016
WOCCI 2016
Authors
Eva Fringi (Disney Research/University of Birmingham)
Jill Fain Lehman (Disney Research)
Martin Russell (Disney Research/University of Birmingham)
This paper examines the extent to which computer speech recognition errors for children’s speech can be attributed to common phonological effects associated with language acquisition. Recognition results are presented for three corpora of children’s speech: two comprising recordings of American English spoken by five- to nine-year-olds, and one comprising recordings of British English spoken by children aged five and six. The results are compared with adult reference confusion matrices based on TIMIT for the first two experiments, and with confusion matrices for British adults and children with good speech for the third. The results appear to be influenced by three factors: (i) confusions that are predictable from phonological factors associated with language acquisition also arise from acoustic confusability (e.g. /k/ -> /t/), (ii) the frequency of phonological errors is expected to decrease with increasing age, and (iii) an accurate recogniser is more likely than a less accurate one to detect a phonological error when it occurs. Overall, the percentage of errors attributable to phonological processes remains approximately constant in each experiment. However, the proportion of these that differ significantly from reference patterns increases with recognition accuracy and is greater for children who are judged to have poor speech.