The trough effect: Can we predict tongue lowering from acoustic data alone? Yolanda Vazquez Alvarez
Overview 1. Background on the ‘trough effect’ 2. Aim of this experiment 3. Experimental method & results 4. Acoustic-to-articulatory mapping 5. Conclusions
Background – The ‘trough effect’ • The ‘trough effect’ occurs in symmetrical VCV sequences and has been described as ‘a momentary deactivation of the tongue movement during the consonant closure’ (Bell-Berti & Harris, 1974; Gay, 1975)
Background – Acoustic evidence • Lindblom et al. (2002) collected direct measures of the F2 trajectories from symmetrical VCV utterances (V=/i/)
Background – Ultrasound evidence • Used QMUC’s data from the trough experiment • 3 annotation points corresponding to 3 different tongue contours • 2 measurements of tongue displacement (MTD) were carried out for these 3 different contours
Background – Ultrasound evidence • MTDs were significantly different from each other for /iCi/ sequences: /ipi/ (t(9) = -8.295, p < 0.010), /ibi/ (t(9) = -9.774, p < 0.010)
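A t statistic with 9 degrees of freedom is what a paired comparison over 10 tokens would give, although the slide does not state the test design. As a minimal sketch under that assumption, the statistic could be computed as follows. The function name `paired_t` and the displacement values are hypothetical and are not the QMUC data.

```python
import math


def paired_t(first, second):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the pairwise differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    std_err = math.sqrt(var_diff / n)
    return mean_diff / std_err, n - 1


# Hypothetical displacement values, invented purely for illustration.
mtd_a = [1.0, 2.0, 3.0, 4.0]
mtd_b = [2.0, 4.0, 5.0, 7.0]

t_value, df = paired_t(mtd_a, mtd_b)
print(f"t({df}) = {t_value:.3f}")
```

With 10 pairs instead of 4, the same function would return df = 9, matching the degrees of freedom reported on the slide.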
Background – Advantages & disadvantages of both techniques • Acoustics: - Good time resolution - Doesn’t require specialised equipment to acquire the data - Disadvantage: no visualisation of the tongue • Ultrasound: - Tongue contour visualisation - Physical measurement - Disadvantage: need for frame-by-frame analysis of the tongue recording
Aim of this experiment Given the advantages of acoustic measurements: How confident can we be that the acoustic measurement of the tongue lowering gives us a true representation of the trough effect?
Experimental method • Subjects: 5 native speakers of English, various accents • Data: symmetrical VCV sequences, C = /p/, /b/ and V = /i/ • Repetitions: 2 reps, n = 20 • Acoustic analysis - 4 F2 annotation points: V1mid, V1offset and F2 onset, V2mid - 2 F2 measurements: F2V1-C and F2C-V2 • Ultrasound analysis - 3 annotation points: V1mid, Cmid and V2sym - 2 distance measurements: V1-C and C-V2
Experiment – Results Correlation of F2 values and ultrasound data (V1-C) • Pearson correlation of V1-C and F2V1-C was significant (r(18) = .496, r² = 0.25, p < .05), predicting 25% of the variance in tongue lowering • Using both F2 predictors increased the correlation coefficient for V1-C, predicting 43% of the variance in tongue lowering • Pearson correlation of C-V2 and F2C-V2 was not significant
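The reported V1-C correlation can be sanity-checked from r and the sample size alone. The sketch below squares the reported r to get the proportion of variance explained and converts r to a t statistic with n - 2 degrees of freedom. The helper names `pearson_r` and `t_from_r` are illustrative, and 2.101 is the standard two-tailed critical t value for 18 degrees of freedom at the .05 level. No raw measurements from the experiment are used.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))


# Values taken from the slide: r(18) = .496, so n = 20 tokens.
r_reported = 0.496
n_tokens = 20

r_squared = r_reported ** 2            # proportion of variance explained
t_stat = t_from_r(r_reported, n_tokens)

# Two-tailed critical t for df = 18 at alpha = .05 (standard table value).
T_CRIT_DF18 = 2.101
is_significant = abs(t_stat) > T_CRIT_DF18

print(f"r^2 = {r_squared:.3f}, t(18) = {t_stat:.3f}, significant: {is_significant}")
```

The squared correlation rounds to the 25% quoted on the slide, and the t statistic exceeds the critical value, consistent with p < .05. `pearson_r` is included only to show how r itself would be obtained from paired distance and F2 measurements.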
Experiment – Results • 3 possible reasons why we couldn’t predict the rise for C-V2: • Start of the tongue rise is in the closure so F2 can’t show information about its possible movement • The measuring point was mainly on the release for /p/ in the ultrasound data but we used V2mid because otherwise we wouldn’t have sufficient F2 data • Ultrasound time resolution may be too poor to capture the rising of the tongue at the appropriate moment
Acoustic-to-articulatory mapping • Korin Richmond et al. (2003) at CSTR, Edinburgh Univ., used a multilayer perceptron (MLP) neural network to estimate articulatory trajectories • The neural network was trained on articulatory data (EMA) and acoustic data where articulatory feature vectors (x,y) were normalised to lie in the range [0.1,0.9]
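Normalising a feature into [0.1, 0.9] can be illustrated with a simple linear rescaling. This is a minimal sketch that assumes per-dimension min-max scaling. The slide does not say how the normalisation was actually computed, and the function name `scale_to_range` and the sample coordinates are hypothetical.

```python
def scale_to_range(values, lo=0.1, hi=0.9):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot rescale a constant sequence")
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


# Hypothetical y-coordinates of an articulator, invented for illustration.
raw_y = [0.0, 5.0, 10.0]
scaled_y = scale_to_range(raw_y)
print(scaled_y)
```

Keeping targets away from 0 and 1 is a common choice when a network's output units saturate at those extremes; whether that was the motivation here is not stated on the slide.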
Acoustic-to-articulatory mapping • The MLP was applied to the acoustic data from the ultrasound experiment • Despite being trained on a different speaker, the trough phenomenon could be observed in the MLP estimates for the y-coordinates of tongue body movement
Acoustic-to-articulatory mapping • Annotation times from the ultrasound measurement points were used to compare the estimated tongue positions from the MLP • A tongue lowering and rising was observed in the MLP plots but no statistically significant results were obtained [Figure: MLP plots for /ibi/ and /ipi/, with V1mid, Cmid and V2sym marked]
Conclusions • Acoustic information (F2) may be missing for crucial articulatory movement. It is hard to map acoustic change onto articulatory change • Current ultrasound time resolution can be too poor to provide information about rapid articulatory change • However, a combined approach can help improve both techniques
References • Bell-Berti, F. & Harris, K. (1974). More on the motor organization of speech gestures. Haskins Laboratories: Status Report on Speech Research, SR-37/38, 73-77. • Gay, T. (1975). Some electromyographic measures of coarticulation in VCV-utterances. Haskins Laboratories: Status Report on Speech Research, SR-44, 137-145. • Lindblom, B., Sussman, H., Modarressi, G. & Burlingame, E. (2002). The trough effect: Implications for motor programming. Phonetica, 59, 245-262. • Richmond, K., King, S. & Taylor, P. (2003). Modelling the uncertainty in recovering articulation from acoustics. Computer Speech and Language, 17, 153-172.
Acknowledgements Thanks go to SHS at QMUC in Edinburgh for the use of the ultrasound data from the trough experiment. Also, I would like to thank Korin Richmond at CSTR in Edinburgh for his interest and help with the processing of the acoustic data using the MLP neural network.