Part four: Probabilistic “Sliding template” models for indirect vowel normalization, Ch. 16.3-16.4
16.3 Normalization with a latent Ψs
16.3.1 Method 2: Unrestricted (or uniform prior) optimization over Ψs
Need the tokens from all of a speaker’s vowel categories to estimate Ψs
Several methods
Related to methods of vocal tract length normalization
Spectral scaling
Spectral warping in the ASR literature (Westphal 1997)
The intermediate expression Pv(2) as an implicit shorthand for
Fuller expression
arg max [P(Gt | v, Ψs)]
Both Equation (11) and Equation (12)
Choose the vowel that looks best when it tries to look best
Equation (11)
For each vowel, the template is slid along a track on the table
Equation (12)
The vowel patch with the greatest light transmission
The scheme imposes no restrictions or preferences on Ψs
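A minimal sketch of the "looks best when it tries to look best" rule, under assumptions that are not taken from the chapter: Gt is a pair of log formants, Ψs is a single additive shift on the log scale, each vowel template is an isotropic Gaussian, and the template means, SIGMA, and the search grid are made-up illustrative values. The function names are hypothetical.

```python
import numpy as np

# Hypothetical template means (log F1, log F2); illustrative numbers only.
TEMPLATES = {
    "i": np.log([300.0, 2300.0]),
    "a": np.log([750.0, 1300.0]),
    "u": np.log([320.0, 900.0]),
}
SIGMA = 0.1  # assumed common standard deviation in log Hz


def log_lik(g, vowel, psi):
    """Log density of token g under the template for `vowel` slid by psi."""
    d = g - (TEMPLATES[vowel] + psi)
    return -0.5 * np.sum(d ** 2) / SIGMA ** 2 - g.size * np.log(SIGMA * np.sqrt(2 * np.pi))


def classify_unrestricted(g, psi_grid):
    """Let every vowel pick its own best psi, then pick the best vowel."""
    best = {}
    for v in TEMPLATES:
        scores = [log_lik(g, v, psi) for psi in psi_grid]
        k = int(np.argmax(scores))
        best[v] = (scores[k], psi_grid[k])
    winner = max(best, key=lambda v: best[v][0])
    return winner, best[winner][1]


psi_grid = np.linspace(-0.5, 0.5, 101)   # no value of psi is preferred
token = TEMPLATES["a"] + 0.2             # an /a/ token from a shifted speaker
vowel, psi_hat = classify_unrestricted(token, psi_grid)
print(vowel, psi_hat)
```

The grid search stands in for sliding the template along its track; every position on the grid is treated as equally acceptable, which is the uniform-prior reading of Method 2.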
16.3.2 Method 3: Optimization of Ψs with an informative prior
Evidence
Chiba and Kajiyama (1958)
Adult male Japanese vowels
Fu and Shannon (1999)
Identification accuracy follows an inverted U-shaped function.
Statistical modeling of perception
Extend the sliding template analogy to incorporate the prior probability.
Argumentation with equation (11)
Implicit prior distribution of Ψs
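One way to picture the effect of an informative prior is to add a log-prior term for Ψs to the score of each template position. The sketch below is illustrative only: the two templates, SIGMA, and the Gaussian prior (PRIOR_MEAN, PRIOR_SD) are assumptions, not values or formulas from the chapter, and constants shared by all candidates are dropped.

```python
import numpy as np

# Hypothetical templates (log F1, log F2); /o/ is close to a scaled-down /a/.
TEMPLATES = {"o": np.log([500.0, 900.0]), "a": np.log([700.0, 1200.0])}
SIGMA = 0.1                        # assumed template spread in log Hz
PRIOR_MEAN, PRIOR_SD = 0.0, 0.15   # assumed Gaussian prior on psi


def log_lik(g, vowel, psi):
    d = g - (TEMPLATES[vowel] + psi)
    return -0.5 * np.sum(d ** 2) / SIGMA ** 2   # shared constants dropped


def log_prior(psi):
    return -0.5 * ((psi - PRIOR_MEAN) / PRIOR_SD) ** 2


def classify(g, psi_grid, use_prior):
    """Grid search over (vowel, psi); the prior term is optional."""
    best_vowel, best_score = None, -np.inf
    for v in TEMPLATES:
        for psi in psi_grid:
            s = log_lik(g, v, psi) + (log_prior(psi) if use_prior else 0.0)
            if s > best_score:
                best_vowel, best_score = v, s
    return best_vowel


psi_grid = np.linspace(-0.6, 0.6, 121)
token = TEMPLATES["o"] + 0.35   # fits /o/ only with a large, a-priori unlikely shift
print(classify(token, psi_grid, use_prior=False))
print(classify(token, psi_grid, use_prior=True))
```

In this toy setup the unrestricted search accepts the extreme shift and answers /o/, while the prior makes the extreme shift costly and the answer changes to /a/.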
16.4 Estimation of Ψs informed by g0
16.4.1 Method 4: One-shot plug-in substitution for Ψs
Equation (14)
Empirical correlation between speakers’ g0 ranges
Their formant ranges
Figure 16.3
Information for the three datasets pooled
Equation (15)
Modify equation (7)
Closely related to that of Miller (1989)
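A rough sketch of a one-shot plug-in scheme: fit a line predicting Ψs from g0 across speakers, substitute the predicted value for Ψs, and classify with the template held at that position. Everything concrete here is an assumption for illustration: g0 is taken to be log f0, the training pairs are invented, the templates are hypothetical, and with equal isotropic Gaussians the most likely vowel is simply the nearest shifted template.

```python
import numpy as np

# Invented training data, one entry per speaker: mean f0 (Hz) and shift psi.
f0_train = np.array([110.0, 130.0, 200.0, 220.0, 260.0, 300.0])
psi_train = np.array([0.00, 0.04, 0.16, 0.18, 0.25, 0.28])

# Least-squares line psi ~ intercept + slope * g0, with g0 = log f0 (assumed).
slope, intercept = np.polyfit(np.log(f0_train), psi_train, 1)

# Hypothetical template means (log F1, log F2).
TEMPLATES = {
    "i": np.log([300.0, 2300.0]),
    "a": np.log([750.0, 1300.0]),
    "u": np.log([320.0, 900.0]),
}


def psi_from_g0(g0):
    """Point estimate of psi from g0; its estimation error is ignored."""
    return intercept + slope * g0


def classify_plug_in(g, g0):
    """Shift every template by the plug-in psi and pick the nearest one."""
    psi_hat = psi_from_g0(g0)
    dist = {v: np.sum((g - (m + psi_hat)) ** 2) for v, m in TEMPLATES.items()}
    return min(dist, key=dist.get), psi_hat


g0_t = np.log(240.0)
token = TEMPLATES["a"] + psi_from_g0(g0_t)
print(classify_plug_in(token, g0_t))
```

No search over Ψs takes place here, so the decision is only as good as the regression estimate.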
16.4.2 Method 5: MAP modulated by conditional probability of Ψs given g0
Ignores the error inherent in estimation of Ψs from g0
Equation (16)
16.4.3 Method 6: Joint maximization of P(v|Gt, Ψs) and P(Ψs, g0t)
Represent the conditional probability of Ψs given g0t via regression
Equation (21)
Is a complete expression for an estimate of the posterior mode of Ψs
Methods 5 and 6
Provide very convenient ways to
modulate the effects of g0 on vowel identification parametrically
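The parametric control mentioned above can be illustrated with a toy score that adds a conditional log-density of Ψs given g0 to the template fit. This is a sketch under stated assumptions, not the chapter's equations: g0 is taken to be log f0, the regression coefficients B0 and B1, the residual spread RESID_SD, the templates, and SIGMA are all hypothetical, and constants shared by all candidates are dropped.

```python
import numpy as np

TEMPLATES = {"o": np.log([500.0, 900.0]), "a": np.log([700.0, 1200.0])}
SIGMA = 0.1                              # assumed template spread in log Hz
B0, B1 = -0.3 * np.log(120.0), 0.3       # hypothetical regression of psi on g0
RESID_SD = 0.05                          # assumed spread of psi around the line


def score(g, vowel, psi, g0, resid_sd):
    """Template fit plus conditional log-density of psi given g0."""
    d = g - (TEMPLATES[vowel] + psi)
    log_lik = -0.5 * np.sum(d ** 2) / SIGMA ** 2
    log_cond = -0.5 * ((psi - (B0 + B1 * g0)) / resid_sd) ** 2
    return log_lik + log_cond


def classify_with_g0(g, g0, psi_grid, resid_sd=RESID_SD):
    """Grid search for the best (vowel, psi) pair given the token's g0."""
    candidates = [(score(g, v, psi, g0, resid_sd), v, psi)
                  for v in TEMPLATES for psi in psi_grid]
    _, v, psi = max(candidates, key=lambda c: c[0])
    return v, psi


psi_grid = np.linspace(-0.6, 0.6, 121)
token = TEMPLATES["o"] + 0.35
print(classify_with_g0(token, np.log(360.0), psi_grid))  # high f0
print(classify_with_g0(token, np.log(120.0), psi_grid))  # low f0
```

In this sketch resid_sd acts as the dial: a very large value makes g0 nearly irrelevant, so the search behaves like the unrestricted one, while a very small value pins Ψs to the regression prediction, which approaches the plug-in scheme.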