• An age-dependent vocal tract model for males and females based on anatomic measurements

      Story, Brad H.; Vorperian, Houri K.; Bunton, Kate; Durtschi, Reid B.; Univ Arizona, Speech Language & Hearing Sci (ACOUSTICAL SOC AMER AMER INST PHYSICS, 2018-05)
      The purpose of this study was to take a first step toward constructing a developmental and sex-specific version of a parametric vocal tract area function model representative of male and female vocal tracts ranging in age from infancy to 12 yrs, as well as adults. Anatomic measurements collected from a large imaging database of male and female children and adults provided the dataset from which length warping and cross-dimension scaling functions were derived, and applied to the adult-based vocal tract model to project it backward along an age continuum. The resulting model was assessed qualitatively by projecting hypothetical vocal tract shapes onto midsagittal images from the cohort of children, and quantitatively by comparison of formant frequencies produced by the model to those reported in the literature. An additional validation of modeled vocal tract shapes was made possible by comparison to cross-sectional area measurements obtained for children and adults using acoustic pharyngometry. This initial attempt to generate a sex-specific developmental vocal tract model paves a path to study the relation of vocal tract dimensions to documented prepubertal acoustic differences. (C) 2018 Acoustical Society of America.
    • Effects of sampling rate and type of anti-aliasing filter on linear-predictive estimates of formant frequencies in men, women, and children

      Milenkovic, Paul H.; Wagner, Madison; Kent, Raymond D.; Story, Brad H.; Vorperian, Houri K.; Univ Arizona, Speech Language & Hearing Sci (ACOUSTICAL SOC AMER AMER INST PHYSICS, 2020-03-04)
      The purpose of this study was to assess the effect of downsampling the acoustic signal on the accuracy of linear-predictive (LPC) formant estimation. Based on speech produced by men, women, and children, the first four formant frequencies were estimated at sampling rates of 48, 16, and 10 kHz using different anti-alias filtering. With proper selection of number of LPC coefficients, anti-alias filter and between-frame averaging, results suggest that accuracy is not improved by rates substantially below 48 kHz. Any downsampling should not go below 16 kHz with a filter cut-off centered at 8 kHz. (C) 2020 Acoustical Society of America.
    • A model of speech production based on the acoustic relativity of the vocal tract

      Story, Brad H.; Bunton, Kate; Univ Arizona, Speech Language & Hearing Sci (ACOUSTICAL SOC AMER AMER INST PHYSICS, 2019-10-17)
      A model is described in which the effects of articulatory movements to produce speech are generated by specifying relative acoustic events along a time axis. These events consist of directional changes of the vocal tract resonance frequencies that, when associated with a temporal event function, are transformed via acoustic sensitivity functions, into time-varying modulations of the vocal tract shape. Because the time course of the events may be considerably overlapped in time, coarticulatory effects are automatically generated. Production of sentence-level speech with the model is demonstrated with audio samples and vocal tract animations. (C) 2019 Acoustical Society of America.
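
The downsampling and LPC formant estimation summarized in the Milenkovic et al. entry above can be illustrated with a minimal sketch. It is not the authors' procedure: the windowed-sinc anti-alias filter, the LPC order of 8, the 30 ms frame, the peak-picking step, and the synthetic test vowel are all assumptions chosen for illustration, and every function name here is hypothetical. The sketch low-pass filters a 48 kHz signal with a cut-off at 8 kHz, keeps every third sample to reach 16 kHz, fits an autocorrelation-method LPC model, and reads formant candidates from the peaks of the LPC spectral envelope.

```python
import cmath
import math

def lowpass_fir(cutoff_hz, fs, num_taps=101):
    """Hamming-windowed sinc low-pass filter (assumed anti-alias design)."""
    fc = cutoff_hz / fs  # normalized cut-off in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        taps.append(ideal * (0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))))
    gain = sum(taps)
    return [t / gain for t in taps]  # unity gain at 0 Hz

def decimate(signal, factor, taps):
    """Apply the FIR filter, evaluating only every `factor`-th output sample."""
    out = []
    for i in range(0, len(signal), factor):
        acc = 0.0
        for k, h in enumerate(taps):
            if i - k < 0:
                break
            acc += h * signal[i - k]
        out.append(acc)
    return out

def lpc(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns [1, a1, ..., ap] for the inverse filter A(z) = 1 + sum a_k z^-k.
    """
    n = len(frame)
    w = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, s in enumerate(frame)]
    r = [sum(w[i] * w[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def envelope_peaks(a, fs, step_hz=10.0):
    """Frequencies (Hz) of local maxima of the LPC envelope 1/|A(e^jw)|."""
    count = int(fs / 2 / step_hz) + 1
    freqs = [i * step_hz for i in range(count)]
    mags = []
    for f in freqs:
        w = 2 * math.pi * f / fs
        mags.append(1.0 / abs(sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))))
    return [freqs[i] for i in range(1, count - 1)
            if mags[i - 1] < mags[i] > mags[i + 1]]

def estimate_formants(signal, fs_in, factor, order):
    """Downsample by `factor`, then estimate formant candidates from one frame."""
    fs_out = fs_in / factor
    low = decimate(signal, factor, lowpass_fir(fs_out / 2, fs_in))
    start = len(low) // 4  # skip the filter start-up transient
    frame = low[start:start + int(0.03 * fs_out)]  # roughly 30 ms
    return envelope_peaks(lpc(frame, order), fs_out)

def synth_vowel(formants, fs, f0=100.0, bandwidth=80.0, duration=0.08):
    """Hypothetical test signal: impulse train through cascaded two-pole resonators."""
    period = int(fs / f0)
    x = [1.0 if i % period == 0 else 0.0 for i in range(round(fs * duration))]
    r = math.exp(-math.pi * bandwidth / fs)
    for f in formants:
        c1, c2 = 2 * r * math.cos(2 * math.pi * f / fs), -r * r
        y1 = y2 = 0.0
        y = []
        for s in x:
            y0 = s + c1 * y1 + c2 * y2
            y.append(y0)
            y2, y1 = y1, y0
        x = y
    return x

# Usage: 48 kHz synthetic vowel, downsampled by 3 to 16 kHz before LPC analysis.
vowel = synth_vowel([500.0, 1500.0, 2500.0], 48000)
print(estimate_formants(vowel, 48000, 3, order=8))
```

Peak picking on the envelope is used here only to keep the sketch dependency-free; solving for the roots of the LPC polynomial is a common alternative. The abstract's point about choosing the number of LPC coefficients corresponds to the `order` argument, which would need tuning for real speech.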