Linguistics and Languages

A practical guide to calculating vocal tract length and scale-invariant formant patterns

A. Anikin, S. Barreda, et al.

Explore the fascinating world of vocal tract length calculation and formant analysis with insights from Andrey Anikin, Santiago Barreda, and David Reby. This guide provides essential tools and theoretical frameworks that will transform your understanding of speech and non-speech vocalizations through robust statistical methods and practical software solutions.

Abstract

Formants (vocal tract resonances) are increasingly analyzed not only by phoneticians studying speech but also by behavioral scientists studying diverse phenomena such as acoustic size exaggeration and the articulatory abilities of non-human animals. This often involves estimating vocal tract length acoustically and producing scale-invariant representations of formant patterns. We present a theoretical framework and practical tools for carrying out this work, including open-source software in the R packages soundgen and phonTools. Automatic formant measurement with linear predictive coding is error-prone, but formant_app provides an integrated environment for formant annotation and correction with visual and auditory feedback. Once measured, formants can be normalized using a single recording (intrinsic methods) or multiple recordings from the same individual (extrinsic methods). Intrinsic speaker normalization can be as simple as taking formant ratios and calculating the geometric mean as a measure of overall scale. The regression method implemented in the function estimateVTL calculates the apparent vocal tract length assuming a single-tube model, while its residuals provide a scale-invariant vowel space based on how far each formant deviates from equal spacing (the schwa function). Extrinsic speaker normalization provides more accurate estimates of speaker- and vowel-specific scale factors by pooling information across recordings with simple averaging or mixed models, which we illustrate with example datasets and R code. The take-home messages are to record several calls or vowels per individual, measure at least three or four formants, check formant measurements manually, treat uncertain values as missing, and use the statistical tools best suited to each modeling context.
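
The intrinsic methods mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the R code from soundgen or phonTools; all function names here are hypothetical. It assumes a uniform tube closed at one end, whose resonances are F_n = (2n - 1) * c / (4 * L), and a speed of sound of 35400 cm/s. Formant spacing is estimated as the least-squares slope through the origin of the measured formants on (2n - 1) / 2, and the deviation from equal spacing is expressed as a proportion of the schwa-like frequency, which is one of several possible conventions. Uncertain formants are passed as None and skipped, in line with the advice to treat them as missing.

```python
import math

# Assumed speed of sound in the vocal tract, in cm/s (an illustrative value).
SPEED_OF_SOUND_CM_S = 35400.0


def formant_spacing(formants_hz):
    """Least-squares slope through the origin of F_n on (2n - 1) / 2.

    formants_hz lists F1, F2, ... in Hz; None marks a missing formant.
    """
    pairs = [((2 * n - 1) / 2, f)
             for n, f in enumerate(formants_hz, start=1)
             if f is not None]
    numerator = sum(x * f for x, f in pairs)
    denominator = sum(x * x for x, _ in pairs)
    return numerator / denominator


def apparent_vtl_cm(formants_hz, speed=SPEED_OF_SOUND_CM_S):
    """Apparent vocal tract length under the single-tube model."""
    return speed / (2 * formant_spacing(formants_hz))


def schwa_deviations(formants_hz):
    """Relative deviation of each formant from its equally spaced position."""
    spacing = formant_spacing(formants_hz)
    return [None if f is None else f / ((2 * n - 1) / 2 * spacing) - 1
            for n, f in enumerate(formants_hz, start=1)]


def geometric_mean(values):
    """Geometric mean of the non-missing values, a simple measure of scale."""
    present = [v for v in values if v is not None]
    return math.exp(sum(math.log(v) for v in present) / len(present))


if __name__ == "__main__":
    neutral = [500.0, 1500.0, 2500.0, 3500.0]  # equally spaced, schwa-like
    print(round(apparent_vtl_cm(neutral), 2))  # 17.7
    print(schwa_deviations([700.0, 1200.0, 2600.0, 3500.0]))
```

Because the deviations are ratios, multiplying every formant by the same factor changes the apparent vocal tract length but leaves the deviations unchanged, which is what makes them a scale-invariant description of the formant pattern.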

Published On

Nov 02, 2023

Authors

Andrey Anikin, Santiago Barreda, David Reby

DOI

https://doi.org/10.3758/s13428-023-02288-x
