Timbral effects on consonance disentangle psychoacoustic mechanisms and suggest perceptual origins for musical scales

R. Marjieh, P. M. C. Harrison, et al.

This research by Raja Marjieh, Peter M. C. Harrison, Harin Lee, Fotini Deligiannaki, and Nori Jacoby examines how timbre influences consonance perception. Across a large set of studies, the authors show that timbral manipulations can alter listeners' consonance preferences.

Introduction

The study investigates whether musical consonance perception depends on the timbre of chord tones, challenging the traditional Western-theoretical view that consonance is determined by simple frequency ratios and is timbre-invariant. The authors frame consonance as central to scale construction, tuning, and chord formation across musical cultures. They highlight two leading psychoacoustic explanations—interference (roughness from partial interactions) and harmonicity (alignment to harmonic series)—and note cultural contributions from musical exposure. Prior work offered mixed conclusions regarding timbral effects, with theory suggesting timbre-dependence and empirical studies often failing to find qualitative changes. The authors pose the research question: Do systematic timbral manipulations reshape consonance preferences, and what does this reveal about underlying psychoacoustic mechanisms and cultural scale evolution? They propose large-scale continuous-interval behavioral measurements and concurrent computational modeling to disentangle mechanisms and explore perceptual bases for diverse tuning systems.

Literature Review

Historical and theoretical perspectives trace consonance to simple integer ratios (Pythagoras, Rameau, Euler) and modern accounts emphasizing psychoacoustic and cultural factors. Two dominant psychoacoustic theories prevail: (1) interference between partials producing roughness/fast beats (Helmholtz; Plomp & Levelt; Hutchinson & Knopoff) and (2) harmonicity detection via template matching or autocorrelation (Terhardt; Harrison & Pearce; Milne). Traditional Western theory treats consonance as timbre-independent, yet interference theories predict strong dependence on spectral content (positions and amplitudes of upper harmonics). Empirical findings have been mixed: early citations of timbral effects have not consistently replicated with modern methods; many recent studies suggest limited qualitative impact of timbre on consonance when using discrete chromatic intervals. Other relevant literature includes cultural variation (e.g., Tsimane' indifference to dissonance), effects of musical expertise, and prior modeling evaluations showing strong performance for interference and harmonicity models. Sethares theorized links between instrument spectra and scale structures (e.g., gamelan slendro), but large-scale empirical tests with continuous intervals and systematic timbral manipulations were lacking.
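To make the interference idea concrete, here is a minimal sketch of how roughness between partials can be summed over a dyad. It uses Sethares' parameterization of the Plomp and Levelt curve, swapped in here because it is compact; it is not the Hutchinson–Knopoff model that the study evaluates. The 1/k harmonic amplitudes, the product weighting of amplitudes, and all function names are illustrative assumptions.

```python
import math

# Constants from Sethares' fit to the Plomp-Levelt curves (assumed here;
# the study itself uses the Hutchinson-Knopoff model instead).
D_STAR = 0.24
S1, S2 = 0.0207, 18.96
B1, B2 = 3.5, 5.75


def pair_dissonance(f1, a1, f2, a2):
    """Dissonance contributed by two partials (frequencies in Hz)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    # Scale the frequency gap by an approximation of the critical bandwidth.
    s = D_STAR / (S1 * f_low + S2)
    x = s * (f_high - f_low)
    # Amplitude weighting by product is an assumption of this sketch.
    return a1 * a2 * (math.exp(-B1 * x) - math.exp(-B2 * x))


def harmonic_partials(f0, n_harmonics=10):
    """Harmonic complex tone with 1/k amplitudes (illustrative roll-off)."""
    return [(f0 * k, 1.0 / k) for k in range(1, n_harmonics + 1)]


def dyad_dissonance(f0, interval_st, n_harmonics=10):
    """Sum pairwise dissonance over all partials of a two-tone chord."""
    lower = harmonic_partials(f0, n_harmonics)
    upper = harmonic_partials(f0 * 2 ** (interval_st / 12), n_harmonics)
    partials = lower + upper
    total = 0.0
    for i in range(len(partials)):
        for j in range(i + 1, len(partials)):
            total += pair_dissonance(*partials[i], *partials[j])
    return total
```

Because the sum runs over the actual partial frequencies and amplitudes, any change to the spectrum changes the predicted profile, which is why interference accounts predict timbre-dependent consonance.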

Methodology

Design: 23 online behavioral experiments (4,272 participants; 235,440 judgments) were run in the US (Amazon Mechanical Turk, AMT) and South Korea (local recruitment), using two paradigms: dense rating (continuous dyadic intervals) and Gibbs Sampling with People (GSP) for triads.

Stimuli and manipulations: Tones were generated by additive synthesis, which gives precise spectral control. Baseline stimuli were harmonic complex tones (10 harmonics, typical roll-offs). Systematic timbral manipulations included: (1) changing harmonic frequencies (spectral stretching/compression; inharmonic instrument-inspired spectra, e.g., bonang), (2) changing harmonic amplitudes (spectral roll-off of 0–15 dB/octave), and (3) deleting individual harmonics (e.g., removing the 3rd harmonic; pure tones). Naturalistic instrument approximations (flute, guitar, piano) were included for comparison. Dyads spanned 0–15 semitones; triads explored a 2D interval space via GSP. The bass note was randomized (MIDI 55–65). Stimuli were rendered with Tone.js using controlled ADSR envelopes.

Procedures: In dense rating, participants rated pleasantness (1–7) for randomly sampled intervals; data were summarized using Gaussian kernel smoothing (bandwidth 0.2 semitones for 0–15 ST; bootstrapped 95% CIs, peak-picking). In GSP, participants collaboratively optimized triad pleasantness along alternating interval dimensions; results were summarized via kernel density estimation (bandwidth 0.375; split-half reliability assessed). Headphone screening, training, and small performance-based bonuses were used; a demographic questionnaire collected musical experience.

Modeling: The authors compared interference models (notably Hutchinson–Knopoff) and harmonicity models (Harrison–Pearce; Milne; Boersma/autocorrelation). Predictions were generated for each manipulation to dissociate mechanisms. A composite model averaged updated interference and harmonicity components. The interference model modifications were motivated by the data: (a) a reduced amplitude exponent r≈1.359 (less sensitivity to roll-off), and (b) inclusion of a preference for slow beats (modifying the dissonance kernel at small critical-bandwidth distances).

Reliability: Split-half correlations were high (dyads r≈0.87; triads r≈0.93).
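The three timbral manipulations can be sketched as operations on a table of partials. This is a sketch under stated assumptions, not the authors' synthesis code: the stretch rule f_k = f0 · γ^(log2 k), the dB-per-octave amplitude rule, and the parameter names `gamma`, `rolloff_db_per_octave`, and `deleted` are all assumptions introduced for illustration (γ = 2 gives an ordinary harmonic tone).

```python
import math


def midi_to_hz(midi):
    """Convert a MIDI note number to Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2 ** ((midi - 69) / 12)


def partials(f0, n_harmonics=10, gamma=2.0,
             rolloff_db_per_octave=12.0, deleted=()):
    """Return (frequency_hz, linear_amplitude) pairs for one tone.

    gamma: assumed stretch parameter; each octave of harmonic number
           multiplies frequency by gamma instead of 2.
    rolloff_db_per_octave: assumed amplitude drop per octave of harmonic number.
    deleted: harmonic numbers to leave out of the spectrum.
    """
    table = []
    for k in range(1, n_harmonics + 1):
        if k in deleted:
            continue
        octaves_above = math.log2(k)
        freq = f0 * gamma ** octaves_above
        amp = 10 ** (-rolloff_db_per_octave * octaves_above / 20)
        table.append((freq, amp))
    return table


def pseudo_octave_semitones(gamma):
    """Interval, in semitones, at which stretched partials realign."""
    return 12 * math.log2(gamma)
```

For example, `partials(midi_to_hz(60), deleted=(3,))` drops the third harmonic, and under this stretch rule a hypothetical γ of 2.1 would place the pseudo-octave at about 12.8 semitones rather than 12.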

Key Findings
  • Baseline dyads (Study 1A, N=198): Pleasantness peaks closely align with the integer semitone values of the 12-tone chromatic scale (mean distance 0.05 ST, 95% CI [0.03, 0.08]; chance 0.25 ST). Consonant vs dissonant categories differ by 0.38 SD (95% CI [0.29, 0.46]). Strong correlations with prior datasets (e.g., r=0.96, r=0.91, r=0.94). High internal reliability (split-half r=0.87 [0.74, 0.94]).
  • Naturalistic instruments (Study 1B, flute N=190, guitar N=210, piano N=198): Broadly similar profiles to harmonic complexes (mean r≈0.56), with some instrument-specific peak differences.
  • Musicianship effects: More experienced participants showed greater rating differentiation, but profiles correlated across groups (mean ρ≈0.68 [0.57, 0.80]).
  • Spectral stretching/compression (Study 2A: stretched N=194; compressed N=202; baseline N=198): Consonance profiles shift correspondingly. Stretched octave peak at 12.78 ST (95% CI [12.68, 12.88]) vs harmonic 12.04 [11.97, 12.11]. Compression yielded analogous shifts. Effect replicated with Korean participants (Study 2B; N=68) showing similar stretching/compression of preferences.
  • Inharmonic bonang mixture (Study 2C, N=170): Dyads of harmonic lower tone plus inharmonic bonang-like upper tone produced peaks at 2.60 [2.51, 2.67], 4.80 [4.70, 4.95], and an octave at 11.98 [11.88, 12.05]. No peaks at major third (3.9–4.5 ST) and often none at perfect fifth (6.5–7.5 ST). Behavioral peaks align with slendro degrees (≈2.4, 4.8, 12 ST), supporting spectrum–scale linkage.
  • Spectral roll-off (Study 3, N=322): Strong main effect—greater roll-off yields higher overall pleasantness (β≈0.89). However, profile differentiation remains; a GAM with main effects of roll-off and interval explains ≈98% of smoothed rating variance, indicating minimal shape change. Contradicts interference predictions of flattening; supports harmonicity. Led to interference model update (reduced amplitude exponent) in the composite model.
  • Harmonic deletion (Study 4A, N=485: pure N=176; no 3rd N=160; 5 equal harmonics N=149): Deleting upper harmonics progressively removes peaks (full spectrum: 7 peaks including m3, M3, P4, P5, M6, 8ve, M10; removing 3rd eliminates m3; pure tones retain only P5 and 8ve peaks). Harmonicity predicts certain eliminations (e.g., m3 when 3rd harmonic is removed), while interference alone fails to predict pure-tone peaks.
  • Fine-grained tuning preferences (Study 4B, N=1341): For complex tones, listeners prefer slight deviations around just intonation: major sixth peaks at 8.78 [8.77, 8.80] and 8.93 [8.92, 8.94]; octave peaks at 11.94 [11.92, 11.96] and 12.08 [12.07, 12.10]; major third peak at 3.95 [3.93, 3.96] (a flat-side peak was detected in 66% of bootstrap samples). These preferences vanish with pure tones (flatter curves), consistent with enjoyment of slow beats from upper harmonics. Implementing slow-beat liking in the interference kernel allowed the composite model to capture these effects.
  • Triads via GSP (Study 5A, N=228): Consonance hotspots align with major triad inversions ([4,3], [3,5], [5,4]) and the octave diagonal; correlations with reference datasets r(52)=0.73 (p<0.001); internal reliability r=0.93 [0.91, 0.96].
  • Triad stretching/compression (Study 5B: stretched N=229; compressed N=233): Octave diagonal shifts: harmonic 12.04 [11.87, 12.21], stretched 12.80 [12.76, 12.83], compressed 11.20 [11.15, 11.25], mirroring dyadic results; predicted by interference but not standard harmonicity.

Overall: Timbral manipulations can induce inharmonic consonance preferences, dissociate mechanisms (interference vs harmonicity), and suggest perceptual underpinnings for cultural scale and tuning systems. A composite model combining harmonicity, dislike of fast beats (roughness), and liking of slow beats best fits the full pattern.

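The peak locations reported above come from smoothing dense ratings and then picking peaks. A minimal sketch of that kind of summary follows, assuming a Gaussian kernel-weighted mean with the stated 0.2-semitone bandwidth. The simple strict-local-maximum rule stands in for the authors' peak-picking criteria, which this summary does not spell out, and the bootstrapped confidence intervals are omitted.

```python
import math


def smooth(intervals, ratings, grid, bandwidth=0.2):
    """Gaussian kernel-weighted mean rating at each grid point (semitones)."""
    smoothed = []
    for g in grid:
        weights = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2)
                   for x in intervals]
        total = sum(weights)
        if total == 0.0:
            # No data within reach of this grid point.
            smoothed.append(math.nan)
            continue
        weighted_sum = sum(w * r for w, r in zip(weights, ratings))
        smoothed.append(weighted_sum / total)
    return smoothed


def pick_peaks(grid, curve):
    """Grid positions whose smoothed value exceeds both neighbours.

    A simplified stand-in for the study's peak-picking procedure.
    """
    return [grid[i] for i in range(1, len(curve) - 1)
            if curve[i] > curve[i - 1] and curve[i] > curve[i + 1]]
```

Repeating the same two steps on resampled participants would give bootstrap distributions for each peak location, which is one way intervals such as [12.68, 12.88] could be obtained.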
Discussion

The findings demonstrate that consonance judgments are not timbre-invariant: changing the frequencies, amplitudes, or presence of specific partials systematically reshapes pleasantness profiles. Stretching or compressing spectra shifts preferred intervals accordingly, a result that interference models account for and that is inconsistent with standard harmonicity-only accounts. Conversely, increasing spectral roll-off raises overall pleasantness while preserving profile shape, which supports harmonicity and contradicts the flattening predicted by interference. Harmonic deletion removes specific peaks in ways more consistent with harmonicity, yet fine-grained tuning preferences around just intonation reveal a positive valuation of slow beats, which can be reconciled by modifying the interference kernel. Together, the results reject unitary explanations and instead support a composite account: consonance in Western listeners reflects a combination of a preference for harmonicity, a dislike of fast beats (roughness), and a liking for slow beats. Culturally, the inharmonic bonang case aligns behavioral peaks with slendro scale degrees, lending empirical support to the hypothesis that instrument spectra shape scale evolution. Preferences for slight mistunings suggest that the subtle impurities in historical Western tunings (meantone, equal temperament) can sometimes enhance pleasantness via slow beats. The approach, continuous-interval measurement combined with modeling, provides a robust framework for probing mechanisms, yielding high reliability and concordance with prior laboratory findings.
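The slow-beat interpretation can be checked with simple arithmetic: when an interval is slightly mistuned from a simple ratio, a harmonic of the lower tone and a harmonic of the upper tone land a few hertz apart and beat at their frequency difference. The sketch below is illustrative only; the function names and the choice of MIDI 60 as the bass note (within the 55–65 range used in the study) are assumptions.

```python
def midi_to_hz(midi):
    """Convert a MIDI note number to Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2 ** ((midi - 69) / 12)


def beat_rate_hz(bass_midi, interval_st, lower_harmonic, upper_harmonic):
    """Beat frequency between one harmonic of each tone in a dyad."""
    f_low = midi_to_hz(bass_midi)
    f_high = f_low * 2 ** (interval_st / 12)
    return abs(lower_harmonic * f_low - upper_harmonic * f_high)


# Octave: 2nd harmonic of the lower tone against the upper fundamental.
for interval in (11.94, 12.00, 12.08):
    rate = beat_rate_hz(60, interval, lower_harmonic=2, upper_harmonic=1)
    print(f"{interval:.2f} ST -> {rate:.2f} Hz beat")
```

With a bass note near middle C, the two reported octave peaks (11.94 and 12.08 ST) correspond to beats of roughly 2 Hz between these two partials, while the exact octave produces none. Pure tones have no upper harmonics to beat in this way, which is consistent with the flatter pure-tone curves.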

Conclusion

This work establishes that timbre strongly influences consonance perception and that distinct spectral manipulations disentangle underlying psychoacoustic mechanisms. Across 23 experiments, the authors show: (i) spectral stretching/compression shifts preferred intervals; (ii) inharmonic instrument-like spectra can induce inharmonic consonance peaks aligning with non-Western scales; (iii) spectral roll-off modulates overall pleasantness without flattening profiles; (iv) deleting harmonics removes specific consonance peaks; and (v) listeners often prefer slight deviations from just intonation for tones rich in upper harmonics due to pleasant slow beats. A composite computational model combining harmonicity with revised interference (reduced amplitude exponent and slow-beat preference) accounts for results not captured by either mechanism alone. Future directions include broader cross-cultural studies, analyses at individual participant level with more trials per person, dichotic vs diotic presentation to isolate interference, experiments with more naturalistic instruments (e.g., pipe organs, bells), temporal aspects of consonance in musical contexts, and probing additional evaluative dimensions beyond pleasantness.

Limitations
  • Online data collection limits control over listening environments despite headphone screening; perfect dichotic presentation is difficult online.
  • Cross-cultural generalization tested only with a Korean cohort that likely has exposure to Western music; broader cultural sampling is needed.
  • Participant-level reliability was low due to large stimulus spaces and few trials per participant, limiting individual-differences analyses.
  • Stimuli were largely artificial, additively synthesized tones; generalization to fully naturalistic instrument sounds requires further work (ongoing for organs and bells).
  • Focus on isolated chords and the pleasantness construct; temporal musical context and other evaluative dimensions were not directly studied here.
  • Participation history was not strictly controlled across the full set of experiments: each person could take part only once per experiment, but the same participants may have taken part in several different experiments.