High-frequency sound components of high-resolution audio are not detected in auditory sensory memory

Psychology

H. Nittono

Discover the intriguing findings of Hiroshi Nittono's research, which reveals that high-frequency sound components typical of high-resolution audio may not be processed distinctly in our auditory cortex. This study sheds light on the elusive perception of audio quality—could the superiority of high-resolution audio be more myth than reality?

Introduction
High-resolution audio (HRA), which exceeds the 44.1/48 kHz sampling rates and 16-bit depth standard for CDs and DVDs, is increasingly popular. A common belief is that HRA's superiority stems from its inclusion of high-frequency components beyond the human audible range (approximately 20-20,000 Hz). Whether these high frequencies contribute to a better listening experience is debated, with meta-analyses suggesting only slightly above-chance discrimination between HRA and standard audio. Previous research has reported a "hypersonic effect," in which high-frequency components in music, presented through full-range systems, increased EEG alpha power. However, the mechanism underlying this effect, and whether it involves cortical processing of these inaudible frequencies, remains unclear. This study uses mismatch negativity (MMN), an EEG index reflecting auditory deviance detection in sensory memory, to investigate whether high-frequency components are processed differently at the cortical level.
Literature Review
The "hypersonic effect", an increase in EEG alpha power observed when listening to music rich in high-frequency components played through full-range systems, has been reported in previous studies. This effect was particularly notable with gamelan music and J.S. Bach's cembalo music. However, these studies found no reliable differences in subjective sound impressions or other psychophysiological measures (heart rate, skin conductance, facial EMG) between full-range and high-cut versions of the music. The mechanism remains unclear, with suggestions involving unknown vibrational information channels or distortions from digital audio processing. While some studies showed subcortical activation to high-frequency sounds, the conscious perception of these sounds, and therefore their cortical processing, remained questionable. This study aims to address this gap by using MMN to assess cortical processing.
Methodology
Thirty-eight young adults with normal hearing participated in a double-blind study. The standard stimulus was an original, unfiltered 50-ms white noise burst. The deviant stimuli were the same burst digitally filtered to remove components above either 11 kHz or 22 kHz (the "11-kHz high-cut" and "22-kHz high-cut" sounds, respectively). MMN was recorded while participants watched a silent movie. An ABX test was used to assess behavioral discrimination between the original and high-cut sounds. High-resolution audio equipment (192 kHz/24-bit) was used to ensure high-fidelity sound reproduction. EEG data were collected from 34 scalp sites and analyzed using standard techniques. Ocular artifacts were corrected using Gratton's method, and online/offline filtering (1-30 Hz) was applied. The MMN response was assessed at the frontocentral electrode cluster (Fz, FC1, FC2, and Cz), with the amplitude measured in a 40-ms period around the peak latency (120-160 ms). Statistical analysis included repeated measures ANOVA and Bayesian paired sample t-tests. Behavioral data were analyzed using the binomial distribution to assess discrimination accuracy.
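The amplitude measurement described above can be illustrated with a minimal sketch. It is not the study's analysis code: the function names, the 1000 Hz EEG sampling rate, the epoch boundaries, and the synthetic waveforms are all assumptions made for illustration. Only the 120-160 ms window comes from the summary. The sketch subtracts the standard ERP from the deviant ERP and averages the resulting difference wave over the window.

```python
def difference_wave(deviant, standard):
    """Deviant-minus-standard ERP, sample by sample (equal-length lists)."""
    return [d - s for d, s in zip(deviant, standard)]


def mean_amplitude(wave, sfreq_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Mean of `wave` over [win_start_ms, win_end_ms], both ends inclusive.

    `epoch_start_ms` is the time of the first sample relative to
    stimulus onset (negative when there is a pre-stimulus baseline).
    """
    first = round((win_start_ms - epoch_start_ms) * sfreq_hz / 1000)
    last = round((win_end_ms - epoch_start_ms) * sfreq_hz / 1000)
    segment = wave[first:last + 1]
    if not segment:
        raise ValueError("measurement window lies outside the epoch")
    return sum(segment) / len(segment)


# Synthetic example (hypothetical values, not data from the study):
# a -100..400 ms epoch sampled at 1000 Hz, one sample per millisecond.
SFREQ_HZ = 1000
EPOCH_START_MS = -100
times_ms = range(EPOCH_START_MS, 401)

standard_erp = [0.0 for _ in times_ms]
# The deviant ERP carries a -2 microvolt deflection in the 120-160 ms window.
deviant_erp = [-2.0 if 120 <= t <= 160 else 0.0 for t in times_ms]

mmn_wave = difference_wave(deviant_erp, standard_erp)
mmn_amplitude = mean_amplitude(mmn_wave, SFREQ_HZ, EPOCH_START_MS, 120, 160)
print(f"MMN mean amplitude: {mmn_amplitude:.2f} microvolts")
```

In practice the input waveforms would be ERPs averaged across the four frontocentral electrodes, and one amplitude per participant and condition would feed the statistical tests.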
Key Findings
The study found a significant MMN response to the 11-kHz high-cut sound (t(37) = 8.28, p < 0.001), indicating that it was detected as a deviant stimulus. However, no significant MMN response was observed for the 22-kHz high-cut sound (t(37) = 1.34, p = 0.094), and the Bayesian analysis favored the null hypothesis. Behavioral results mirrored this: participants could discriminate the 11-kHz high-cut sound from the original (M = 99.3%), but not the 22-kHz high-cut sound (M = 52.6%, close to the 50% chance level). Exploratory analysis showed significant ERP differences between deviant and standard stimuli for the 11-kHz high-cut sound across several time periods, but no such difference for the 22-kHz high-cut sound. No correlation was found between MMN amplitude and either ABX accuracy or auditory threshold. The results indicate that the removal of audible high-frequency components is detected at the cortical level, resulting in an MMN response, while the removal of inaudible high-frequency components is not.
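To make "close to chance level" concrete, the following sketch shows how a binomial distribution can be used to judge an ABX score. It is an illustration under stated assumptions, not the study's procedure: the function name `binomial_upper_tail` and the 20-trial example are hypothetical, and the summary does not state how many ABX trials each participant completed. The function returns the probability of getting at least a given number of correct answers by guessing alone.

```python
from math import comb


def binomial_upper_tail(n_correct, n_trials, p_chance=0.5):
    """P(X >= n_correct) for X ~ Binomial(n_trials, p_chance).

    In an ABX test a guessing listener is correct with probability 0.5,
    so a small tail probability suggests real discrimination.
    """
    if not 0 <= n_correct <= n_trials:
        raise ValueError("n_correct must lie between 0 and n_trials")
    return sum(
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical example: 20 ABX trials (the trial count is an assumption).
N_TRIALS = 20
for n_correct in (10, 15, 20):
    p_value = binomial_upper_tail(n_correct, N_TRIALS)
    print(f"{n_correct}/{N_TRIALS} correct: P(at least this many by chance) = {p_value:.6f}")
```

A score near half the trials gives a large tail probability, which matches the pattern reported for the 22-kHz high-cut sound, while a near-perfect score gives a very small one, as for the 11-kHz high-cut sound.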
Discussion
The findings suggest that auditory sensory memory does not register the characteristics of high-resolution audio (high-frequency components and sharp onsets/offsets) at the cortical level. The lack of an MMN response to the 22-kHz high-cut sound, even with high-resolution playback equipment, suggests that any advantage of HRA, if it exists, occurs subcortically and outside of conscious awareness. The different responses to the 11-kHz and 22-kHz high-cut conditions suggest an upper frequency limit for cortical change detection: removing components above 11 kHz is registered, whereas removing only components above 22 kHz is not. The use of loudspeakers, which would allow high-frequency energy to reach the listener through non-auditory pathways, did not change the outcome, so such alternative pathways do not seem to be involved in the cortical responses.
Conclusion
This study demonstrates that the removal of inaudible high-frequency components from audio signals does not produce a detectable difference at the cortical level, as measured by MMN and behavioral discrimination. While the possibility of individual differences remains, for individuals with normal hearing, the broader bandwidth of HRA does not offer a conscious perceptual advantage over standard audio. Future research could investigate other aspects of HRA, such as quantization depth, and explore possible subcortical mechanisms for the hypersonic effect.
Limitations
The study focused on white noise bursts and may not generalize to more complex sounds. The use of high-resolution equipment may have minimized potential differences between audio formats. The sample was limited to young adults with normal hearing; generalizability to other populations is unknown. Additionally, only one aspect of HRA (sampling rate) was manipulated; other factors, such as bit depth, were not explored.