Applying speech technologies to assess verbal memory in patients with serious mental illness

Psychology


T. B. Holmlund, C. Chandler, et al.

This study describes the development of a digital verbal memory test that uses mobile devices, automated speech recognition, and natural language processing for efficient assessment in psychiatry. Conducted by researchers including Terje B. Holmlund and Chelsea Chandler, it shows that automated administration and scoring can deliver results comparable to traditional expert-rated methods.

Introduction
Verbal memory deficits are a hallmark of schizophrenia and other serious mental illnesses and are among the most consistently observed cognitive impairments in these conditions. Current assessment methods, such as the Wechsler Memory Scale and the Repeatable Battery for the Assessment of Neuropsychological Status, rely on trained personnel, are resource-intensive, and offer limited opportunities for frequent or longitudinal monitoring. These limitations hinder research progress and the development of personalized medicine approaches. This study aimed to leverage advances in speech technology to build a more efficient and accessible method for assessing verbal memory, one that automates the administration, transcription, and scoring of verbal memory tests. Such an approach allows frequent monitoring of verbal memory, enabling a detailed examination of individual variability and potentially supporting the identification of new biomarkers or digital phenotypes. The researchers hypothesized that automated scoring, applied to both human transcriptions and automatic speech recognition (ASR) output, would correlate strongly with human expert ratings, demonstrating the validity and reliability of the automated assessment method.
Literature Review
Existing research extensively documents verbal memory impairments in schizophrenia and other serious mental illnesses. Meta-analyses consistently reveal verbal declarative memory dysfunction as a core feature, with patients exhibiting disproportionately greater impairment in verbal relative to visual episodic memory. Verbal memory assessment is crucial in neuropsychological test batteries for schizophrenia and in evaluating interventions. However, traditional methods are limited by their reliance on trained personnel, cross-sectional administration, and restricted operationalization of memory (e.g., counting recalled items). The limitations of existing tools have hampered research into the daily fluctuations of verbal memory and its relationship to clinical states and treatments, hindering the development of personalized medicine strategies. Previous work has shown promise in using Latent Semantic Analysis (LSA) to analyze prose recall, but this study aimed to improve upon those methods by employing larger corpora and more advanced semantic analysis techniques such as Word Mover's Distance (WMD).
Methodology
This study involved 104 adults: 25 patients with serious mental illness (SMI) recruited from a group home and 79 healthy undergraduate students. SMI patients met federal definitions of serious mental illness and underwent standardized clinical assessment, including the Structured Clinical Interview for DSM-IV-TR (SCID) and the Brief Psychiatric Rating Scale (BPRS). The verbal memory task involved retelling ten different stories (five narratives, five instructions) presented orally via a mobile device, with both immediate and delayed recall assessed. The recordings were analyzed using three procedures: a fully human procedure (human transcription and rating), a hybrid procedure (human transcription with automated NLP scoring), and a fully automated procedure (ASR transcription with automated scoring). For automated scoring, two features were extracted from each story-recall pair: the number of word types shared with the original story and the Word Mover's Distance (WMD) between the recall and the original, computed with word embeddings trained on the Google News corpus. These features were entered into a linear regression model to predict the human ratings, and model performance was evaluated with a 5-fold cross-validation procedure. Two ASR systems were compared: a generic off-the-shelf system (Google's speech-to-text service) and a custom-built system tailored to the task vocabulary, with word error rates calculated for both.
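As a rough sketch of the hybrid scoring procedure, the code below computes the two features described above (shared word types and WMD against the original story) and fits a linear regression to human ratings with 5-fold cross-validation. It assumes the gensim and scikit-learn libraries and the Google News word2vec file; the helper names, tokenization, and data variables are illustrative, not the authors' implementation.

```python
# Minimal sketch of the hybrid scoring procedure: two features per recall
# (word types shared with the source story, and Word Mover's Distance),
# fed into a linear regression evaluated with 5-fold cross-validation.
# Helper names, preprocessing, and file paths are illustrative assumptions.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

# Pretrained word2vec embeddings (Google News corpus), used for WMD.
embeddings = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def tokenize(text):
    """Lowercase and split into word tokens (simplified)."""
    return text.lower().split()

def extract_features(story, recall):
    """Return [shared word types, WMD] for one story/recall pair."""
    story_tok, recall_tok = tokenize(story), tokenize(recall)
    shared_types = len(set(story_tok) & set(recall_tok))
    wmd = embeddings.wmdistance(story_tok, recall_tok)  # gensim's WMD
    return [shared_types, wmd]

def cross_validated_scores(stories, recalls, human_ratings, n_splits=5):
    """Predict human ratings from the two features with 5-fold CV."""
    X = np.array([extract_features(s, r) for s, r in zip(stories, recalls)])
    y = np.array(human_ratings, dtype=float)
    preds = np.empty_like(y)
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True,
                                     random_state=0).split(X):
        model = LinearRegression().fit(X[train_idx], y[train_idx])
        preds[test_idx] = model.predict(X[test_idx])
    r, _ = pearsonr(preds, y)  # agreement with human ratings
    return preds, r
```

In the study, the same two features were computed either from human transcripts (hybrid procedure) or from ASR output (fully automated procedure).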
Key Findings
The task was well tolerated by patients: 92% of the 1035 speech responses were suitable for analysis (86% for patients, 96% for healthy controls). The average recall length was 61 words, with recalls significantly shorter for patients (48.7 words) than for healthy controls (62.2 words). Human ratings of recall showed the expected pattern, with healthy participants receiving higher scores (mean = 4.6) than patients (mean = 3.3). The average inter-rater reliability among human raters was R = 0.73. The correlation between human ratings and scores from the hybrid procedure (human transcription, automated NLP scoring) was R = 0.83, within the range of human-to-human correlations (R = 0.73-0.89). The fully automated procedure using the generic ASR system yielded a correlation of R = 0.82 with human ratings and R = 0.99 with scores derived from human transcriptions. The customized ASR system showed even lower word error rates, an equally high correlation with human ratings (R = 0.82), and a strong correlation with human transcription-based scores (R = 0.96-0.99). A combined model using both shared word types and WMD achieved a correlation of R = 0.83 with human ratings, comparable to the average human-to-human correlation.
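For reference, word error rate is conventionally computed as the word-level edit distance (substitutions, insertions, and deletions) between the reference transcript and the ASR hypothesis, divided by the number of reference words. The function below is a generic sketch of that metric, not the study's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: word-level Levenshtein distance (substitutions,
    insertions, deletions) divided by the number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# Example (made-up sentences): one substitution and one deletion
# against a 6-word reference gives a WER of 2/6.
print(word_error_rate("the dog chased the red ball",
                      "the dog chased a ball"))  # 0.333...
```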
Discussion
This study demonstrates the viability and robustness of a fully automated system for the frequent and efficient assessment of verbal memory. The high correlation between automated and human ratings suggests that speech technology can accurately capture verbal memory performance, even in diverse populations with varying levels of cognitive ability and speech clarity. The automated system offers significant advantages over traditional methods in terms of cost-effectiveness, scalability, and potential for longitudinal monitoring. The ability to assess verbal memory frequently outside of controlled laboratory settings opens up new opportunities for research on the dynamics of verbal memory in mental illness and facilitates the development of personalized interventions. The use of advanced semantic analysis techniques, such as WMD, allows for a more nuanced assessment that goes beyond simple word-count comparisons, capturing semantic similarity even when there is little verbatim overlap with the original story.
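As a toy illustration of this point (with made-up sentences, not study materials), the snippet below contrasts a simple word-overlap count with WMD for a paraphrased recall: the paraphrase shares few word types with the original, but its embedding-based distance is typically much smaller than that of an unrelated sentence.

```python
# Toy example: WMD compares word embeddings rather than exact strings,
# so a paraphrase can score as close even with little verbatim overlap.
# Sentences and file path are illustrative only.
from gensim.models import KeyedVectors

embeddings = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

original = "the woman drove her car to the hospital".split()
paraphrase = "she took the vehicle to the clinic".split()
unrelated = "the committee approved the annual budget".split()

print(len(set(original) & set(paraphrase)))        # few shared word types
print(embeddings.wmdistance(original, paraphrase))  # typically smaller
print(embeddings.wmdistance(original, unrelated))   # typically larger
```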
Conclusion
This study successfully demonstrated the feasibility of leveraging speech technologies to assess verbal memory in patients with SMI. The developed digital test offers a cost-effective, scalable, and reliable alternative to traditional methods. Future research should focus on validating the clinical utility of frequent monitoring with this approach and on exploring additional linguistic features (syntax, acoustic parameters) to improve accuracy and clinical relevance. Whether the method can be applied successfully to other patient populations and languages also remains to be investigated.
Limitations
The sample, particularly the patient group (n = 25), may limit the generalizability of the findings. The healthy control group consisted of undergraduate students, who potentially differ from the patient population in age and education level. The custom ASR system, although it improved accuracy, requires additional development and adaptation for broader application. While the study addressed privacy concerns through careful data handling, further research on data security and the ethical considerations of cloud-based ASR use in clinical settings is warranted.