Psychology

A computational text analysis investigation of the relation between personal and linguistic agency

A. Simchon, B. Hadar, et al.

This research, conducted by Almog Simchon, Britt Hadar, and Michael Gilead, uses computational text analysis to examine how personal agency is reflected in linguistic agency. Discover how recalling experiences of power, social media followership, and participation in a depression forum reveal a link between personal and linguistic expression.

Introduction

The relationship between language and thought has long been a topic of interest. George Orwell's *1984* and his essay "Politics and the English Language" highlight how linguistic structures, particularly passive voice, can influence the perception of agency. Social thinkers have suggested a strong link between the use of passive language and the degree of personal agency, defined as the ability to exert control over oneself and the environment. This control is considered a fundamental human need, crucial for well-being. Psycholinguistic studies have shown that linguistic framing influences agency attribution in others, but the relationship between an individual's own sense of agency and their language use remains under-researched. Previous research, largely qualitative, suggested a link between passive language use and reduced personal agency in contexts like chronic pain and psychological therapy. Some quantitative studies examined passive language in relation to mental health conditions like OCD and traumatic experiences, finding correlations with reduced agency and heightened negative arousal. Anthropological studies further revealed a link between social power and agentive language use. This research aims to provide quantitative evidence for the relationship between personal and linguistic agency by examining how factors associated with personal agency (social power, social rank, and participation in a depression forum) are reflected in the use of passive voice.

Literature Review

Existing research suggests a link between linguistic framing and the attribution of agency to others. Studies have shown that using agentive language leads to increased blame and harsher punishment compared to non-agentive language. The encoding of agents in memory also varies across languages, highlighting the role of language in shaping agency perception. However, research on the relationship between an individual's personal sense of agency and their language use has been limited, mostly relying on qualitative analyses. Qualitative studies have observed passive voice usage in individuals dealing with chronic pain and psychological hardship, suggesting reduced personal agency. Quantitative studies focusing on psychopathology linked passive language to diminished agency in individuals with OCD and increased negative arousal during the recollection of traumatic events. Anthropological research indicated a correlation between social power and agentive language use in a Western Samoan village.

Methodology

The research comprises three studies that use diverse data sources and analytical approaches.

**Study 1:** This study re-analyzed data from a replication attempt of an existing study (Kasprzyk & Calin-Jageman, 2014) which manipulated participants' sense of power. A total of 835 participants (from MTurk and Prolific) were assigned to high-power or low-power conditions and asked to write about relevant incidents. Passive voice was measured using spaCy, a natural language processing tool, and the measure was validated against an independent rater's coding. Self-referential language was measured using LIWC. Contextualized Construct Representation (CCR) was used to measure sense of control, locus of control, and depression levels.

**Study 2:** This study analyzed a large Twitter dataset (26.4 million tweets) to examine the relationship between the number of followers (as a proxy for social rank and power) and passive voice usage. Passive voice was extracted using spaCy, and the measure was validated independently. CCR was used to analyze a subsample (100,000 tweets) for sense of control, locus of control, and depression. Data were aggregated by user to account for dependencies between tweets from the same account.

**Study 3:** This study analyzed Reddit data to investigate whether the language used in the r/depression subreddit is less agentive than in other subreddits. Three sub-studies (3a, 3b, 3c) were conducted to address potential biases. Study 3a (N = 8,690) compared posts from r/depression with a random sample of 100 subreddits. Study 3b (N = 9,685) replicated Study 3a using older data, and Study 3c (N = 24,765) replicated it using a more carefully selected set of control subreddits (generated using ChatGPT4 to ensure these subreddits primarily dealt with non-emotional support topics). Preprocessing in all studies involved removing links, emoticons, and irrelevant posts. Passive voice and CCR (sense of control, locus of control, and depression) were analyzed in all sub-studies. An independent rater validated the passive voice measure in each study.

**Analysis:** Negative binomial count models were used for the primary analyses of passive voice, controlling for word count and text source (where applicable). Robust generalized linear regression and bootstrapping were employed in Study 2 because of the large size of the dataset and its sensitivity to outliers.
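The passive voice measure described above can be sketched in a few lines. The summary says only that spaCy was used, so the approach below is an assumption rather than the authors' pipeline: it counts tokens carrying the `auxpass` dependency label, which spaCy's English models assign to passive auxiliaries. The names `count_passive_aux`, `passive_features`, `Tok`, and `fake_nlp` are hypothetical, and the stand-in parser exists only so the sketch runs without downloading a model.

```python
from collections import namedtuple

# Assumption: passive auxiliaries are identified by the "auxpass" dependency
# label, which spaCy's English pipelines assign to e.g. "was" in "was taken".
PASSIVE_AUX_DEP = "auxpass"


def count_passive_aux(tokens):
    """Count tokens whose dependency label marks a passive auxiliary."""
    return sum(1 for tok in tokens if tok.dep_ == PASSIVE_AUX_DEP)


def passive_features(text, nlp):
    """Return the passive-auxiliary count and word count for one text.

    `nlp` is any callable mapping text to tokens that expose `.dep_` and
    `.is_punct`, such as a loaded spaCy pipeline.
    """
    doc = nlp(text)
    word_count = sum(1 for tok in doc if not tok.is_punct)
    return {"passive_aux": count_passive_aux(doc), "word_count": word_count}


# Hypothetical stand-in for a parser so the sketch runs without a model
# download; the labels below are hand-written, not real parser output.
Tok = namedtuple("Tok", ["text", "dep_", "is_punct"])


def fake_nlp(text):
    return [
        Tok("I", "nsubjpass", False),
        Tok("was", "auxpass", False),
        Tok("ignored", "ROOT", False),
        Tok(".", "punct", True),
    ]


features = passive_features("I was ignored.", fake_nlp)
print(features)
```

With spaCy installed, a pipeline loaded via `spacy.load("en_core_web_sm")` could be passed in place of `fake_nlp`. The word count is returned alongside the passive count because the count models described above control for text length.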

Key Findings

**Study 1:** The low-power condition showed a 65% increase in passive voice compared to the high-power condition (IRR = 1.65, p < 0.001). The low-power condition also exhibited a 29% increase in self-referential language (IRR = 1.29, p < 0.001).

**Study 2:** Each additional passive auxiliary verb predicted a 46% decrease in the number of followers on Twitter (IRR = 0.54, p < 0.001). Self-referential language was also negatively associated with followership. The relationship between passive voice and followership was moderated by self-referential language. CCR analysis showed consistent but less robust results for sense of control, locus of control, and depression.

**Study 3 (across 3a, 3b, 3c):** The language in the r/depression subreddit consistently showed a significant increase in passive voice compared to control groups. Study 3a found a 26% increase (IRR = 1.26, p < 0.001), Study 3b found a 42% increase (IRR = 1.42, p < 0.001), and Study 3c found a 16% increase (IRR = 1.16, p < 0.001) compared to the control groups, while support groups showed a 10% increase (IRR = 1.10, p < 0.001). The depression forum posts showed significantly more self-referential language than the control subreddits across the studies. CCR analyses consistently confirmed the expected pattern of results.
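The percentages above follow directly from the incidence rate ratios (IRRs). In a count model with a log link, the IRR is the exponentiated coefficient, and the percent change in the expected count is (IRR - 1) × 100. The sketch below reproduces that arithmetic for the reported values; the function names and the dictionary labels are illustrative, not taken from the paper.

```python
import math


def irr_from_coefficient(beta):
    """Convert a log-link regression coefficient to an incidence rate ratio."""
    return math.exp(beta)


def percent_change(irr):
    """Percent change in the expected count implied by an IRR."""
    return (irr - 1.0) * 100.0


# IRRs as reported in the findings above; labels are illustrative.
reported = {
    "Study 1, passive voice (low vs. high power)": 1.65,
    "Study 1, self-referential language": 1.29,
    "Study 2, followers per passive auxiliary": 0.54,
    "Study 3a, passive voice": 1.26,
    "Study 3b, passive voice": 1.42,
    "Study 3c, passive voice": 1.16,
}

for label, irr in reported.items():
    print(f"{label}: IRR = {irr:.2f} -> {percent_change(irr):+.0f}%")
```

An IRR below 1, as in Study 2, corresponds to a decrease: 0.54 implies 46% fewer followers per additional passive auxiliary.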

Discussion

The findings consistently demonstrate a strong relationship between personal agency and linguistic agency across diverse contexts. Study 1's experimental design provides causal evidence, showing that manipulating a sense of power directly influences the use of agentive language. Studies 2 and 3 provide evidence from naturalistic settings, linking social rank/influence (Study 2) and depression (Study 3) to linguistic agency. The negative association between passive voice and social media followership in Study 2 suggests that more agentive language might contribute to increased social influence. The increased passive voice usage in the depression forum (Study 3) supports the hypothesis that reduced personal agency due to depression is reflected in language use. The exploration of self-referential language reveals a consistent pattern across studies: lower personal agency is associated with higher self-referential language, replicating and extending previous findings linking self-referential language to depression.

Conclusion

This research provides comprehensive quantitative evidence for the link between personal and linguistic agency. The findings highlight how subtle linguistic variations reflect significant psychological and social factors. Future research should examine the stability of these relationships (trait vs. state) and investigate causality using non-correlational designs. Cross-cultural studies are needed to explore the generalizability of these findings. The methodological approach used in this study offers a valuable framework for future research on language and thought.

Limitations

The correlational nature of Studies 2 and 3 limits causal inference: only Study 1 uses an experimental design, so the other two studies show associations rather than causal relationships. The attribution of depression in Study 3 relies on forum participation, not clinical diagnoses. The research focuses on a single cultural-linguistic context, and future work should investigate cross-cultural variation in the relationship between personal and linguistic agency. Factors other than personal agency may also contribute to the observed patterns, such as social status and content quality in Study 2 and other aspects of depression in Study 3.
