Why do Mandarin speakers code-switch? A case study of conversational code-switching in China

X. Zhong, L. H. Ang, et al.

This study by Xinyi Zhong, Lay Hoon Ang, and Sharon Sharmini examines code-switching among Mandarin speakers in mainland China. It identifies the prevalent patterns of code-switching and the key factors that influence them, and it argues that the community is at an early stage of multilingual development. The analysis draws on conversations in online videos to show how language varieties interact.

Introduction

Economic and technological globalization has accelerated multilingual interactions. China has vast linguistic diversity (over 300 languages), with Mandarin as the official and dominant language across government, education, and media, while numerous regional dialects and minority or foreign languages are also used. Whether China is a multilingual society remains debated. Prior research on code-switching (CS) in the Chinese context has focused more on Hong Kong, Macao, and Taiwan, with limited attention to Mandarin-dominated speakers in mainland China. This study addresses that gap by examining conversational code-switching among Mandarin-dominated mainland Chinese speakers to gauge multilingual practices and development. It applies Muysken’s typology (insertion, alternation, congruent lexicalisation, backflagging) to identify CS patterns and Ritchie and Bhatia’s model to explain the factors influencing CS, aiming to clarify the nature and stage of multilingualism in mainland China.

Literature Review

CS is commonly defined as the use of items and grammatical features from two languages within a sentence (Muysken, 2000) and has been studied from syntactic and sociolinguistic perspectives. Poplack’s trichotomy (intra-/inter-sentential, tag switching) focuses on positional placement, whereas Muysken’s typology emphasizes grammatical structures: insertion, alternation, congruent lexicalisation, and later backflagging. Empirical support varies: some contexts show predominance of insertion (e.g., Luxembourg), others alternation (e.g., USA), and some confirm the applicability of all patterns (e.g., South Africa, Lebanon). Revisions and critiques note ambiguities between insertion and alternation and propose refinements (e.g., mid-step patterns, ‘ragged’ mixing among low-fluency bilinguals). Sociolinguistic accounts of CS drivers span micro and macro levels. Classic models (Fishman; Gumperz; Myers-Scotton) emphasize participant, situation, topic, and social markedness. Ritchie and Bhatia (2013) synthesize factors into: participant/social roles and relationships; situational factors; message-intrinsic considerations (e.g., clarification, hedging, idioms, interjections, quotations, repetition); and sociopsychological factors (attitudes, dominance, accommodation). The current study adopts Muysken’s typology for patterning and Ritchie and Bhatia’s framework, augmented with a context-oriented account and the convenience factor, to explain CS in China.

Methodology

Design: Qualitative case study using conversational analysis of short, naturally occurring online video interactions.

Participants: Sixteen Mandarin-dominated mainland Chinese speakers (5 males, 11 females), ages 22–32 (mean 24.75), recruited via WeChat on 09/10/2021. Inclusion: Mandarin-dominated mainland Chinese speakers. Exclusion: non-Mandarin-dominated speakers. All self-reported Mandarin as dominant; none had speech or hearing impairments.

Data sources: Short videos posted on Bilibili and Sina Weibo.

Initial selection: Each participant proposed three videos under strict criteria (spontaneous multi-party conversations; Mandarin-dominated CS; 5–30 minutes; representative speech styles; excluding monologues, scripted media, and vulgar or sensitive content). Researchers screened the submissions, and a voting process among participants narrowed the 48 videos to a final corpus of 16.

Video characteristics: Duration range 7m18s–27m02s (total 3h54m15s; mean 14m09s). Topics included beauty and makeup, casual conversation, digital products, food, healthcare, marriage, and workplace.

Procedures: All 16 videos were transcribed and translated. Language varieties were identified and coded, and frequencies were computed. CS patterns were analyzed using Muysken’s (2000, 2013) typology (insertion, alternation, congruent lexicalisation, backflagging). Influencing factors were examined via Ritchie and Bhatia’s (2013) model (participants’ social roles and relationships; situational factors; message-intrinsic considerations; sociopsychological factors), with convenience as an additional factor.

Coding and reliability: Two independent academic raters coded the data using ATLAS.ti. Inter-rater reliability was assessed with Cohen’s kappa (0.89, reported as substantial agreement). Discrepancies were resolved by consensus.
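
The inter-rater check described above can be illustrated with a minimal sketch of Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the two short label lists are hypothetical and exist only for illustration; they are not the study's data, and the study's own value was obtained with ATLAS.ti.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for six CS tokens (illustrative only, not study data).
rater_a = ["insertion", "insertion", "alternation",
           "backflagging", "insertion", "insertion"]
rater_b = ["insertion", "insertion", "alternation",
           "insertion", "insertion", "insertion"]

kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters disagree on one of six tokens, so kappa falls well below the raw agreement rate, which is why the statistic is preferred over simple percentage agreement when one category (here, insertion) dominates.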

Key Findings

Language variety distribution: Of 64,100 total words, Mandarin accounted for 63,294 (98.743%) and non-Mandarin varieties for 806 (1.257%). Of the 806 non-Mandarin words, foreign languages accounted for 756 (93.797%): English 750 (93.052%), Japanese 5 (0.621%), and French 1 (0.124%). Dialects accounted for 50 (6.203%): Beijing 24 (2.978%), Northeastern 6 (0.744%), and Sichuan 20 (2.481%).

Code-switching patterns (of 806 CS words): insertion 713 (88.461%); backflagging 66 (8.189%); alternation 27 (3.350%); congruent lexicalisation 0. Insertion subtypes: word 406 (50.372%), phrase 125 (15.509%), discourse 90 (11.166%), letter 51 (6.327%), morpheme 25 (3.102%), clause 16 (1.985%). Insertion primarily reflected English elements within Mandarin matrix sentences. Backflagging involved embedded-language grammar (e.g., English utterances) within Mandarin-dominant discourse. Alternation was infrequent, consistent with lower balanced proficiency across languages.

Factors influencing CS (counts over 806 CS words): participants’ social roles and relationships 331 (41.067%), primarily dual or multiple identities 316 (39.206%); situational factors 2 (0.248%), both related to discourse topic; message-intrinsic considerations 257 (31.886%), comprising hedging 63 (7.816%), idioms and cultural wisdom 74 (9.181%), interjection 8 (0.993%), paraphrasing or reiteration 23 (2.854%), repetition 41 (5.087%), and quotation 48 (5.955%); sociopsychological factors 165 (20.471%), all attributed to language dominance; convenience 51 (6.328%).

Additional observations: English predominance among foreign CS likely reflects education policy prioritizing English since 1987 and school curriculum reforms from 2001. Dialect usage aligns with policy shifts toward dialect preservation and informal or private domain use. The dominance of insertion and of personal and identity-related factors suggests early-stage, non-balanced multilingualism.
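
The reported shares can be cross-checked with a short sketch that recomputes each percentage from the counts given above. The helper name `shares` and the dictionary names are hypothetical; the counts themselves are taken from the findings. Recomputed values may differ from the reported ones in the third decimal place, depending on how the original figures were rounded.

```python
def shares(counts, ndigits=3):
    """Return each category's percentage share of the total count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must sum to a positive total")
    return {label: round(100 * n / total, ndigits) for label, n in counts.items()}


# Counts as reported in the findings (806 CS words in total).
pattern_counts = {
    "insertion": 713,
    "backflagging": 66,
    "alternation": 27,
    "congruent lexicalisation": 0,
}
insertion_subtypes = {
    "word": 406, "phrase": 125, "discourse": 90,
    "letter": 51, "morpheme": 25, "clause": 16,
}
factor_counts = {
    "social roles and relationships": 331,
    "situational factors": 2,
    "message-intrinsic considerations": 257,
    "sociopsychological factors": 165,
    "convenience": 51,
}

# Consistency checks: subtypes should add up to the insertion total,
# and patterns and factors should both cover all 806 CS words.
subtype_total = sum(insertion_subtypes.values())
pattern_total = sum(pattern_counts.values())
factor_total = sum(factor_counts.values())

pattern_shares = shares(pattern_counts)
factor_shares = shares(factor_counts)

for label, pct in pattern_shares.items():
    print(f"{label}: {pct}%")
for label, pct in factor_shares.items():
    print(f"{label}: {pct}%")
```

Keeping the counts rather than the percentages as the primary data makes it easy to confirm that the pattern and factor breakdowns partition the same set of CS words.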

Discussion

Mandarin overwhelmingly dominated discourse, reflecting historical and contemporary language policies that elevated Mandarin’s status and usage. Non-Mandarin CS occurred mainly with English and, to a lesser extent, regional dialects, consistent with the educational emphasis on English and a policy environment that both standardizes Mandarin and supports dialectal heritage. The clear predominance of insertion (especially single-word insertions) alongside backflagging, the rarity of alternation, and the absence of congruent lexicalisation indicate non-balanced multilingualism among Mandarin-dominant speakers: speakers rely on a single matrix grammar (Mandarin) and insert lexical items from other codes, rather than maintaining the grammatical structures of multiple languages simultaneously. The analysis of influencing factors shows that participant-related considerations (identity signaling, addressee accommodation, and role alignment) are the primary triggers, followed by message-intrinsic needs (hedging, idioms, interjections, paraphrasing, repetition, quotations), language dominance, and convenience, with situational topic triggers playing only a minimal role. These patterns corroborate prior findings that insertion is common when proficiency in the embedded language is limited and that personal and identity dynamics strongly shape CS. The absence of congruent lexicalisation and the low frequency of alternation further underscore limited cross-code grammatical mastery and structural congruence in this speech community. Overall, the results support the applicability of Muysken’s typology and Ritchie and Bhatia’s framework (expanded by convenience) in the mainland Chinese context and suggest that Mandarin-speaking communities are in an early phase of multilingual development.

Conclusion

The study shows that Mandarin-dominated mainland Chinese speakers predominantly employ insertional CS, with English as the principal non-Mandarin code and regional dialects appearing to a lesser extent. Three CS patterns were identified—insertion (over 88% of CS), backflagging (about 8%), and alternation (about 3%)—while congruent lexicalisation was absent. Insertion chiefly involved single-word items, indicating reliance on a single grammatical frame and non-balanced multilingual competence. Participant-related factors were the primary drivers of CS, followed by message-intrinsic considerations, sociopsychological factors (language dominance), convenience, and situational factors. These patterns suggest that China’s Mandarin-speaking community is in a formative stage of multilingualism. Future research should include non–Mandarin-dominant speakers, larger and more diverse samples, and broader contextual and psychological variables to enhance generalizability and deepen understanding of multilingual language practices, policy implications, and social interactions.

Limitations

The study focuses on Mandarin-dominated speakers and a relatively small sample (16 videos), which may not comprehensively represent diverse CS practices across contexts. The methodology—curating existing online short videos instead of recording spontaneous conversations—limits ecological variety, even as it facilitates data collection. The sample size may be insufficient to thoroughly capture less frequent patterns (alternation, congruent lexicalisation). Additional macro-level sociocultural dynamics and micro-level psychological attitudes contributing to the observed early-stage multilingualism were not examined.
