A corpus-based interpretation of the discourse-cognitive-society triangle on Chinese court judgments

J. Wu, L. Cheng, et al.

A team of researchers, including Jingjing Wu, Le Cheng, and Yi Yang, examines Chinese court judgments using a corpus-based approach. Their findings trace the cognitive and social dimensions that shape judicial decisions, offering insights into legal text interpretation and decision-making in China.

Introduction

The study is situated within Critical Discourse Analysis (CDA) and van Dijk’s socio-cognitive approach (SCA), which views discourse as a form of social interaction mediated by cognition. Prior CDA/SCA work emphasizes how psychological representations and shared knowledge shape discourse and social practice. Against this background, the paper asks how Chinese court judgments can be described and interpreted through the interplay among discourse components (vocabulary, phrases, sentences), cognitive sources (faith, induction, paraphrase, inference), and social functions (citation, depiction, distance, summary). The purpose is to map and quantify links among linguistic form, cognitive sourcing, and social function in judicial texts, thereby clarifying how legal reasoning and authority are constructed in Chinese judgments. The study’s importance lies in making non-obvious socio-cognitive mechanisms empirically observable, informing interpretation and decision-making in Chinese courts.

Literature Review

The literature frames law as a social value system realized through language. CDA focuses on power, control, and ideological reproduction, but it has paid comparatively little attention to legal language. Comparative and genre-based studies show that the discourse styles of court judgments vary with judicial roles and social structures, exhibiting intertextuality and distinctive message distribution. Prior research on Chinese judgments addresses rhetorical structure, modality, evidentiality, authorial voice, reporting verbs, and computational prediction. SCA positions social cognition at macro- and micro-levels, enabling analysis of legal texts’ deep rules by linking discourse features to shared knowledge and ideology. This study extends these lines by integrating corpus linguistics with SCA to empirically connect discourse components, cognitive sources, and social functions in Chinese court judgments.

Methodology

Design: Corpus-based CDA within SCA. Data: A 1.74-million-word corpus of Chinese court judgments (CCJ) drawn from the Supreme People’s Court (SPC) and local courts (China Court Network/China Judgments Online), comprising 54 civil, 57 criminal, and 5 administrative judgments. Tools and procedures:

  • Automatic semantic tagging and retrieval using Wmatrix 3.0 (USAS tagset) to identify expressions indicating cognitive sources and social functions.
  • Frequency lists and concordances via ConcGram 1.0; supplemented by manual labeling and calculation.
  • Variables: Discourse components at three levels: vocabulary (speech, sensory, cognitive, and modal verbs; adverbs categorized by confidence as high/medium/low), phrases (prepositional phrases, verb–object phrases), and sentences (causal and conditional clauses). Cognitive sources: faith (laws, regulations, evidence), induction, paraphrase, inference. Social functions: citation, depiction, distance, summary. The key variables and their definitions are operationalized from prior work.
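  • Statistical models: Linear regression models (Stata 16) take each social function as the dependent variable and the cognitive sources and discourse components as independent variables (Models 1–4). Further models examine faith’s sub-sources (law, regulation, evidence) together with discourse components as predictors of citation (Models 5–10), and use fine-grained discourse structures to predict social functions and cognitive sources (Models 11–18). The fine-grained structures are subject+verb constructions with first-person, third-person, or unknown subjects (Fps+V, Tps+V, Ups+V); high-, medium-, and low-confidence adverbs (H-ad and M-ad denote the first two); prepositional and verb–object phrases (Pre-P, Ver-P); and conditional and causal clauses (Con-C, Cau-C).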
  • Multicollinearity: assessed by variance inflation factors (VIF; SPSS 23), with all VIFs < 10.
  • Robustness: Tobit models confirm the significance patterns.
  • Interpretive framework: van Dijk’s Discourse–Cognition–Society triangle guides the linking of linguistic form to cognitive sourcing and social function in judgments.
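
The regression and VIF steps listed above can be sketched in a few lines of Python. This is an illustrative stand-in, not the authors' workflow (they report using Stata 16 and SPSS 23). The data are synthetic, the column names `fai`, `phr`, `sen`, and `cit` are hypothetical, the coefficients 2.0 and 0.8 are arbitrary, and treating each of the 116 judgments as one observation is an assumption the summary does not confirm.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(xs, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in xs]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

def vif(xs, j):
    """Variance inflation factor of predictor j: 1 / (1 - R^2 of x_j on the others)."""
    target = [r[j] for r in xs]
    others = [[v for i, v in enumerate(r) if i != j] for r in xs]
    _, r2 = ols(others, target)
    return 1.0 / (1.0 - r2)

# Synthetic data only; nothing here comes from the CCJ corpus.
rng = random.Random(0)
n = 116  # assumption: one row per judgment (54 + 57 + 5)
fai = [rng.random() for _ in range(n)]  # hypothetical "faith" share
phr = [rng.random() for _ in range(n)]  # hypothetical "phrases" share
sen = [rng.random() for _ in range(n)]  # hypothetical "sentences" share
cit = [0.5 + 2.0 * f + 0.8 * p + rng.gauss(0, 0.1) for f, p in zip(fai, phr)]

X = list(zip(fai, phr, sen))
beta, r2 = ols(X, cit)                  # analogue of a citation model
vifs = [vif(X, j) for j in range(3)]    # one VIF per predictor

print("coefficients (intercept, fai, phr, sen):", [round(b, 3) for b in beta])
print("R^2:", round(r2, 3))
print("VIFs:", [round(v, 3) for v in vifs])
```

Because the synthetic predictors are drawn independently, the VIFs should sit close to 1, well under the threshold of 10 that the study reports. A Tobit robustness check would need a censored-regression likelihood and is left out of this sketch.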

Key Findings

  • Three overarching findings: (1) The discourse dimension of Chinese judgments is both society-oriented and cognition-oriented; specific discourse components mark cognitive sources. (2) Faith (law, regulation, evidence) belongs to social cognition; induction and paraphrase reflect personal cognition; inference can transform personal cognition into social consensus via reasoning. (3) Social functions correspond to cognitive sources and are realized through surface discourse structures.
  • Regression highlights (Models 1–4):
    • Faith positively predicts citation (Cit); phrases (Phr) also positively affect Cit.
    • Induction (Ind) and paraphrase (Par) positively predict depiction (Dep).
    • Par positively predicts distance (Dis).
    • Inference (Inf) positively predicts summary (Sum); phrases (Phr) and sentences (Sen) also positively affect Sum.
    • No problematic multicollinearity was detected (all VIFs < 10); the Tobit robustness checks yield consistent significance patterns.
  • Faith sub-sources and citation (Models 5–10): Law, regulation, and evidence (Evi) each significantly and positively predict citation; phrases and sentences often co-occur in citations. In detailed models, Pre-P and Cau-C are linked to citations of law and evidence, respectively.
  • Fine-grained discourse structures (Models 11–18):
    • Citation: Prepositional phrases (Pre-P) and causal clauses (Cau-C) positively predict Cit.
    • Depiction: First-person subject+verb (Fps+V) and high-confidence adverbs (H-ad) positively predict Dep; unknown-subject constructions (Ups+V) negatively affect Dep.
    • Distance: Third-person subject+verb (Tps+V) positively predicts Dis; conditional clauses (Con-C) negatively affect Dis.
    • Summary: Pre-P, verb–object phrases (Ver-P), and Cau-C positively predict Sum.
    • Cognitive sources: Pre-P and Ver-P positively predict faith (Fai); Fps+V and H-ad positively predict induction; Tps+V, Ups+V, H-ad, and M-ad positively predict paraphrase; Pre-P, Ver-P, Con-C, and Cau-C positively predict inference.
  • Distributional/descriptive statistics:
    • Subject-verb structures: Tps+V 90.88%, Fps+V 3.64%, Ups+V 5.48% of verb constructs conveying socio-cognitive meaning.
    • Adverbs: high-confidence items (e.g., kending ‘certainly’) are the most frequent among evidential adverbs.
    • Phrases: Pre-P (e.g., genju/anzhao ‘in accordance with’) 5004 tokens; Ver-P (e.g., ‘identify the fact that…’) 3880 tokens; Pre-P ≈55.26% and Ver-P ≈44.74% of socio-cognitive phrases.
    • Clauses: Causal clauses (yin/gu/yinci) ≈87.49% vs conditional (ruguo) ≈12.51% of socio-cognitive clauses.
    • Proportions of discourse devoted to cognitive sources: Faith 22.12%; Induction 1.59%; Paraphrase 10.12%; Inference 11.71%. Within faith-related content: law ≈4.81% of total words; regulation ≈1.6%; evidence ≈15.68%.
  • Qualitative exemplification aligns with quantitative patterns: speech verbs mark paraphrase; sensory verbs combined with first-person subjects mark induction; modal verbs and Pre-P support inference and faith; causal/conditional clauses structure inferential reasoning and procedural guidance.
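
The reported proportions can be cross-checked with simple arithmetic. The sketch below uses only the percentages quoted above and assumes that the cognitive-source shares and the faith sub-source shares are measured against the same base (total corpus words); the variable names are illustrative.

```python
# Percentages as quoted in the findings above.
cognitive_sources = {"faith": 22.12, "induction": 1.59,
                     "paraphrase": 10.12, "inference": 11.71}
faith_sub_sources = {"law": 4.81, "regulation": 1.6, "evidence": 15.68}

# Share of the corpus attributed to any cognitive source.
total_sourced = round(sum(cognitive_sources.values()), 2)

# Faith rebuilt from its three sub-sources, and the gap to the reported figure.
faith_from_parts = round(sum(faith_sub_sources.values()), 2)
gap = round(cognitive_sources["faith"] - faith_from_parts, 2)

print(total_sourced, faith_from_parts, gap)
```

Under that assumption, the four cognitive sources together account for 45.54% of the corpus, and the faith sub-sources sum to 22.09%, about 0.03 percentage points below the reported 22.12%, a gap that may simply reflect rounding of the sub-source figures.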

Discussion

The findings empirically demonstrate the SCA linkage between linguistic form, cognitive sourcing, and social function in Chinese court judgments. Discourse components systematically signal underlying cognitive sources: citations rely on socially shared knowledge (faith) expressed through legal references and evidence; depictions draw on personal experience (induction) and reported accounts (paraphrase); distance is achieved through paraphrastic marking that separates judicial voice from quoted voices; summaries crystallize inferential reasoning using causal and conditional structures. These patterned relationships clarify how judges build authority, manage stance and responsibility, and translate personal and intertextual inputs into socially validated conclusions. The results show how personal cognition (induction/paraphrase) is integrated and, via inference, aligned with social cognition (faith), thereby producing decisions that both reflect and reproduce legal-social norms. This advances legal discourse interpretation by making the non-obvious socio-cognitive interface measurable and reproducible across a large corpus.

Conclusion

The study contributes a corpus-based socio-cognitive account of Chinese court judgments, showing: (1) discourse components at lexical, phrasal, and clausal levels reliably mark cognitive sources; (2) faith (law, regulation, evidence) represents social cognition, while induction and paraphrase represent personal cognition, and inference mediates the transformation from personal to social cognition; (3) social functions (citation, depiction, distance, summary) correspond to these cognitive sources and are realized through characteristic discourse structures. Practically, this mapping supports more transparent interpretation of Chinese judicial texts and clarifies how evidentiality and reasoning are linguistically encoded. Future research should deepen the cognitive dimension by incorporating audience-specific reception (parties, practitioners, public), explore inter-semiotic operations in judgment interpretation, and develop larger shared corpora to enhance generalizability.

Limitations

The study focuses on linguistic markers without modeling differences across audiences (litigants, legal professionals, public), though audience understanding is crucial for judgments. While the corpus is large, broader shared corpora would further improve external validity. The analysis centers on textual discourse; inter-semiotic aspects (e.g., multimodal presentations, translations) are acknowledged but not fully examined. Legal discourse’s opacity and self-referentiality may also limit direct transferability of findings across jurisdictions or genres.
