The relationship between L2 vocabulary knowledge and reading proficiency: The moderating effects of vocabulary fluency

Y. Tong, Z. Hasim, et al.

Explore the fascinating links between vocabulary knowledge and reading proficiency in EFL learners! This study by Yanli Tong, Zuwati Hasim, and Huzaina Abdul Halim reveals how receptive vocabulary significantly impacts reading skills, highlighting the crucial role of vocabulary fluency. Dive into the findings that could reshape vocabulary assessment and instruction.
Introduction

Lexical knowledge is central to ESL/EFL reading proficiency, yet vocabulary is complex and multifaceted. Prior work distinguishes vocabulary breadth (how many words are known) and depth (how well words are known), with most studies focusing on receptive word knowledge in relation to reading. Far fewer studies consider productive word knowledge alongside receptive knowledge, and vocabulary fluency (speed of recognition, retrieval, and production) is rarely modeled despite theoretical importance. Chinese EFL learners often memorize word lists yet struggle to recognize and use vocabulary in context, underscoring the need to clarify which dimensions of vocabulary most support reading and how fluency influences these relations. The study aims to examine: (1) how important the four dimensions of L2 vocabulary knowledge (receptive breadth, receptive depth, productive breadth, productive depth) are for reading proficiency; (2) the extent to which reading proficiency can be predicted by L2 vocabulary knowledge; and (3) whether vocabulary fluency functions better as an independent predictor or as a moderator of the vocabulary–reading relationship.

Literature Review

Two main approaches to vocabulary knowledge are discussed: (a) developmental/cumulative (e.g., VKS) modeling staged growth from unfamiliarity to productive use; and (b) dimensional/components approaches that parse knowledge into breadth, depth, and fluency (e.g., Nation’s framework distinguishing form, meaning, use, each with receptive and productive aspects). Breadth reflects number of known form–meaning links; depth encompasses phonology, morphology, syntax, semantics, collocation, associations, etc.; fluency concerns speed/automaticity of accessing and using lexical knowledge. Prior research commonly links receptive breadth and depth to reading, with reported correlations ranging from large (e.g., Qian, 2002: r≈0.74–0.77; Stæhr, 2008: r≈0.83 for receptive breadth–reading) to more moderate in later meta-analyses (Zhang & Zhang, 2020: r≈0.57). Productive knowledge has been less examined; Cheng & Matthews (2018) reported productive breadth related most strongly to reading (r=0.57), provoking debate given reading’s receptive nature. Fluency has theoretical status equal to meaning input/output (Nation, 2020; Schmitt, 2014) but is rarely studied; Li & Zhang (2019) found breadth (β=0.36), depth (β=0.17), and a negative fluency coefficient (β=−0.22) for L2 listening, motivating treatment of fluency as a moderator rather than an independent predictor. Vocabulary testing formats matter: recognition-based tests (VLT/VST) can inflate size estimates due to guessing; recall-based formats (written form- or meaning-recall) are argued to predict reading more robustly (Laufer & Goldstein, 2004; McLean et al., 2020; Stoeckel et al., 2021; Stewart et al., 2021). Depth measures such as WAT target limited aspects (collocation, polysemy), suggesting tailored instruments aligned to research aims.
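The concern that recognition-based formats can inflate size estimates through guessing can be illustrated with simple arithmetic. The sketch below assumes a hypothetical multiple-choice test in which every unknown item is answered by blind guessing among equally likely options; the function name, item count, and option count are illustrative assumptions and are not taken from the VLT, the VST, or the study.

```python
def expected_recognition_score(n_items: int, n_known: int, n_options: int) -> float:
    """Expected raw score when every unknown item is guessed blindly.

    Assumption (not from the study): each unknown item has a 1/n_options
    chance of being answered correctly by chance alone.
    """
    if not 0 <= n_known <= n_items:
        raise ValueError("n_known must lie between 0 and n_items")
    if n_options < 2:
        raise ValueError("a recognition item needs at least two options")
    n_unknown = n_items - n_known
    return n_known + n_unknown / n_options


# Hypothetical learner who truly knows 60 of 100 tested words, 4 options per item.
true_known = 60
score = expected_recognition_score(n_items=100, n_known=true_known, n_options=4)
inflation = score - true_known  # points gained purely by chance
print(f"expected score: {score}, inflation from guessing: {inflation}")
```

A recall format has no options to guess from, so under the same assumption the expected score would stay at the number of words actually known.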

Methodology

Design: Quantitative study using confirmatory factor analysis (CFA), structural equation modeling (SEM), and Ping’s Single Indicator Interaction method to test moderation by vocabulary fluency.

Participants: 312 Chinese EFL university students (mean age 20.5) recruited from one university in Southeast China via purposive then simple random sampling (six classes from six majors: Chinese, preschool education, business, computer, biology, chemistry). None had overseas experience. Sampling across majors aimed to mitigate single-site limitations rather than compare majors.

Instruments:

  • Receptive vocabulary breadth (VLT adapted to meaning-recall): Based on Schmitt et al. (2001) VLT but converted to written meaning-recall. Levels at 2k, 3k, 5k, 10k, Academic; 30 items/level; target word embedded in an English sentence; test-takers translate the target into Chinese. Max score 150.
  • Receptive vocabulary depth (VDT): Adapted from Read (1998) to assess word parts (derivational POS), multiple meanings (with initials provided), and collocations—chosen as prerequisite depth aspects for Chinese learners. Max score 160 (2 points for two parts of speech, 1 for multiple meaning, 1 for collocation per item).
  • Productive vocabulary breadth (PLT, form-recall): Adapted from Laufer & Nation (1999) with recall format (Laufer & Goldstein, 2004). Levels at 2k, 3k, 5k, 10k, Academic; 18 items/level; L1 meaning given, write L2 form. Max score 90.
  • Productive vocabulary depth (PVDT, Definition Completion Test): Following Read (1995). 20 target words drawn from IELTS reading (5 nouns, 5 verbs, 5 adjectives, 5 adverbs). For each, provide a definition and an example sentence to elicit word parts, associations, collocations, and sentence structure. Max score 80 (2 points definition; 2 points sentence: 1 for structure, 1 for correct collocation/phrase; 0.5 deducted for misspelling).
  • Vocabulary fluency test (VFT): Timed dictation-based fill-in tasks using IELTS-style lecture passages to capture speed of recognition, retrieval, and written production under listening conditions (four passages, 20 targets each). Max score 80.
  • Reading proficiency: IELTS Academic Reading (three passages, 40 questions across multiple formats). Max score standardized to 100 (2.5 points/question).

Procedure: Paper-based administration in three lecture halls. Four vocabulary tests: 35 minutes total; VFT: 25 minutes; IELTS Reading: 60 minutes.

Scoring: As per instrument maxima above.

Data analysis: CFA to confirm measurement properties; composite reliability (CR) > 0.7 and average variance extracted (AVE) > 0.5 were achieved. Discriminant validity was supported (square root of AVE > inter-construct correlations). SEM (AMOS 24.0) was used to estimate relationships. Two SEMs were tested: one with VFT as an independent predictor and one without VFT. Moderation was tested via Ping’s Single Indicator Interaction for each vocabulary dimension × fluency on reading. Model fit indices were acceptable:
  • With VFT: χ²/df=2.095, GFI=0.878, CFI=0.942, TLI=0.933, RMSEA=0.059, SRMR=0.045.
  • Without VFT: χ²/df=2.415, GFI=0.883, CFI=0.940, TLI=0.930, RMSEA=0.067, SRMR=0.045.
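The measurement checks named in the data analysis (CR > 0.7, AVE > 0.5, and the square root of AVE exceeding inter-construct correlations) follow widely used formulas that can be sketched in a few lines. The loadings and the correlation below are hypothetical values chosen for illustration; the study's own loadings are not reported in this summary, and the function names are not from AMOS.

```python
import math


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized loadings, so each error variance is 1 - loading^2.
    """
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def discriminant_validity_ok(ave_a, ave_b, corr_ab):
    """Fornell-Larcker style check: sqrt(AVE) of both constructs exceeds |r|."""
    return min(math.sqrt(ave_a), math.sqrt(ave_b)) > abs(corr_ab)


# Hypothetical standardized loadings for two constructs (not the study's values).
loadings_a = [0.80, 0.75, 0.70]
loadings_b = [0.85, 0.80, 0.78]

cr_a = composite_reliability(loadings_a)
ave_a = average_variance_extracted(loadings_a)
ave_b = average_variance_extracted(loadings_b)

print(f"CR(A)  = {cr_a:.3f}")
print(f"AVE(A) = {ave_a:.3f}, AVE(B) = {ave_b:.3f}")
print("discriminant validity at r=0.60:", discriminant_validity_ok(ave_a, ave_b, 0.60))
```

With these illustrative loadings, construct A clears both thresholds, and the discriminant check passes at r=0.60 but would fail at r=0.80, since that correlation exceeds the smaller square root of AVE.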
Key Findings
  • Receptive vs. productive vocabulary: Receptive lexical knowledge related more strongly to reading than productive knowledge.
  • Depth vs. breadth: Receptive vocabulary depth showed the strongest link to reading. Pearson correlations with reading (all p<0.001): VDT r=0.513; VLT r=0.480; PLT r=0.356; PVDT r=0.364; VFT r=0.275.
  • SEM standardized effects (model without VFT): VDT→Reading β=0.38; VLT→Reading β=0.33; PLT→Reading β=0.20; PVDT→Reading β=0.19.
  • VFT as independent predictor: Non-significant and negative (β=−0.02), indicating it should not be modeled as a direct predictor of reading.
  • Predictive power: Combined vocabulary knowledge explained approximately 49% of the variance in reading scores (R²≈0.49).
  • Moderation by fluency: Vocabulary fluency significantly moderated the relationships between vocabulary dimensions and reading (interaction terms significant in Ping’s Single Indicator Interaction analyses), indicating that higher fluency strengthens the impact of vocabulary knowledge on reading proficiency.
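The summary does not spell out how the interaction terms were specified. As a rough sketch, the single-indicator approach usually attributed to Ping (1995) forms one product indicator from the summed, mean-centered indicators of the two constructs and fixes its loading and error variance from first-stage measurement-model estimates. The function name and all numeric inputs below are hypothetical and are not the study's estimates.

```python
def ping_single_indicator(loadings_x, errors_x, var_x, loadings_z, errors_z, var_z):
    """Fixed loading and error variance for the product indicator of X*Z.

    Assumed inputs (unstandardized estimates from a first-stage measurement
    model with mean-centered indicators):
      loadings_*: factor loadings of each construct's indicators
      errors_*:   error variances of those indicators
      var_*:      variance of the latent construct
    """
    lam_x, lam_z = sum(loadings_x), sum(loadings_z)
    theta_x, theta_z = sum(errors_x), sum(errors_z)

    # Loading of the product indicator on the latent interaction.
    loading_xz = lam_x * lam_z

    # Error variance of the product indicator.
    error_xz = (
        lam_x ** 2 * var_x * theta_z
        + lam_z ** 2 * var_z * theta_x
        + theta_x * theta_z
    )
    return loading_xz, error_xz


# Hypothetical estimates for a vocabulary construct (X) and fluency (Z).
loading_xz, error_xz = ping_single_indicator(
    loadings_x=[1.0, 0.9], errors_x=[0.3, 0.4], var_x=0.5,
    loadings_z=[1.0, 0.8], errors_z=[0.2, 0.3], var_z=0.6,
)
print(f"fixed loading: {loading_xz:.4f}, fixed error variance: {error_xz:.4f}")
```

In the second stage these two values would be fixed in the structural model, and moderation is supported when the path from the latent interaction to reading is significant.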
Discussion

Findings address the research questions by showing that receptive vocabulary, particularly depth (knowledge of word parts, multiple meanings, collocations), is more critical for L2 reading than productive vocabulary measures, aligning with reading’s receptive nature. Differences from earlier large correlation estimates likely reflect test format effects; recognition-based breadth tests (e.g., VLT/VST) may inflate size estimates via guessing, whereas recall-based formats used here yield moderate, arguably more realistic, associations. The predictive contribution of vocabulary knowledge to reading (≈49% variance explained) underscores vocabulary’s centrality in reading performance for Chinese EFL learners. Fluency did not act as a standalone predictor but played a significant moderating role, consistent with theoretical views of fluency as facilitating the transition from receptive to productive use—acting as a “booster” that enhances the effect of underlying knowledge on performance. Pedagogically, prioritizing receptive depth and breadth, then systematically advancing toward productive use while cultivating fluency, may optimize reading development. The results also suggest that assessment practices should incorporate depth-sensitive and recall-based measures to better align with reading proficiency.
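The ≈49% figure can be given a rough consistency check against the reported coefficients. For an ordinary regression on standardized observed variables, R² equals the sum of each standardized β multiplied by the predictor's correlation with the outcome. The study's β values come from an SEM and the r values are Pearson correlations between test scores, so the identity is applied here only as an approximation, not as the authors' calculation.

```python
# Standardized SEM effects on reading (model without VFT), as reported above.
betas = {"VDT": 0.38, "VLT": 0.33, "PLT": 0.20, "PVDT": 0.19}

# Pearson correlations of each vocabulary test with reading, as reported above.
correlations = {"VDT": 0.513, "VLT": 0.480, "PLT": 0.356, "PVDT": 0.364}

# R^2 = sum over predictors of beta_j * r_jy (exact only for standardized OLS).
contributions = {name: betas[name] * correlations[name] for name in betas}
r_squared = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name:>4}: {value:.4f}")
print(f"approximate R^2: {r_squared:.3f}")
```

The sum comes to about 0.49, in line with the reported variance explained, and the two receptive measures account for most of it.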

Conclusion

The study demonstrates differential contributions of L2 vocabulary dimensions to IELTS reading proficiency: receptive depth exerts the strongest effect, followed by receptive breadth; productive breadth and depth show smaller but meaningful effects. Overall vocabulary knowledge explains roughly half of the variance in reading performance. Vocabulary fluency is best conceptualized as a moderator that amplifies the influence of vocabulary knowledge on reading, rather than as a direct predictor. Contributions include (a) clarifying the relative weight of receptive versus productive vocabulary knowledge for reading, (b) highlighting the value of depth-oriented and recall-based assessments, and (c) empirically establishing fluency’s moderating role. Future research should: (1) recruit multisite and more diverse samples to enhance generalizability; (2) broaden receptive and productive depth measures to encompass additional components (e.g., derivational morphology breadth, semantic networks, register/constraints on use); and (3) further refine and validate fluency measures and their interaction with knowledge components across skills and proficiency levels.

Limitations
  • Single-site sample from one university, though across six majors, limits generalizability; multisite sampling is recommended.
  • The receptive depth test covered limited aspects (two word-part categories, one multiple meaning, one collocation) and did not include all theoretically relevant dimensions of depth; broader depth coverage is needed in future studies.