Linguistics and Languages

The developmental trajectories of L2 lexical-semantic networks

X. Feng and J. Liu

This study by Xuefang Feng and Jie Liu examines how the lexical-semantic networks of Chinese EFL learners change with proficiency. Advanced learners develop denser and more interconnected networks, while less frequent, weakly connected words remain prone to loss.

Introduction

The study examines how lexical relations in the L2 mental lexicon change with proficiency. Prior research using free association shows that as proficiency increases, semantic associations rise while formal and syntagmatic associations decrease, suggesting evolving organization of lexical relations. Network analysis offers tools to characterize such structures. Existing L2 work indicates small-world and scale-free properties and sparser networks than L1, but lacks empirical comparisons across L2 proficiency levels. The present study compares macro- and meso-level structures of English lexical-semantic networks (occupation category) for Chinese EFL undergraduates (intermediate) and graduates (advanced), asking whether networks become more connected/structured and differently partitioned with proficiency. Hypotheses: (1) the graduates’ network has a higher average degree, density, and clustering coefficient, a shorter average path length, and lower centralization than the undergraduates’ network; (2) the graduates’ network has stronger clustering, reflected in a higher average cluster density.

Literature Review

The paper reviews network science concepts (size, degree, density, clustering coefficient, average shortest path length, centralization) and their application to language networks. In lexical-semantic networks, words are nodes and edges reflect semantic relations inferred from semantic fluency tasks (co-occurrence proximity). Prior L1 studies show developmental changes: increased average degree and reduced path length with maturation; small-world and scale-free structures are common. L2 networks are also small-world and scale-free but typically sparser than L1. A vocabulary-growth study suggests L2 network structures evolve from discrete to more complete, with increased small-world/scale-free properties. Gaps remain: limited empirical testing of L2 network growth principles across proficiency levels and insufficient meso-/word-level analyses to capture dynamics of community structure and individual word behavior.
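
The macro-level measures named above can be illustrated on a toy network. The following is a minimal sketch, assuming an unweighted, undirected graph stored as an adjacency dictionary and Freeman's degree centralization; the function name `macro_metrics` and the five-word example are hypothetical and are not taken from the study, which computed its measures on weighted networks with UCINET.

```python
from collections import deque
from itertools import combinations


def macro_metrics(adj):
    """Compute macro-level measures for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours. Assumes at least
    three nodes and at least one edge (illustrative sketch only).
    """
    n = len(adj)
    degrees = {v: len(nb) for v, nb in adj.items()}
    m = sum(degrees.values()) / 2              # number of edges
    avg_degree = 2 * m / n
    density = 2 * m / (n * (n - 1))

    def local_cc(v):
        # Share of neighbour pairs that are themselves connected.
        nb = adj[v]
        k = len(nb)
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    clustering = sum(local_cc(v) for v in adj) / n

    # Average shortest path length over reachable ordered pairs (BFS).
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    avg_path = total / pairs

    # Freeman degree centralization: 1.0 for a star, 0.0 for a regular graph.
    kmax = max(degrees.values())
    centralization = sum(kmax - k for k in degrees.values()) / ((n - 1) * (n - 2))

    return {
        "avg_degree": avg_degree,
        "density": density,
        "clustering": clustering,
        "avg_path": avg_path,
        "centralization": centralization,
    }


# Hypothetical toy network: "teacher" is the hub, "doctor" and "nurse" are linked.
toy = {
    "teacher": {"doctor", "nurse", "engineer", "actor"},
    "doctor": {"teacher", "nurse"},
    "nurse": {"teacher", "doctor"},
    "engineer": {"teacher"},
    "actor": {"teacher"},
}
metrics = macro_metrics(toy)
print(metrics)
```

In this toy case the hub gives a short average path length (1.5) and high centralization (about 0.83), which is the kind of pattern the measures are meant to distinguish from a more evenly connected network.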

Methodology

Participants: 200 Chinese EFL learners from a central province in China, divided into 100 undergraduates (freshmen, ages 18–20, engineering majors; intermediate proficiency) and 100 English-major graduates (ages 22–24; translation/interpretation majors; >28 hours/week in English; all passed TEM8; advanced proficiency). All began English around age 9; no >1 year residence in an English-speaking country. Oral informed consent obtained.
Task: L2 semantic fluency in the category of occupation; written/typed online; 1-minute time limit.
Preprocessing: removed non-occupation and Chinese responses; ambiguous misspellings removed; obvious misspellings corrected; punctuation/particles removed; lowercasing; lemmatization. Totals retained: undergraduates 797 responses (96.2% of total), graduates 1060 (99.1%).
Network construction: Two-mode (Student×Response) matrices constructed, then projected to one-mode (Response×Response) co-occurrence matrices using UCINET 6. Nodes are responses; weighted edges reflect co-occurrence within lists. Networks visualized with VOSviewer 1.6.17.
Community detection: VOSviewer clustering maximizing a modularity-like function V(C) with resolution parameter γ=1 (coarse-grain clustering). Clusters analyzed for size, density, and membership.
Metrics: Macro-level measures included average degree, density, clustering coefficient, average shortest path length, centralization. For comparability, analyses focused on nodes with occurrence frequency >2 (yielding 66 nodes undergraduates; 86 nodes graduates).
Statistics: Normality assessed via Shapiro–Wilk and Q–Q plots. t-tests used for normally distributed measures (e.g., number of responses per participant). Non-parametric Mann–Whitney U tests for degree and clustering coefficient differences between networks. Wilcoxon Signed Ranks tests for paired comparisons of common words across networks (frequency and degree changes). Centralization significance assessed by simulating 100 Erdős–Rényi random networks matched in size and density/edges and comparing observed vs expected means.
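
The projection from participant lists to a response co-occurrence network can be sketched as follows. This is a minimal illustration, assuming that two responses are linked whenever the same participant produced both, with the edge weight counting such participants; the function names, the `min_freq` parameter, and the three toy lists are hypothetical, and the study itself performed this step in UCINET 6.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(lists, min_freq=0):
    """Project participant response lists to weighted response-by-response edges.

    lists: one list of responses per participant.
    min_freq: keep only responses produced by more than min_freq participants
              (the study reports analysing nodes with frequency > 2).
    Returns a Counter mapping sorted word pairs to co-occurrence counts.
    """
    # Frequency = number of participants who produced the word.
    freq = Counter(word for responses in lists for word in set(responses))
    keep = {word for word, count in freq.items() if count > min_freq}

    weights = Counter()
    for responses in lists:
        unique = sorted(set(responses) & keep)
        for a, b in combinations(unique, 2):
            weights[(a, b)] += 1
    return weights


def weighted_degree(weights):
    """Sum of edge weights attached to each node."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree


# Hypothetical fluency lists from three participants.
lists = [
    ["teacher", "doctor", "nurse"],
    ["teacher", "doctor"],
    ["teacher", "engineer"],
]
edges = cooccurrence_network(lists)
degree = weighted_degree(edges)
print(edges)
print(degree)
```

This whole-list co-occurrence rule ignores the order in which words were produced. A proximity-based rule (linking only words produced close together in a list) would need a window parameter that the summary does not specify.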

Key Findings

Fluency output: Graduates produced more responses per participant than undergraduates (10.60 vs 7.97; t=6.782, df=198, p<0.001). Unique words after deduplication: graduates 191, undergraduates 140.
Network size and clusters (full visualization): Graduates’ network is larger and more spread-out; central word in both is "teacher" (links: undergraduates 135; graduates 188). More clusters detected in graduates’ network (16) than undergraduates’ (13).
Macro-level comparison (nodes with frequency >2: undergraduates n=66; graduates n=86):
- Average degree: undergraduates 58.12; graduates 103.83 (significant at p<0.01).
- Density D: undergraduates 0.561; graduates 1.222.
- Clustering coefficient CC: undergraduates 1.691; graduates 3.056 (p<0.01).
- Average shortest path length L: undergraduates 1.558; graduates 1.553.
- Centralization C(%): undergraduates 15.49; graduates 11.48 (p<0.01).
- Random network benchmarks: Lrandom 1.558 (undergrads), 1.553 (grads); CCrandom 0.448 (undergrads), 0.441 (grads); Crandom 14.55 (undergrads), 13.41 (grads). In both networks CC is far higher than CCrandom while L is close to Lrandom, the pattern that defines a small-world network.
- Mann–Whitney U: degree z=-3.997, p<0.001; clustering coefficient z=-7.672, p<0.001, indicating higher connectivity and clustering in graduates’ network.
- Centralization comparison via simulations: graduates significantly less centralized (t=-8.445, p<0.001).
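
The random-network benchmark behind the centralization comparison can be sketched as below. This is a rough illustration, assuming unweighted graphs, Freeman's degree centralization, and the fixed-edge-count G(n, m) variant of the Erdős–Rényi model. The function names and the example sizes (10 nodes, 9 edges) are hypothetical, since the summary does not report the number of edges in either network.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def degree_centralization(n, edges):
    """Freeman degree centralization of an undirected graph on nodes 0..n-1 (n >= 3)."""
    degree = [0] * n
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    kmax = max(degree)
    return sum(kmax - k for k in degree) / ((n - 1) * (n - 2))


def random_graph(n, m, rng):
    """Draw exactly m distinct edges uniformly at random (G(n, m) model)."""
    return rng.sample(list(combinations(range(n), 2)), m)


def benchmark(n, m, runs=100, seed=0):
    """Mean and standard deviation of centralization over `runs` random graphs."""
    rng = random.Random(seed)
    values = [degree_centralization(n, random_graph(n, m, rng)) for _ in range(runs)]
    return mean(values), stdev(values)


# Hypothetical observed network: a star, the most centralized shape possible.
star_edges = [(0, i) for i in range(1, 10)]
observed = degree_centralization(10, star_edges)

# Expected centralization for random graphs of the same size and edge count.
expected_mean, expected_sd = benchmark(n=10, m=9, runs=100, seed=0)
print(observed, expected_mean, expected_sd)
```

An observed value can then be compared with the simulated distribution, for example with a one-sample t-test against the simulated mean. The exact test configuration used in the study is not given in this summary.
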
Community structure (7 clusters analyzed in each network):
- All words in graduates’ network belong to cohesive clusters; undergraduates have three isolated single/dual-word clusters (accountant, banker, typist).
- Average cluster density: graduates 2.453 vs undergraduates 1.960; variance of cluster densities: graduates 1.515 vs undergraduates 1.700, indicating more even distribution of connectivity in graduates’ network.
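
How an average cluster density and its variance might be derived from a clustered, weighted network is sketched below. The summary does not define cluster density. The sketch assumes it is the sum of within-cluster edge weights divided by the number of possible within-cluster pairs (which allows values above 1, as reported), and it uses the sample variance. The function name, the toy weights, and the cluster assignments are all hypothetical.

```python
from statistics import mean, variance


def cluster_density(members, weights):
    """Weighted density of one cluster (assumed definition, see lead-in).

    members: iterable of words in the cluster.
    weights: dict mapping (word_a, word_b) pairs to edge weights.
    """
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0  # a single-word cluster has no internal pairs
    internal = sum(w for (a, b), w in weights.items() if a in members and b in members)
    return internal / (n * (n - 1) / 2)


# Hypothetical weighted edges and cluster assignments.
weights = {
    ("doctor", "nurse"): 3,
    ("doctor", "surgeon"): 2,
    ("nurse", "surgeon"): 1,
    ("professor", "teacher"): 1,
    ("nurse", "teacher"): 1,  # between-cluster edge, ignored by both clusters
}
clusters = {
    1: ["doctor", "nurse", "surgeon"],
    2: ["teacher", "professor"],
}

densities = [cluster_density(members, weights) for members in clusters.values()]
print(densities, mean(densities), variance(densities))
```

A lower variance across clusters, as reported for the graduates’ network, would mean that connectivity is spread more evenly over communities rather than concentrated in a few dense ones.
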
Overlap and word-level dynamics:
- 47 words common to both networks, constituting 71.21% of undergraduates’ word set. These common words have higher generation frequency (z=3.454, p=0.001) and degree (z=-3.089, p=0.002) than the 19 undergraduates-only words.
- For the 47 common words, increases from undergraduates to graduates are significant: frequency (61.70% increased; Wilcoxon z=-3.044, p=0.002) and degree (z=-5.709, p<0.001). Largest degree increases: teacher +389, doctor +356, nurse +284, engineer +194, actor +188.
- Words lost from undergraduates’ to graduates’ network are typically low-frequency and sparsely connected; exception: cellist (freq 16, degree 80) present in undergraduates but absent in graduates, possibly due to recency and lack of consolidation. Words in dense clusters (e.g., undergraduates’ Cluster 2) tend to be preserved, even if low-frequency.
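
The overlap and word-level comparisons above amount to set operations on the two vocabularies plus paired differences for the shared words. The sketch below is illustrative only: the function names and the toy degree values are hypothetical and are not the study's data, although the final line reproduces the reported share (47 of 66 words is 71.21%).

```python
def overlap_summary(words_a, words_b):
    """Split two vocabularies into shared and network-specific words."""
    a, b = set(words_a), set(words_b)
    common = a & b
    return {
        "common": common,
        "only_a": a - b,
        "only_b": b - a,
        "share_of_a": len(common) / len(a),
    }


def degree_changes(degree_a, degree_b):
    """Degree change (b minus a) for words in both networks, largest increase first."""
    common = set(degree_a) & set(degree_b)
    changes = [(word, degree_b[word] - degree_a[word]) for word in common]
    return sorted(changes, key=lambda item: item[1], reverse=True)


# Hypothetical weighted degrees for two small networks.
undergrad_degree = {"teacher": 10, "doctor": 8, "cellist": 3}
graduate_degree = {"teacher": 15, "doctor": 9, "pilot": 4}

summary = overlap_summary(undergrad_degree, graduate_degree)
changes = degree_changes(undergrad_degree, graduate_degree)
print(summary["only_a"], changes)

# Check of the reported overlap share: 47 common words out of 66.
reported_share = round(47 / 66 * 100, 2)
print(reported_share)
```

The paired differences returned by `degree_changes` are the kind of input a Wilcoxon signed-rank test would take; the test itself is omitted here to keep the sketch free of third-party dependencies.
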
Overall: With higher proficiency, the L2 lexical-semantic network becomes more connected, more locally clustered, and less centralized, supporting more efficient lexical processing.

Discussion

Findings address the research questions by showing that advanced L2 learners have networks with higher connectivity (average degree, density), stronger local clustering, slightly shorter global distances, and lower centralization, implying more efficient activation and retrieval. At the meso-level, community structures are denser and more evenly connected for advanced learners, with no isolated words, aligning with higher overall clustering. Word-level analyses reveal preferential retention and strengthening of frequent, highly connected central words, while weakly connected, less frequent words are more prone to be lost unless embedded in dense clusters. The study extends L1 developmental findings (increasing average degree and decreasing path length) to L2, but diverges in clustering coefficient: it increases with L2 proficiency (vs. decreases reported in L1), possibly due to classroom-based L2 learning emphasizing local semantic neighborhoods (synonyms, near-neighbors), whereas L1 growth may integrate words into more global structures. The meso-level results help explain macro-level shifts (reinforced small-world/scale-free characteristics through denser communities and central-word attachment). Community density likely facilitates spreading activation and narrows search space, aiding retrieval, offering a mechanistic account linking network structure to processing efficiency.

Conclusion

The study demonstrates that L2 lexical-semantic networks develop toward greater connectivity, denser local clustering, and reduced centralization as proficiency increases, with clearer and more robust community structures. Central, frequent, and well-connected words stabilize and attract new links, while weakly connected, infrequent words risk attrition unless situated in dense clusters. These patterns provide empirical support for the preferential attachment model of lexical-semantic network growth and connect meso-/word-level dynamics to macro-level small-world and scale-free properties. The work bridges macro-level L2 network descriptions with observable word behavior in fluency tasks and suggests that optimizing community structure and learning environments could bolster vocabulary proficiency. Future research should expand to multiple semantic categories, adopt longitudinal designs, and incorporate individual-level networks to capture idiosyncratic structures.

Limitations

Participant majors differ (engineering undergraduates vs English-major graduates), so specialized knowledge may influence responses despite using a familiar category; longitudinal within-subject designs could reduce this confound.
Single semantic category (occupation) limits generalizability; broader category coverage is needed.
Networks are group-aggregated, assuming shared structure; substantial individual variability likely exists. Future work should estimate and analyze individual networks alongside group networks.
