Introduction
The distribution of linguistic features across the world's 7000+ languages reflects a complex interplay of horizontal diffusion and vertical stability. Horizontal diffusion, driven by language contact, suggests that features spread through interaction between speakers of different languages. Vertical stability, or inheritance, indicates that features persist within a language family due to shared ancestry. The inherent stability of a linguistic feature varies; grammaticalized features tend to exhibit higher stability than lexical features. While both horizontal diffusion and vertical stability contribute to the observed distribution of linguistic features, most analyses focus on one process at a time. This study addresses this gap by focusing on nominal classification systems, which include grammatical gender, noun classes, and classifiers, to investigate the simultaneous impact of diffusion and stability in shaping the worldwide distribution of these features. These systems offer a useful case study due to their varying levels of grammaticalization; grammatical gender is the most grammaticalized, followed by noun classes, with classifiers being the least grammaticalized. The existing literature suggests that classifiers are more easily diffused horizontally than gender and noun classes, but gender and noun classes show greater vertical stability than classifiers. However, quantitative data on the interplay of these factors on the global distribution of nominal categorization systems have been lacking. This study aims to fill this gap and provide a deeper understanding of the forces shaping linguistic diversity.
Literature Review
Previous research has explored the stability and diffusibility of linguistic features, with a general understanding that grammaticalized features tend to be more stable than lexical ones. Studies have indicated that classifiers are more likely to be borrowed across language families, while gender and noun class systems show greater resistance to borrowing. However, existing databases on nominal classification systems are limited in size, hindering a comprehensive analysis of global distribution patterns. This study builds upon this existing literature by creating a substantially larger database, enabling a more robust investigation into the relative contributions of diffusion and inheritance to the geographical distribution of these systems.
Methodology
This research constructed a novel database of 3077 languages, annotated for the presence or absence of grammatical gender, noun classes, and classifiers. This substantially expands upon previous datasets, which were limited to 400 languages for classifiers and 257 for gender/noun classes. Data were gathered through automatic extraction from language grammars and grammar sketches, followed by manual verification using precise linguistic criteria and the Gramfinder tool. The researchers then used Delaunay neighbors and phylogenetic neighbors to assess the areal (geographic) and phylogenetic (genealogical) cohesion of each nominal classification system, that is, the degree to which languages sharing a feature are also geographically or genealogically close. High geographic cohesion suggests strong areal diffusion, while high phylogenetic cohesion suggests strong vertical inheritance. Wilcoxon rank-sum tests were used to compare the geographic and phylogenetic cohesion of the three systems. To investigate the relative roles of feature diffusion and language expansion, the researchers analyzed the distribution of language families within geographic grids, calculated the geographic coverage of language families, and examined the variance in environmental factors (elevation, distance to water, rainfall) across languages with each feature. These analyses help distinguish the spread of features through language contact from their spread through migration. Levene tests and Conover tests were used to analyze the variance of environmental factors.
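The neighbor-based cohesion measure can be illustrated with a minimal sketch. The summary does not give the exact formula, so the score below (the share of a language's Delaunay neighbors that carry the same feature) is an assumption, as are the function names `delaunay_neighbors` and `areal_cohesion`. Coordinates are treated as points on a flat plane, which is a simplification for real latitude/longitude data, and the five points and feature values are synthetic, not taken from the study's database.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.stats import ranksums


def delaunay_neighbors(points):
    """Return {index: set of indices sharing a Delaunay edge with it}."""
    tri = Delaunay(np.asarray(points, dtype=float))
    indptr, indices = tri.vertex_neighbor_vertices
    return {
        i: set(indices[indptr[i]:indptr[i + 1]].tolist())
        for i in range(len(points))
    }


def areal_cohesion(points, has_feature):
    """Assumed cohesion score: for each language that has the feature,
    the share of its Delaunay neighbors that also have it."""
    neighbors = delaunay_neighbors(points)
    scores = []
    for i, nbrs in neighbors.items():
        if has_feature[i] and nbrs:
            shared = sum(bool(has_feature[j]) for j in nbrs)
            scores.append(shared / len(nbrs))
    return scores


# Synthetic toy layout: four corners of a square plus its center.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
feature_a = [True, True, False, False, True]    # spatially clustered
feature_b = [True, False, True, False, False]   # spatially scattered

scores_a = areal_cohesion(points, feature_a)
scores_b = areal_cohesion(points, feature_b)

# Compare the two score distributions with a Wilcoxon rank-sum test.
statistic, p_value = ranksums(scores_a, scores_b)
```

With only five points the test result is not meaningful; the sketch only shows how per-language cohesion scores for two features could be fed into a rank-sum comparison. A phylogenetic analogue would replace the Delaunay neighbor sets with neighbors in a family tree.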
Key Findings
The study found that classifiers are more frequent globally than gender or noun classes. Contrary to expectations, however, the frequencies of classifiers and gender were relatively similar, and both were significantly higher than the frequency of noun classes. Areal analysis revealed that gender and noun classes exhibited significantly higher geographic cohesion than classifiers. Phylogenetic analysis showed that gender and noun classes also had significantly stronger phylogenetic cohesion than classifiers, indicating greater vertical inheritance. A comparison of language family density within geographic grids showed that classifier languages have significantly higher family density than gender or noun class languages, which indicates that classifiers are more likely to be present across different language families within the same geographic area. Conversely, examination of the geographic coverage of language families revealed that families with high geographic spread (e.g. Indo-European, Afro-Asiatic) frequently feature gender or noun class systems. Finally, analysis of environmental factors showed that classifiers had the largest variance in environmental conditions, suggesting that their spread was less constrained by the environmental factors that often influence migration and thus language spread. Gender and noun classes showed progressively smaller variance, consistent with the hypothesis that their spread was driven more by language expansion, which is constrained by the environmental factors that affect human migration.
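The family density comparison can be sketched as a count of distinct language families per grid cell. The summary does not state the grid resolution or the exact density definition, so the 10-degree cell size, the function name `family_density`, and the placeholder family labels "A", "B", and "C" below are all assumptions for illustration.

```python
import math
from collections import defaultdict


def family_density(languages, cell_size=10.0):
    """Count distinct families per grid cell.

    languages: iterable of (latitude, longitude, family) tuples.
    cell_size: cell width in degrees (an assumed value).
    Returns {(row, col): number of distinct families in that cell}.
    """
    cells = defaultdict(set)
    for lat, lon, family in languages:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[cell].add(family)
    return {cell: len(families) for cell, families in cells.items()}


# Synthetic records for languages that share one feature.
toy_languages = [
    (1.0, 1.0, "A"),
    (2.0, 3.0, "B"),
    (5.0, 9.0, "A"),
    (15.0, 1.0, "C"),
    (-1.0, -1.0, "A"),
]
density = family_density(toy_languages)
```

Under this reading, a feature whose cells tend to contain several families is shared across unrelated neighbors, which points to contact, while cells dominated by a single family are more consistent with expansion of that family.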
Discussion
The findings support the authors' hypothesis that the geographic distribution of nominal classification systems is shaped by two distinct mechanisms: feature diffusion and language expansion. Grammatical gender and noun classes, being more grammaticalized, appear to spread primarily through language expansion, which is associated with migrations and the dominance of specific language families in particular regions. Classifiers, being less grammaticalized and more lexical, appear to spread primarily through feature diffusion via language contact. These results highlight the importance of differentiating between these two mechanisms when analyzing the distribution of linguistic features. The observed patterns deviate from the existing literature primarily because the substantially larger dataset used in this study reveals subtle but significant differences in the geographic distribution of these features.
Conclusion
This study demonstrates that language expansion and feature diffusion are distinct but equally important mechanisms shaping the global distribution of nominal classification systems. The findings challenge the conventional assumption that only less grammaticalized features diffuse readily, showing that grammaticalization interacts with the way features spread. Future research should replicate this analysis on features from other linguistic domains (phonology, syntax, semantics) to test its generality, and should incorporate language expansion into evolutionary models of language change to provide a more nuanced and accurate view of linguistic diversity.
Limitations
The study's analysis does not explicitly account for all potential confounding factors, such as universal preferences, which may influence the spread of linguistic features independently of language expansion and feature diffusion. While the database is significantly larger than existing ones, there still remains the potential for sampling bias that could subtly affect the results. Future research might address this by exploring more sophisticated statistical models to mitigate the potential impact of such biases.