Gender politics and Victorian literary representation of the body: a distant reading of the body in Charles Dickens's works
H. Chen and Q. Xu
This summary covers research by Houliang Chen and Qianwen Xu on nineteenth-century discourse about the human body in Charles Dickens's works. Using corpus linguistics, the study compares the body-related vocabulary of Dickens and other Victorian authors, highlights gender disparities in agency in Dickens's narratives, and shows how time, space, and etiquette shape the portrayal of bodies.
Introduction
The article examines how the Victorian era’s preoccupation with the body intersected with etiquette, class, and especially gender, during a period of hyperbolic gender difference. It argues that the body, conceptualized through its perceptible parts (e.g., hands, feet, eyes), became central to Victorian subjectivity and gender antagonisms. To complement traditional close reading, the study adopts distant reading and computational methods aimed at explaining general structures across large corpora. Building on Moretti’s distant reading and Unsworth’s call for methodological interactivity, the study leverages vocabulary-based analysis to uncover ideological differences encoded in lexical choices. The research focuses on two corpora: a Dickens corpus and a contemporaneous nineteenth-century reference corpus. Stage 1 compares nouns related to body parts across the corpora to identify similarities, differences, and implications. Stage 2 performs a gender-based analysis of verbs to examine disparities in agency between male- and female-associated body parts in Dickens’s works and in the reference corpus.
Literature Review
The study situates itself within scholarship on Victorian gender ideology and embodiment (e.g., Tosh on polarized gender roles; Michie on hyperbolic gender difference; Lewis on the cultural body). It notes the fragmentary study of body parts linked to anatomical discourse (Mann and Gavin; Cregan). Methodologically it follows Moretti’s distant reading, Unsworth’s criteria for computational humanities, and Fairclough’s view that ideology is coded in vocabulary. It draws on Sinclair’s concept of semantic prosody (via Louw) to capture attitudinal and pragmatic meanings in lexical patterns, and on Goss’s insights into bodies and sentiment. It also references scholarship on Victorian etiquette and gender performance (e.g., Langland; Aster; Paterson; Beaujot) in discussing items like hats and pockets.
Methodology
Corpora and preprocessing: The observed corpus comprises 33 Dickens works (novels, novellas, short stories, travel writing). The reference corpus contains 356 contemporaneous nineteenth-century works (1800–1899), excluding Dickens, sourced from the Oxford Text Archive; texts that could not be downloaded or were not available as plain TXT were excluded. Standard preprocessing included data collection, text standardization, tokenization, stopword removal, stemming/lemmatization, noise removal, annotation/NER, and bag-of-words construction.
Tools: AntConc for concordance/KWIC and clustering; Python (NLTK, pandas) for tokenization, POS-based extraction, and frequency analysis; Gephi for network visualization.
Sub-corpora by body part: Using AntConc on the Dickens corpus, the ten most frequent body nouns were identified: hand (9003), eye (6007), head (5709), arm (2849), foot (1781), shoulder (959), ear (940), finger (765), breast (739), mouth (685). A context window of five words on each side was adopted to capture collocations. Ten sub-corpora were built for these body parts in both the Dickens and reference corpora, and each sub-corpus was then split into nouns and verbs via POS tagging (NLTK).
Noun analysis and clustering: For each corpus's ten noun sub-corpora, TF–IDF vectorization was applied, followed by K-means clustering with K from 2 to 9. Performance was evaluated via the silhouette score and the Calinski–Harabasz index. K = 7 was selected for both corpora: the Calinski–Harabasz index was relatively higher around K ≈ 2, while the silhouette score declined from K = 7 to 9, indicating over-partitioning beyond 7.
Network visualization: Gephi was used to visualize cluster internals and inter-cluster relations. Modularity resolution was set to 1.2, yielding 7 communities; the top 30 nodes by Weighted Degree were retained; layouts used Force Atlas and then Fruchterman–Reingold.
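As an illustration of the sub-corpus step, the sketch below collects the words that fall within five tokens on either side of a body-part noun. It uses only the Python standard library rather than AntConc or NLTK; the regex tokenizer, the function names (`tokenize`, `context_windows`, `collocates`), and the sample sentence are assumptions made for this example and are not taken from the study.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def context_windows(tokens, targets, span=5):
    """Yield (index, left, right) for every token that is one of `targets`.

    `left` and `right` hold up to `span` tokens before and after the hit.
    """
    for i, tok in enumerate(tokens):
        if tok in targets:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            yield i, left, right


def collocates(tokens, targets, span=5):
    """Count every word that appears inside a context window of a target."""
    counts = Counter()
    for _, left, right in context_windows(tokens, targets, span):
        counts.update(left)
        counts.update(right)
    return counts


# Invented sample sentence, used only to exercise the functions.
sample = "He put his hand in his pocket and took her hand with a smile."
tokens = tokenize(sample)
hand_collocates = collocates(tokens, {"hand", "hands"}, span=5)
print(hand_collocates.most_common(4))
```

In a fuller pipeline the collected window words would be POS-tagged and split into noun and verb sets before any frequency or clustering step; that part is omitted here.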
Verb and gender analysis: To assess agency and gendered action, both corpora were split by the possessive pronouns his and her; all verbs within a 10-word context window were extracted and tagged (NLTK). Twenty verb datasets (male/female × 10 body parts) were created for Dickens and replicated for the reference corpus. Frequencies and normalized rates were computed with pandas, and the top 10 verbs were compared for hand, head, and eye (the three most frequent body parts).
Pronoun proportions: Counts of his, her, and other pronouns were tallied to compute male-to-female ratios for each body part in Dickens and in the nineteenth-century reference corpus.
Additional queries: AntConc was used for specific pattern searches and clusters (e.g., pocket(s) 4-grams; phrases such as toss head; collocation likelihoods such as kiss with hand).
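The pronoun-based split can be sketched roughly as follows. This is a simplified stand-in, not the study's pipeline: a small hand-picked `VERBS` set replaces NLTK POS tagging, the "window of 10" is assumed to mean ten tokens on each side of the body-part noun, and the function names and sample text are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for POS tagging: a tiny hand-picked verb list.
VERBS = {"put", "take", "took", "hold", "held", "kiss", "kissed",
         "fold", "folded", "clasp", "clasped", "press", "pressed"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def gendered_verb_counts(tokens, body_part, span=10):
    """Count verbs near '<his|her> <body_part>', keyed by the pronoun."""
    counts = {"his": Counter(), "her": Counter()}
    for i in range(1, len(tokens)):
        pronoun = tokens[i - 1]
        if tokens[i] == body_part and pronoun in counts:
            # Up to `span` tokens on each side, excluding the noun itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts[pronoun].update(w for w in window if w in VERBS)
    return counts


def normalized(counter):
    """Convert raw counts to relative frequencies that sum to 1."""
    total = sum(counter.values())
    return {verb: n / total for verb, n in counter.items()} if total else {}


# Invented sample text, used only to exercise the functions.
sample = ("She folded her hand upon the table in the quiet little room by the "
          "river and he kissed his hand to the lady and put on his hat")
counts = gendered_verb_counts(tokenize(sample), "hand")
print(counts["her"], normalized(counts["his"]))
```

Because the window is wide, verbs from neighbouring clauses can be attributed to the wrong pronoun in dense passages, which is one reason window size appears among the parameter choices that may affect outcomes.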
Key Findings
- High-frequency body parts in Dickens: hand (9003), eye (6007), head (5709), arm (2849), foot (1781), shoulder (959), ear (940), finger (765), breast (739), mouth (685).
- Gendered frequency skew in Dickens for major parts: hands 4907 (male) vs 2181 (female) ≈ 69:31; heads 3260 (male) vs 1467 (female) ≈ 69:31; eyes 2769 (male) vs 1512 (female) ≈ 65:35, indicating more male-focused bodily description.
- Noun networks (top 30 nodes): 22 nodes are shared by Dickens and the reference corpus: hand, head, eye, arm, foot, shoulder, ear, finger, breast, mouth, face, eyes, right, round, hands, hair, time, side, tears, left, ground, neck. The shared non-body nodes (right, left, round, time, side, tears) show a broad Victorian tendency to situate bodies in time and space and to emphasize emotion (tears).
- Unique nodes: in Dickens, chair, gentleman, door, pockets, child, hat, fire, lady; in the reference corpus, heart, moment, feet, lips, way, back, voice, water. About 73% of Dickens's top noun nodes overlap with the reference corpus, but Dickens features more social-role and domestic-object terms, linking bodies to etiquette, household objects, and attire.
- Pockets and gender: Co-occurrence with possessive pronouns in Dickens is his 664 vs her 74 (roughly 9:1), evidencing gendered access to property and objects. Clustered 4-grams around pocket(s) show varied male pocket types (waistcoat, coat, breast, side) and contents (keys, money, papers, letters).
- Hand verbs: Across corpora, the most frequent are put, say, lay, take, hold. Dickens, unlike many contemporaries, also favors give, clasp, look, press for both genders. Male hands collocate with kiss in both Dickens and the reference corpus; in Dickens, kiss appears 845 times overall, with hand ranking third among its collocates (likelihood ≈ 296.713), reflecting male hand etiquette. Fold hands occurs more often for females (42) than for males (34) in Dickens; distribution varies by work (e.g., David Copperfield emphasizes Betsey Trotwood's folded hands).
- Head verbs: shake, say, turn, look, raise, put appear for both genders and in both corpora, suggesting that the head is less gender-differentiated in action. Dickens uses toss and bend specifically for female heads; AntConc finds toss head 67 times, 54 of them involving female characters (e.g., in Nicholas Nickleby), signaling stylized feminine attitudes (defiance, pride, disdain).
- Eye verbs: Common Victorian verbs include say, raise, look, fix, turn, see, have. In Dickens, male eyes more often pair with open/close/keep, while female eyes more often pair with dry/wipe, reinforcing a stereotype of greater female emotional expressiveness. For dry eyes, women:men ≈ 43:13. Dickens also frequently uses cast with both male and female eyes, indicating an emphasis on eye direction in dialogue and social interaction.
- Overall, nineteenth-century authors (including Dickens) rely on time/space nouns in bodily depictions and foreground etiquette. Dickens, however, more explicitly encodes gender politics via social roles (gentleman/lady), domestic objects, and attire (pockets, hats), and via gender-skewed verb associations.
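The proportions quoted in the findings can be re-derived from the reported counts. The short check below uses only the figures given above; the helper name `share` and the variable names are assumptions for this example.

```python
def share(male, female):
    """Return the (male, female) split as whole-number percentages."""
    total = male + female
    male_pct = round(100 * male / total)
    return male_pct, 100 - male_pct


# Male vs female counts reported for the three most frequent body parts.
reported = {"hand": (4907, 2181), "head": (3260, 1467), "eye": (2769, 1512)}
shares = {part: share(m, f) for part, (m, f) in reported.items()}

# his vs her co-occurrence with pocket(s): 664 / 74 is about 8.97,
# i.e. roughly nine to one.
pocket_ratio = 664 / 74

# 22 of the top 30 noun nodes are shared between the two corpora.
overlap = 22 / 30

print(shares, round(pocket_ratio, 2), round(100 * overlap))
```

The same helper applies to any other his/her pair of counts, such as the 43:13 split reported for dry eyes.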
Discussion
The findings demonstrate that vocabulary surrounding body parts encodes Victorian ideologies of gender, space, and etiquette. The overlap between Dickens and his contemporaries in time/space nouns and etiquette-related actions supports the view that Victorian fiction deploys the body to situate scenes and social norms. Dickens’s distinctive emphasis on social titles (gentleman, lady), domestic objects (chair, door), and sartorial elements (pockets, hats) shows how bodily representation is mobilized to perform and police gendered respectability and class-coded behaviors. Verb patterns reveal differential agency: men’s bodies, especially hands, perform socially coded gestures (e.g., hand-kissing, accessing items from pockets), while women’s bodies are more often associated with emotional displays (drying/wiping eyes) or stylized gestures (tossing the head). Head actions appear comparatively ungendered, suggesting some bodily domains are less leveraged for gender differentiation. Collectively, these patterns address the research aim by quantifying and visualizing how Dickens’s lexical choices both reflect broader Victorian conventions and uniquely intensify gender politics through object- and etiquette-centered bodily depiction.
Conclusion
The study integrates distant reading with lexical analysis to compare Dickens’s bodily representations with those of his contemporaries. It shows broad Victorian commonalities (time/space nouns; etiquette-laden actions; male hand-kissing) while highlighting Dickens’s distinctive gendered encoding through social roles, domestic objects, and attire, and through gender-skewed verb collocations for key body parts (hands, eyes). These results contribute a multi-dimensional, corpus-driven perspective that complements close reading and underscores how language constructs gendered embodiments in Victorian fiction. Future research could broaden temporal comparisons by using the entire nineteenth century as the observed corpus and the eighteenth or twentieth century as reference, to trace continuities and shifts in the literary construction of the body and gender.
Limitations
- Reference corpus constraints: Only texts available as downloadable plain TXT from the Oxford Text Archive (1800–1899) were included; unobtainable items were discarded.
- Representativeness and genre mix: Both observed and reference corpora include mixed genres (novels, novellas, short stories, travel writing) to maintain comparability, but this heterogeneity may influence lexical distributions.
- Preprocessing choices (tokenization, stopword removal, stemming/lemmatization, POS tagging) and parameter settings (context windows, clustering K=7, Gephi modularity resolution, top-30 node cutoff) may affect outcomes.
- Pronoun-based gender attribution (his/her) operationalizes gender for verb analysis but cannot capture all nuances of character gender identity or indirect references.
- Some metrics/visuals (e.g., silhouette/Calinski–Harabasz behavior) indicate potential sensitivity to clustering granularity; network visualizations reflect top-weighted nodes only.

