Modeling narrative features in TV series: coding and clustering analysis
M. Rocchi and G. Pescatore
The paper advances a data-driven examination of systemic aspects of television series, conceptualizing them as narrative ecosystems whose evolution reflects internal narrative trends and constraints as well as external production, distribution, audience, regulatory and social factors. Focusing on US medical dramas, one of the most popular and durable TV genres, the study analyzes eight series (Grey's Anatomy, Miami Medical, The Night Shift, Chicago Med, Code Black, The Good Doctor, The Resident, New Amsterdam) over 32 seasons and 608 episodes, using a methodology that treats entire series as datasets to explore evolutionary dynamics and patterns in narrative structure. Building on prior interpretive work that posits three isotopies (plots) in medical dramas, namely medical cases (anthology), professional, and sentimental (running), the paper quantitatively investigates these dimensions. Research questions:
- RQ1: Are isotopies (i.e., the medical cases plot, the professional plot, and the sentimental plot) good descriptors for the medical drama genre?
- RQ2: Are there any differences within the formulaic aspects of these series? Are there any significant differences considering the relationship between the three isotopies within different series?
- RQ3: How do the narrative plots of a medical drama change over time?
The study aims to validate the isotopy framework as quantitative descriptors, compare formulaic differences across series, and assess temporal evolution of plot balances.
The work draws on the narrative ecosystems paradigm (Innocenti and Pescatore 2012, 2018; Pescatore et al., 2014; Rocchi and Pescatore, 2019) and semiotic notions of isotopy (Greimas and Courtés, Eco). It positions medical drama within genre/formula studies (Schatz, Jovanović; Albuquerque & Meimaridis) and leverages established content analysis traditions in media research (e.g., Fernández-Collado et al.; Signorielli & Bacue; Barker et al.; Chapoton et al.). It also references scriptwriting conventions (Campbell; Snyder; Vogler) as part of broader self-regulatory mechanisms of serial production. Prior medical drama scholarship and audience/health perception studies provide context for the genre’s cultural significance.
Corpus and unitization: The sample comprises eight US medical dramas across 32 seasons and 608 episodes. Episodes were manually segmented into units (segments) defined by spatio-temporal-action continuity and thematic-narrative invariance. The protocol employed ELAN for annotation.
Isotopy definitions: Three plots (isotopies) were operationalized: the medical cases plot (doctor–patient interactions, case stories constituting the anthology dimension), the professional plot (workplace relationships: hierarchy, competition, ethics), and the sentimental plot (intimate/emotional relationships among main characters: couples, friendship, family, conflicts).
Coding and weighting: Each segment received one or more isotopy labels with a weight from 1–6, proportionally allocating segment time across overlapping isotopies (e.g., a 66 s segment with SP weight 4 and PP weight 2 yields 44 s SP and 22 s PP). Unattributable content was marked as uncoded (e.g., landscape shots, titles). This produced time series of narrative biomass (time share) per isotopy for episodes, seasons, and series.
Reliability: Coders received training and a detailed coding guide; all coding was supervised and 15% of episodes were re-coded by the supervisor after two years to assess intra-coder reliability using the Intraclass Correlation Coefficient (ICC). Episode-level ICCs exceeded 0.80: sentimental plot 0.97; professional plot 0.81; medical cases plot 0.92. The time intensity of manual coding was acknowledged as a constraint.
Clustering framework: To test RQ1 and RQ2, clustering analyses were performed. For RQ1 (season-level), each season was represented by a 4D vector: median of SP, PP, MC, and uncoded percentages across its episodes. Clustering tendency was assessed via the Hopkins statistic (H = 0.634), rejecting spatial randomness. Multiple methods to estimate cluster number gave differing suggestions (Elbow = 4, Silhouette = 3, Gap = 2).
Hierarchical clustering (DIANA) was selected based on internal validity and stability metrics; cluster stability was evaluated using clusterboot with 100 bootstraps. For RQ2 (series-level typical episodes), each series was represented by a 4D vector (median percentages of SP, PP, MC, uncoded across all episodes). Clustering tendency again supported structure (Hopkins H = 0.535). Between-series differences in isotopy distributions at the episode level were tested using Kruskal–Wallis for PP, SP, and MC (all p < 0.0001). For RQ3, season-wise time series of isotopy shares were examined to track temporal evolution and potential inversions among dominant plots.
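The weighting and aggregation steps described above can be illustrated with a minimal sketch. It assumes that a segment's time is split in proportion to each label's weight over the sum of the weights in that segment, which reproduces the 66 s example but is not confirmed as the authors' exact procedure. The function names, the representation of uncoded content as a label with weight 1, and the toy episode are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import median

# Order follows the 4D vector described in the text: SP, PP, MC, uncoded.
LABELS = ("SP", "PP", "MC", "uncoded")


def allocate_segment_time(duration_s, weights):
    """Split a segment's duration across labels in proportion to their weights.

    Assumption: normalisation by the sum of the weights within the segment.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("a segment needs at least one positive weight")
    return {label: duration_s * w / total for label, w in weights.items()}


def episode_shares(segments):
    """Return each label's percentage of the episode's total coded time.

    `segments` is a list of (duration_s, weights) pairs.
    """
    seconds = defaultdict(float)
    for duration_s, weights in segments:
        for label, secs in allocate_segment_time(duration_s, weights).items():
            seconds[label] += secs
    total = sum(seconds.values())
    return {label: 100.0 * seconds[label] / total for label in LABELS}


def season_vector(episodes):
    """Median percentage per label across a season's episodes (4D vector)."""
    return tuple(median(ep[label] for ep in episodes) for label in LABELS)


# The worked example from the text: 66 s, SP weight 4, PP weight 2.
print(allocate_segment_time(66, {"SP": 4, "PP": 2}))  # {'SP': 44.0, 'PP': 22.0}

# Hypothetical three-segment episode (not data from the study).
toy_episode = episode_shares([
    (66, {"SP": 4, "PP": 2}),
    (120, {"MC": 6}),
    (14, {"uncoded": 1}),
])
print(toy_episode)
```

Applying `season_vector` to the list of per-episode dictionaries for one season gives the kind of 4D point that the season-level clustering takes as input; the series-level vector would be built the same way over all episodes of a series.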
RQ1: The three isotopies are good quantitative descriptors of medical dramas. Season-level hierarchical clustering produced four stable clusters (clusterboot stability values near 1). Seasons tended to cluster by series. Example stabilities: Grey's Anatomy (GA) cluster stability = 0.978 (with first two GA seasons clustering with The Resident [TR], cluster stability = 0.925); a mixed cluster containing Chicago Med (CM), The Night Shift (TNS), The Good Doctor (TGD), and New Amsterdam (NA) showed stability = 0.992; Code Black (CB) and Miami Medical (MM) formed a stable cluster (0.955). RQ2: Significant between-series differences exist in PP, SP, and MC (Kruskal–Wallis: PP χ²(7)=132.95, p<0.0001; SP χ²(7)=272.56, p<0.0001; MC χ²(7)=348.87, p<0.0001). Typical episode (series-level medians) percentages:
- Grey's Anatomy (GA): PP 18.77%, SP 48.07%, MC 30.30%, uncoded 1.89%
- Miami Medical (MM): PP 8.57%, SP 20.30%, MC 67.70%, uncoded 3.43%
- The Night Shift (TNS): PP 5.91%, SP 35.89%, MC 55.70%, uncoded 1.82%
- Code Black (CB): PP 11.16%, SP 19.42%, MC 62.64%, uncoded 3.75%
- New Amsterdam (NA): PP 15.90%, SP 29.63%, MC 53.13%, uncoded 1.35%
- The Good Doctor (TGD): PP 9.29%, SP 35.00%, MC 54.48%, uncoded 1.23%
- The Resident (TR): PP 20.18%, SP 25.56%, MC 51.18%, uncoded 2.59%
- Chicago Med (CM): PP 9.42%, SP 26.21%, MC 64.64%, uncoded 0.75%
Four narrative profiles emerged:
- Soap formula (GA): dominant sentimental plot (~48%).
- Anthology formula (MM and CB): high medical cases emphasis (MM ~68–70%; CB ~63–67%) and slightly higher uncoded content (~3%).
- Doctors and patients formula (CM, TNS, TGD): balanced patient/doctor stories; lower PP (about 6–9%).
- Social formula (NA, TR): elevated professional plot (NA ~16%; TR ~19%) emphasizing ethical/social dimensions.
RQ3: Temporal evolution shows strong formulaic stability. Across 32 seasons, only two inversions between isotopies were recorded. GA shifted to a sentimental-dominant profile from season 3 (possibly coinciding with a scheduling change). TGD shows a potential soapward trend by season 3. TR and NA display higher professional emphasis (with TR season 1 particularly distinct). Overall, the prevailing order outside GA is MC > SP > PP, with PP consistently the lowest share.
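The prevailing MC > SP > PP order can be checked against the series-level medians reported above. The sketch below ranks the three isotopies for each series' typical episode; the percentages are the ones listed in the text, while the helper name `plot_order` is hypothetical. It works on series-level medians only, so it does not reproduce the season-by-season inversion analysis.

```python
# Series-level median shares (percent) as reported in the text; uncoded omitted.
TYPICAL_EPISODE = {
    "GA":  {"PP": 18.77, "SP": 48.07, "MC": 30.30},
    "MM":  {"PP": 8.57,  "SP": 20.30, "MC": 67.70},
    "TNS": {"PP": 5.91,  "SP": 35.89, "MC": 55.70},
    "CB":  {"PP": 11.16, "SP": 19.42, "MC": 62.64},
    "NA":  {"PP": 15.90, "SP": 29.63, "MC": 53.13},
    "TGD": {"PP": 9.29,  "SP": 35.00, "MC": 54.48},
    "TR":  {"PP": 20.18, "SP": 25.56, "MC": 51.18},
    "CM":  {"PP": 9.42,  "SP": 26.21, "MC": 64.64},
}


def plot_order(shares):
    """Rank isotopies from largest to smallest share, e.g. 'MC > SP > PP'."""
    return " > ".join(sorted(shares, key=shares.get, reverse=True))


for series, shares in TYPICAL_EPISODE.items():
    print(f"{series}: {plot_order(shares)}")

# Series whose typical episode departs from the prevailing order.
exceptions = [s for s, sh in TYPICAL_EPISODE.items()
              if plot_order(sh) != "MC > SP > PP"]
print(exceptions)  # ['GA']
```

On these medians only Grey's Anatomy departs from the prevailing order (SP > MC > PP), and PP is the smallest share in every series.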
The validation of the three isotopies as discriminative descriptors confirms their suitability for modeling medical drama narratives (addressing RQ1). Clusters aligned with series identities and production logics, indicating that aggregate narrative biomass reliably captures formulaic signatures. The presence of four stable narrative profiles (RQ2) clarifies how products differentiate within a common genre framework—e.g., GA’s soap orientation versus MM/CB’s anthology emphasis, and NA/TR’s social-professional tilt. The dynamic analysis (RQ3) reveals a high degree of stability and self-regulation in narrative variables: despite ongoing introduction of new cases and characters, the relative shares of isotopies remain stable over long runs. This suggests that beyond local creative choices, broader systemic constraints (episode length, network conventions, production economics, audience expectations) regulate narrative formulas. The findings support an ecosystemic approach wherein narrative variables are shaped by internal and external forces, and they highlight opportunities to use isotopy balances to compare series and anticipate trajectories.
The study introduces and validates a quantitative, data-driven framework for modeling TV series narratives via three isotopies—sentimental, professional, and medical cases—demonstrating their effectiveness in distinguishing series and identifying four robust narrative profiles within US medical dramas. It shows that, contrary to assumptions of fluid creative reconfiguration, the overall balance of narrative plots is highly stable over time, consistent with self-regulatory mechanisms in narrative ecosystems. Practical implications include improved understanding of product positioning for producers and potential inputs for content-based recommendation systems for viewers. Future research includes: expanding and diversifying the corpus (e.g., long-running ER, House; additional genres); generalizing isotopies to a cross-genre schema (soap plot, genre plot, anthology plot); conducting finer-grained dynamic analyses (e.g., sequence analysis) and integrating exogenous variables (production, distribution, regulation, audience metrics) into systemic models linking narrative features with industrial and contextual factors.
Manual segmentation and coding are time-consuming and dependent on coder expertise; while training, supervision, and intra-coder ICCs (SP 0.97; PP 0.81; MC 0.92) support consistency, coder misunderstandings can systematically bias isotopy balances. Automated segmentation tools were not available. The study aggregates at the season level to mitigate high episode-level variability, potentially obscuring fine-grained fluctuations. Optimal cluster number estimates were inconsistent across methods; hierarchical clustering was chosen based on internal/stability measures. The sample is limited to eight US medical dramas over 32 seasons; some comparative tests (e.g., post-hoc pairwise differences after Kruskal–Wallis) were beyond scope. Findings for younger series (e.g., The Good Doctor) are provisional pending additional seasons.

