Understanding the onset of hot streaks across artistic, cultural, and scientific careers

L. Liu, N. Dehmamy, et al.

This research, conducted by Lu Liu, Nima Dehmamy, Jillian Chown, C. Lee Giles, and Dashun Wang, examines 'hot streaks' in creative careers and shows that exploration followed by exploitation is associated with periods of high impact across artistic, cultural, and scientific fields.

Introduction

The study investigates whether identifiable behavioral regularities precede the onset of hot streaks in creative careers. Building on literature that contrasts exploration (diversifying into new areas with higher variance in outcomes) and exploitation (focusing and deepening within established areas), the authors pose the central question: Are hot streaks reflective of exploration, exploitation, or a specific combination or sequence of the two? Given the unpredictability of when hot streaks occur within careers and the complexity of creative life cycles, the work seeks to quantitatively link creative strategies to the timing of hot streaks across art, film, and science.

Literature Review

Prior research documents hot streaks in creative careers and examines drivers of scientific impact, team dynamics, and career trajectories. The exploration–exploitation trade-off is central across organizational learning, sociology of science, and innovation studies: exploitation builds expertise and reputation but risks stifling originality, while exploration increases variance and can yield breakthroughs through novel combinations. Studies have shown increasing topic switching in science, the role of small versus large teams in disruption versus development, and pressures and institutional incentives shaping specialization. However, past work typically treats exploration and exploitation in isolation or in combination, and rarely as a sequential process at the career level. This gap motivates testing how the two strategies relate in time to hot streak onset.

Methodology
  • Data: Three domains spanning long creative careers.
    • Artists: >800,000 images covering 2,128 artists from museum and gallery collections.
    • Film directors: 79,000 films by 4,337 directors from IMDb, with plot descriptions and cast information.
    • Scientists: 20,040 scientists combining Web of Science and Google Scholar records; citations measured as 10-year citations.
  • Representation learning:
    • Artworks: Transfer learning with a pre-trained VGGNet connected to fully connected layers fine-tuned on art style labels. Construct a 200-dimensional embedding per artwork by combining outputs from selected convolutional and fully connected layers, followed by PCA for dimensionality reduction.
    • Films: Create a 200-dimensional vector per film by concatenating a 100-dimensional word embedding of plot text and a 100-dimensional DeepWalk node embedding derived from a weighted co-casting network of actors.
    • Scientists: Build a weighted co-citing network among an individual’s papers (links if sharing references; weights by shared references). Identify research topics via community detection. Also validate with node embeddings on the co-citing network (results robust).
  • Impact and hot streak identification: Fit a hot-streak model where work impact is drawn from two normal distributions representing typical performance (I0) and hot-streak performance (IH). Impact proxies for artists, directors, and scientists, respectively: artwork auction price, IMDb ratings, and 10-year citations.
  • Exploration–exploitation quantification: Compute style/topic entropy H = −Σ p_i log p_i, where p_i is the share of works in style/topic i; normalize by log n (n = number of works in the window) to obtain rescaled H in [0,1]. Low H indicates exploitation (focus), high H indicates exploration (diversity). Entropy is computed in sliding windows (e.g., six artworks, five films, or five papers; two-year windows for some analyses).
  • Null model: For each career, randomize the onset timing of the hot streak and recompute metrics over 500–1000 realizations to obtain expected distributions P(H) and baselines for z-scores and probabilities.
  • Tests and analyses:
    • Compare average entropy before and during hot streaks to randomized baselines using z-scores.
    • Align careers by hot streak onset to examine temporal dynamics of H versus null.
    • Kolmogorov–Smirnov tests comparing cumulative distributions of H before versus during hot streaks and against null.
    • Identify exploration and exploitation episodes (H above or below individual average) and measure the probability that hot streak onset coincides with exploration-only, exploitation-only, or transitions between states.
    • Team dynamics in science: Analyze team size around hot streak onset and compute R(team size) by comparing real to randomized careers; assess new collaborator influx.
    • Topic selection after exploration: Examine properties (recency, citation impact, popularity) of explored topics and predict which is later exploited using features from the exploration phase (accuracy 0.89, AUC 0.83).
  • Robustness: Stratify by hot streak timing, impact level, field; control individual fixed effects; regression controlling for impact, career stage, and individual characteristics; alternative community detection (e.g., Infomap), time aggregation schemes, and diversity measures (Simpson diversity, number of styles/topics, fraction in top style/topic, topic switching probability).
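
The entropy measure and the randomized-onset null model described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default window of five works, and the choice to measure the window immediately before onset are all hypothetical, and style/topic labels are assumed to be already assigned to each work.

```python
import math
import random
import statistics
from collections import Counter


def rescaled_entropy(labels):
    """H = -sum(p_i * log p_i) over style/topic shares, divided by log n (n = works in window)."""
    n = len(labels)
    if n < 2:
        return 0.0  # one work has no diversity; also avoids dividing by log(1) = 0
    counts = Counter(labels)
    h = sum(-(c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(n)


def sliding_entropy(labels, window=5):
    """Rescaled entropy for every run of `window` consecutive works."""
    return [rescaled_entropy(labels[i:i + window])
            for i in range(len(labels) - window + 1)]


def pre_onset_zscore(labels, onset, window=5, realizations=500, seed=0):
    """z-score of H in the window just before `onset` versus randomly placed onsets.

    Assumes onset >= window so that a full window precedes the onset.
    """
    rng = random.Random(seed)
    observed = rescaled_entropy(labels[onset - window:onset])
    null = []
    for _ in range(realizations):
        # Hypothetical null: any onset position that leaves a full window before it.
        fake_onset = rng.randint(window, len(labels))
        null.append(rescaled_entropy(labels[fake_onset - window:fake_onset]))
    spread = statistics.pstdev(null)
    return (observed - statistics.mean(null)) / spread if spread > 0 else 0.0
```

For a toy career such as list("abcdeeeeee") with onset=5, the five works before onset all differ in style, so the pre-onset window has the highest possible rescaled entropy and the z-score comes out positive. A real analysis would pool z-scores over many careers rather than read one career in isolation.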
Key Findings
  • Entropy patterns relative to hot streaks:
    • Before hot streaks, average entropy is higher than expected, indicating increased exploration (z-scores: artists 4.24, directors 2.94, scientists 13.90).
    • During hot streaks, average entropy is lower than expected, indicating increased exploitation/focus (z-scores: artists −2.42, directors −8.54, scientists −22.71).
    • Temporal alignment shows H elevated before onset and dropping sharply below expectation during the hot streak.
    • Cumulative distributions confirm H during hot streaks is significantly smaller than before (KS p-values: artists 3.7e-6, directors 1.5e-5, scientists 1.1e-64), with no such pattern in randomized careers (p-values 0.23, 0.77, 0.06, respectively).
  • Sequence effects on hot streak probability:
    • Exploitation alone (not preceded by exploration) coincides with hot streak onset less often than expected across all domains.
    • Exploration alone (not followed by exploitation) also coincides with hot streak onset less often than expected.
    • Exploration followed by exploitation is the only configuration with a positive lift over baseline, increasing hot streak initiation probability by 20.5% (artists), 13.8% (directors), and 19.2% (scientists).
  • Team dynamics in science:
    • Relative to the null model, scientists tend to work in smaller teams before hot streak onset and in larger teams during hot streaks: R(team size) < 1 before and > 1 during for larger team sizes. There is also an increase in new collaborators at hot streak onset.
  • Topic choice after exploration (science):
    • The topic eventually exploited is less likely to be the most recent, highest cited, or most popular among those explored prior to the hot streak.
    • A prediction model using features from the exploration phase achieves accuracy 0.89 and AUC 0.83 in predicting which topic will be exploited.
  • Post–hot streak behavior:
    • After hot streaks end, average entropy becomes statistically indistinguishable from randomized baselines (z-scores between −1 and 1), indicating a return to typical diversity/focus levels.
  • Robustness: Results are consistent across domains, career stages, impact levels, fields, alternative methods and measures, and with controls for individual characteristics and collaboration effects.
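
The sequence effects reported above depend on labeling each stretch of a career as exploration or exploitation and then checking which configuration surrounds the onset. The sketch below shows one simple way to do that labeling. It is an assumption-laden simplification: the helper names are hypothetical, episodes are reduced to single windows, and a window whose entropy equals the career mean is counted as exploitation.

```python
import statistics


def label_episodes(entropy_series):
    """Label each window 'exploration' if H is above the career mean, else 'exploitation'."""
    mean_h = statistics.mean(entropy_series)
    return ["exploration" if h > mean_h else "exploitation" for h in entropy_series]


def onset_configuration(entropy_series, onset):
    """Describe the strategy just before and at the hot streak onset.

    `onset` is the index of the first hot-streak window and must be >= 1.
    """
    states = label_episodes(entropy_series)
    before, at = states[onset - 1], states[onset]
    if before == at:
        return f"{at} only"
    return f"{before} followed by {at}"


# Toy career: diverse output first, then a focused run starting at window 3.
career = [0.9, 0.8, 0.9, 0.3, 0.2, 0.3]
print(onset_configuration(career, onset=3))
```

Counting how often each configuration occurs in real careers, and comparing those counts with careers whose onsets were randomized, would give the kind of lift over baseline reported in the findings.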
Discussion

The findings directly address the research question by showing that hot streaks are not linked to exploration or exploitation in isolation but to a specific temporal sequence: exploration that is followed by exploitation. Exploration appears to broaden the search space and increase the chance of discovering promising ideas, while subsequent exploitation focuses effort to deepen and develop those ideas, enabling high-impact output. This sequential mechanism consistently precedes hot streak onset across art, film, and science, and is robust to multiple methodological checks. In science, organizational factors such as a shift from small to larger teams and engagement with new collaborators around hot streak onset align with the role of small teams in disruption and large teams in development. The results suggest that managing the timing and transition between experimentation and focused execution may be crucial for sustained creative impact and can inform talent identification and development strategies.

Conclusion

The paper uncovers a universal regularity across creative domains: hot streaks are most closely associated with a transition from exploration to exploitation, not either mode alone. By constructing high-dimensional representations of creative works and quantifying diversity in styles/topics, the study links creative strategy sequences to hot streak onset and reveals related collaboration dynamics in science. These insights offer a sequential perspective on creativity with implications for nurturing talent and structuring work. Future research should pursue causal designs to strengthen inference, examine domain-specific mechanisms and institutional contexts, and extend representation learning to capture richer creative dimensions. Investigating external factors (markets, networks, evaluation systems) and how feedback shapes the exploration–exploitation transition would further refine understanding.

Limitations
  • Data constraints: Analyses focus on individuals with sufficiently long, well-documented careers to enable statistical inference, potentially limiting generalizability to shorter or less-documented careers.
  • Correlational design: The study identifies associations rather than causal effects; unobserved confounders may influence both strategy shifts and hot streak onset.
  • Modest effect sizes: While statistically significant and consistent, effect sizes are modest and might be strengthened by additional controls (e.g., authorship, collaborations).
  • Domain-specific nuances: Differences in measurement, data structure, and norms across art, film, and science may affect observed magnitudes and patterns.
  • External influences: Market conditions, social network structures, disciplinary cultures, and short-term feedback may mediate or moderate the exploration–exploitation dynamics.