Organizational cultural strength as the negative cross-entropy of mindshare: a measure based on descriptive text

A. Marchetti and P. Puranam

This paper measures the strength of organizational culture as the negative cross-entropy of members' mindshare distributions. Using employee reviews from Glassdoor.com, the authors relate firm characteristics to cultural strength, including role differentiation and gender balance. Research conducted by Arianna Marchetti and Phanish Puranam.

Introduction

The paper addresses how to conceptualize and measure the strength of organizational culture, defined as the extent to which members think cohesively and assign importance to the same cultural attributes, in a way that is reliable, valid, scalable, and agnostic to specific cultural content. The authors argue that cultural strength is theoretically important for understanding non-hierarchical organizing, resilience (e.g., during communication-constrained crises like COVID-19), and potential downsides such as groupthink and resistance to diversity. They identify three challenges impeding progress: (1) building a conceptual bridge from intuitive notions of strong culture to a formal, content-agnostic operationalization that incorporates both consensus and intensity; (2) developing measurement approaches that respect firm-specific (emic) cultural elements rather than imposing universal (etic) dimensions; and (3) obtaining data of sufficient scale and representativeness. They propose formalizing cultural strength as a relational property of individuals' mindshare distributions over firm-specific cultural attributes and estimating these distributions from employees' descriptive text using topic modeling on large-scale Glassdoor reviews.

Literature Review

Early work variously defined strong cultures as homogeneous, thick and widely shared, stable and intense, cohesive and tight, or exhibiting both consensus and intensity. Later formal treatments (e.g., Van den Steen, 2010a,b) emphasized homogeneity in members' beliefs and proposed correlates such as age, size, and employee composition. Empirical studies typically relied on surveys: Kotter and Heskett (1992) developed scales based on executives' assessments, linking strong culture to performance; Burt et al. (1994) and Sorensen (2002) identified contingencies (competition, environmental volatility). Gordon and DiTomaso (1992) operationalized strength as low variance in responses on preselected dimensions; Chatman et al. (2014) measured consensus and intensity around adaptability and found performance benefits. Limitations of this work include conceptual ambiguity over whether consensus, intensity, or both define strength; reliance on etic survey lists that may bias measurement by ignoring firm-specific cultural content; and small samples. The authors argue for an emic, scalable, content-agnostic, distribution-based operationalization.

Methodology

Concept and formalization: Building on Chatman and colleagues' view, the authors define a strong culture as one in which members reach consensus on attributes that they individually deem highly important. Formally, for an organization with N members and K firm-specific cultural attributes, let each individual i have a mindshare distribution θ_i over the K attributes. Cultural strength S is defined as the negative average pairwise cross-entropy across members: S = -(2 / [N(N-1)]) * sum over pairs i<j of CE(θ_i, θ_j), where CE(θ_i, θ_j) = ([H(θ_i) + D_KL(θ_i || θ_j)] + [H(θ_j) + D_KL(θ_j || θ_i)]) / 2, H is entropy, and D_KL is Kullback–Leibler divergence. Each bracketed term is the cross-entropy of one member's distribution relative to the other's, so CE is symmetric and the sum runs over the N(N-1)/2 unordered pairs. S is at most zero, and higher S (closer to zero) indicates a stronger culture (high consensus and high intensity). This formulation captures both intra-individual intensity (concentration of mindshare) and inter-individual similarity, on an emic, firm-specific attribute space.
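
The definition above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy mindshare vectors, and the small floor EPS (used only to avoid log(0)) are assumptions made for illustration. It relies on the identity H(p) + D_KL(p || q) = -sum_k p_k log q_k.

```python
import math
from itertools import combinations

EPS = 1e-12  # assumption: numerical floor to avoid log(0); not from the paper


def cross_entropy(p, q):
    """H(p) + D_KL(p || q), which equals -sum_k p_k * log(q_k)."""
    return -sum(pk * math.log(max(qk, EPS)) for pk, qk in zip(p, q))


def symmetric_ce(p, q):
    """Average of the two directed cross-entropies, so CE(p, q) == CE(q, p)."""
    return 0.5 * (cross_entropy(p, q) + cross_entropy(q, p))


def cultural_strength(mindshares):
    """Negative mean symmetric cross-entropy over all unordered member pairs."""
    pairs = list(combinations(mindshares, 2))
    if not pairs:
        raise ValueError("need at least two members")
    return -sum(symmetric_ce(p, q) for p, q in pairs) / len(pairs)


# Hypothetical toy firms with K = 3 attributes and N = 3 members each.
strong = [[0.98, 0.01, 0.01]] * 3              # same focus, high intensity
bland = [[1 / 3, 1 / 3, 1 / 3]] * 3            # agreement, but no intensity
fragmented = [[0.98, 0.01, 0.01],
              [0.01, 0.98, 0.01],
              [0.01, 0.01, 0.98]]              # intensity, but no agreement

s_strong = cultural_strength(strong)
s_bland = cultural_strength(bland)
s_fragmented = cultural_strength(fragmented)
print(s_strong, s_bland, s_fragmented)
```

In this toy setup the strong firm scores closest to zero, the bland firm scores -log(3), and the fragmented firm scores lowest, which matches the intuition that the measure penalizes both indifference and disagreement.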

Data and sample: The study uses 2,900,436 employee reviews across 94,868 US companies in 68 industries from Glassdoor (2008–2020). To mitigate representativeness concerns, the authors restrict the sample to 10 macro-industries with at least 25% average firm-level Glassdoor coverage and require at least two reviews per firm. Sampled firms average 29 reviews, and average coverage is ~29%. Reviewer demographics and satisfaction distributions are reported (e.g., average age 36.45; near-equal gender split among reviewers).

Mindshare estimation via text modeling: For each firm, the authors run unsupervised topic modeling (Latent Dirichlet Allocation, LDA) on the firm-specific corpus to uncover the set of organization-specific topics employees discuss. Each review yields a topic-mixture vector, which is treated as the reviewer’s mindshare distribution over cultural attributes. Key implementation choices: (1) review text is included broadly (no pruning to explicitly cultural passages), recognizing that attributes can be values, norms, beliefs, or artifacts; (2) the number of topics per firm is tuned by maximizing topic coherence, which reduces noise relative to an arbitrary K; (3) the baseline uses only the “pros” section, which is more likely to describe attributes actually present (“cons” may reflect heterogeneous desired but absent attributes), with a robustness check using “cons”; (4) estimation is fully emic, with LDA run separately per firm to avoid imposing shared cross-firm dimensions. The average coherence-maximizing number of topics per firm is 10. The resulting per-review mindshare vectors are used to compute pairwise cross-entropy and the aggregate S per firm.
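
The coherence-based choice of K can be sketched as follows. The summary does not say which coherence measure the authors use, so this example assumes a UMass-style score (log-ratios of document co-occurrence counts over a topic's top words). The topic model itself is not fitted here: `candidates` is a hypothetical mapping from each candidate K to the top-word lists a fitted model would return, and all names and data are illustrative.

```python
import math


def umass_coherence(top_words, docs):
    """UMass-style coherence for one topic.

    top_words: a topic's top words, most probable first.
    docs: tokenized documents (lists of words).
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = doc_freq(top_words[j])
            if d_j == 0:
                continue  # word never occurs in the corpus; skip the pair
            d_ij = doc_freq(top_words[i], top_words[j])
            score += math.log((d_ij + 1) / d_j)
    return score


def select_num_topics(candidates, docs):
    """Return the candidate K whose topics have the highest mean coherence."""
    def mean_coherence(topics):
        return sum(umass_coherence(t, docs) for t in topics) / len(topics)

    return max(candidates, key=lambda k: mean_coherence(candidates[k]))


# Hypothetical tokenized "pros" reviews for one firm.
docs = [["pay", "benefits"], ["pay", "benefits"],
        ["team", "culture"], ["team", "culture"]]

# Hypothetical top words per topic for two candidate values of K.
candidates = {
    1: [["pay", "team"]],
    2: [["pay", "benefits"], ["team", "culture"]],
}

best_k = select_num_topics(candidates, docs)
print(best_k)
```

Here K = 2 wins because each of its topics groups words that co-occur in the same reviews, whereas the single K = 1 topic mixes words that never appear together.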

Validation approach: With no gold standard dataset, validation is indirect and theory-driven. The authors test bivariate correlations and multivariate OLS associations between S and theoretically relevant covariates (size, age, tenure, geographic dispersion), and perform robustness checks (bootstrapping; an alternative ordinal specification, motivated by the multimodal distribution of S; replication using “cons”). They also examine potential response polarization (citing a policy-induced reduction in polarization on Glassdoor, and kurtosis tests of satisfaction ratings showing negative excess kurtosis) and, where representativeness is uncertain, interpret results conservatively as describing the culture of reviewer subpopulations.
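
As a rough illustration of this kind of check, the sketch below computes a Pearson correlation between a covariate and S and attaches a percentile bootstrap interval. The summary does not specify how the authors apply bootstrapping, so the resampling scheme, function names, and toy data here are all assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns None if either variable has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation (firms resampled with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson([xs[i] for i in idx], [ys[i] for i in idx])
        if r is not None:  # drop degenerate resamples
            stats.append(r)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi


# Hypothetical data: log firm size and cultural strength S for eight firms.
log_size = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
strength = [-0.9, -1.1, -1.0, -1.4, -1.3, -1.7, -1.6, -2.0]

r = pearson(log_size, strength)
ci_lo, ci_hi = bootstrap_ci(log_size, strength)
print(r, ci_lo, ci_hi)
```

With the toy data, larger firms have lower S, so the point estimate is negative; the interval indicates how stable that sign is under resampling.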

Key Findings
  • Distribution and scale: The cultural strength metric (computed on “pros”) is multimodal across 94,868 firms; average coherence-maximizing topic count per firm is 10.
  • Theory-consistent associations (bivariate, “pros”): Cultural strength correlates negatively with firm size (r ≈ -0.14, p<0.001) and geographic dispersion measured by number of states (r ≈ -0.16, p<0.001) and cities (r ≈ -0.08, p<0.001); positively with firm age (r ≈ 0.05, p<0.001) and reviewers’ average tenure (r ≈ 0.01, p≈0.04).
  • Exploratory associations (bivariate, “pros”): Cultural strength correlates negatively with the number of Glassdoor occupational codes (a proxy for organizational role differentiation; r ≈ -0.25, p<0.001) and positively with gender imbalance (absolute female–male share difference; r ≈ 0.20, p<0.001).
  • Multivariate OLS with controls and fixed effects (industry, state, firm type; “pros”): Results largely mirror bivariate signs. Significant coefficients: firm size β=-0.058 (p<0.001); firm age β=0.003 (p<0.001); reviewers’ average tenure β=0.068 (p<0.001); occupational codes β=-0.047 (p<0.001); gender imbalance β=1.419 (p<0.001). Number of states loses significance; number of cities turns positive (β≈0.006, p≈0.065). Controls: revenues negative; higher share of current-employee reviews negative; review count positively associated (mechanical coverage effect). Fixed effects are significant, indicating meaningful between-domain variation.
  • Replication on “cons”: Cultural strength measures from “pros” and “cons” correlate (r≈0.55, p<0.001) but capture different information. For “cons,” older firms and longer tenure associate with lower cultural strength (older firm effect significant; tenure effect weaker), consistent with convergence in likes but divergence in dislikes over time.
  • Industry, geography, and firm type patterns: Average adjusted cultural strength differs more between industries, states, and firm types than within them (e.g., higher in Bars & Nightclubs; lower in Enterprise Software & Network Solutions; state averages vary, with CA, MA relatively lower and several smaller states higher).
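
The two exploratory covariates above can be made concrete with a small sketch. It assumes that gender imbalance is the absolute difference between the female and male shares of a firm's reviewers and that role differentiation is the count of distinct occupational codes among them; the function names and inputs are hypothetical.

```python
def gender_imbalance(n_female, n_male):
    """Absolute difference between female and male shares (0 = balanced, 1 = single gender)."""
    total = n_female + n_male
    if total == 0:
        raise ValueError("no reviewers with a reported gender")
    return abs(n_female - n_male) / total


def role_differentiation(occupation_codes):
    """Number of distinct occupational codes among a firm's reviewers."""
    return len(set(occupation_codes))


# Hypothetical firm: 80 female and 20 male reviewers across three roles.
imbalance = gender_imbalance(80, 20)
n_roles = role_differentiation(["engineer", "engineer", "sales", "support"])
print(imbalance, n_roles)
```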

Discussion

The findings support the proposed conceptualization of cultural strength as jointly reflecting consensus and intensity on emic cultural attributes. The measure behaves in line with theory: larger scale and geographic dispersion hinder shared understanding; age and tenure, proxies for socialization and institutionalization, facilitate stronger cultures. Significant fixed effects suggest isomorphic pressures within industries, geographies, and organizational types shape cultural strength, implying that cultural strength is not uniformly distributed but patterned by contextual domains. Exploratory results point to mechanisms: greater role differentiation (more occupational codes) likely fragments culture into subcultures; gender imbalance suggests sorting-based homogeneity can strengthen culture, raising concerns about potential lock-in that may resist diversity initiatives. Differences between “pros” and “cons” metrics imply that convergence may be greater on appreciated attributes than on sources of dissatisfaction, highlighting asymmetric cultural cohesion across positive vs. negative facets of organizational life. Overall, the measure offers a scalable tool to study culture’s role in coordination, the substitutability of culture and formal structure, and cross-organizational phenomena (e.g., M&A cultural compatibility).

Conclusion

The paper formalizes organizational cultural strength as the negative average cross-entropy of members’ mindshare distributions over firm-specific cultural attributes and operationalizes it using topic-modeling of large-scale employee-generated text. This clear, content-agnostic, emic, and scalable metric resolves conceptual ambiguities about strength and enables differentiation between bland (indifferent) and fragmented (conflicted) weak cultures. Empirical results across nearly 95,000 firms show theory-consistent patterns (smaller, older, geographically concentrated firms, and firms with longer tenure exhibit stronger cultures) and reveal exploratory links with role differentiation and gender imbalance. Future research can apply the metric to study specific cultural content (e.g., adaptability), test functional equivalence between culture and structure, assess cultural compatibility in inter-organizational ties, and evaluate cultural resilience under distributed work. Practically, managers can monitor culture strength dynamics under sorting and socialization, evaluate cultural fit, and benchmark competitors, potentially leveraging cultural strength as a strategic resource.

Limitations
  • Representativeness: Glassdoor reviewers are self-selected; while coverage thresholds, prior evidence on reduced polarization, and kurtosis tests mitigate concern, the measure may reflect reviewer subcultures rather than full organizational populations.
  • Measurement choices: Using “pros” emphasizes existing attributes; “cons” captures desired absences and is more heterogeneous. Results differ somewhat across these views.
  • Lack of external criterion: No publicly available gold standard to triangulate cultural strength; validation is indirect via theory-consistent associations.
  • Model specification and data constraints: Multivariate models require extensive controls and fixed effects, reducing sample size substantially (~75% drop from bivariate set), which may influence estimates; correlational design precludes causal inference.
  • Emic topic estimation: Running firm-specific LDA avoids etic bias but may introduce variability across firms in topic spaces; coherence-based K selection mitigates but does not eliminate this concern.