Business
How People Use ChatGPT
A. Chatterji, T. Cunningham, et al.
This research, conducted by Aaron Chatterji, Thomas Cunningham, David J. Deming, Zoe Hitzig, Christopher Ong, Carl Yan Shan, and Kevin Wadman, maps ChatGPT's rapid global adoption from November 2022 to July 2025. It documents a shift toward non-work use, the dominance of topics like practical guidance and writing, and evidence that chatbots deliver substantial decision-support value in knowledge-intensive jobs.
~3 min • Beginner • English
Introduction
The paper investigates how people use ChatGPT, motivated by the unprecedented global diffusion of LLM chatbots since November 2022. By July 2025, approximately 700 million weekly active users were sending 18 billion messages per week, reaching around 10% of the global adult population. Despite this widespread adoption, public evidence on actual use is limited largely to self-reported surveys, which may be biased. The authors directly classify and measure conversation content at scale to understand the distribution of work and non-work usage, topics, user intents (Asking, Doing, Expressing), and associated work activities (via O*NET). Contributions include a larger user base than in prior studies, automated privacy-preserving classification of message types, diffusion analysis across cohorts and demographics, and secure linkage to aggregated employment and education categories via a data clean room. The central hypothesis is that ChatGPT’s economic value arises largely through decision support in knowledge-intensive contexts, not solely through automated task execution.
Literature Review
The study situates itself among work on AI’s macroeconomic impacts (Acemoglu, 2024; Korinek and Suh, 2024), labor market implications (Eloundou et al., 2025; Hartley et al., 2025; Humlum and Vestergaard, 2025a,b), and surveys of LLM adoption (Bick et al., 2024; Pew Research Center, 2025). It extends prior analyses that directly classify chatbot conversations (Handa et al., 2025; Tomlinson et al., 2025) by leveraging a larger, more representative ChatGPT user base and new taxonomies (topics, intents, O*NET activities). The authors discuss survey biases (Ling and Imas, 2025) and contrast their findings with those reporting high programming or companionship shares (Handa et al., 2025; Zao-Sanders, 2025). Methodological precedents include automated classifier-based labeling of chatbot content (Phang et al., 2025; Eloundou et al., 2025) and secure linkage of platform data to external sources (Chetty et al., 2022). The paper frames generative AI’s unique capabilities (long-form, multimodal outputs) relative to traditional search and information technologies.
Methodology
Data consist of three main sources: (1) a growth dataset of all consumer-plan (Free, Plus, Pro) message volumes and de-identified metadata from November 2022 to September 2025; (2) classified-message datasets built by applying automated LLM classifiers to randomly sampled, PII-scrubbed messages between May 2024 and June 2025 (approximately 1.1 million conversations sampled across all users, plus two samples from a subset of ~130,000 users with conversation-level and user-level sampling of up to six messages each); and (3) aggregated employment and education categories for a subset of users, procured and analyzed solely within a secure data clean room.

Privacy safeguards include a privacy filter that removes PII; no human inspection of message text; controlled interfaces that prevent rendering raw content; classification prompts that include up to 10 prior messages as context, with each message truncated to 5,000 characters; GPT-5-mini for most classifiers and GPT-5 for interaction quality; and strict clean-room protocols requiring partner approval, aggregation thresholds (>=100 users), and audit logging.

Classifiers cover: work-related vs. non-work messages; conversation topics (24 categories mapped into seven groups); intent (Asking, Doing, Expressing); O*NET Intermediate Work Activities (333 categories, including an augmented taxonomy, aggregated to Generalized Work Activities, GWAs); and interaction quality (Good/Bad/Unknown) inferred from subsequent user messages.

Validation compares model outputs with human judgments on the public WildChat dataset, reporting agreement metrics (Fleiss’ κ, Cohen’s κ), confusion matrices, and bias analyses; interaction-quality labels also correlate with users’ explicit thumbs-up/down feedback on assistant messages. Sampling weights reweight observations to maintain fixed ratios with aggregate message volumes and to account for time-varying sampling rates. Excluded are users who opted out of training, users self-reporting as under 18, logged-out users (due to inconsistent availability), deleted or banned accounts, and non-consumer plans.
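As a rough illustration of this pipeline, the sketch below assembles one classifier input under the constraints just described (up to 10 prior messages of context, 5,000-character truncation). The `Message` type, prompt wording, and `call_llm` stub are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch (not the authors' pipeline) of the classification step:
# each sampled, PII-scrubbed message is truncated and labeled by an LLM
# with a bounded window of prior context.
from dataclasses import dataclass

MAX_CHARS = 5_000    # per-message truncation reported in the paper
MAX_CONTEXT = 10     # prior messages included as context

@dataclass
class Message:
    role: str        # "user" or "assistant"
    text: str

def build_classifier_prompt(history: list[Message], target: Message) -> str:
    """Assemble a work/non-work classification prompt for one message."""
    context = history[-MAX_CONTEXT:]                      # last 10 messages
    lines = [f"{m.role}: {m.text[:MAX_CHARS]}" for m in context]
    lines.append(f"message to classify: {target.text[:MAX_CHARS]}")
    lines.append("Label the final message as WORK or NON_WORK.")
    return "\n".join(lines)

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for the automated LLM classifier (GPT-5-mini)."""
    raise NotImplementedError

def classify_work_related(history: list[Message], target: Message) -> str:
    return call_llm(build_classifier_prompt(history, target))

# Agreement with human labels (as in the WildChat validation) could then be
# summarized with, e.g., sklearn.metrics.cohen_kappa_score.
```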
Key Findings
Adoption and volume: By July 2025, ChatGPT had over 700 million weekly active users sending ~18 billion messages per week (~2.5 billion/day), around 10% of the global adult population.

Work vs. non-work: Non-work messages grew faster than work messages; the non-work share rose from 53% (June 2024) to 73% (June 2025). Daily volume estimates (7-day averages): June 2024, non-work 238M, work 213M, total 451M; June 2025, non-work 1,911M, work 716M, total 2,627M.

Topics: The three most common topics, Practical Guidance, Seeking Information, and Writing, collectively account for ~77–78% of usage. End-period shares (Figure 7): Practical Guidance 28.8%, Seeking Information 24.4%, Writing 23.9%, Multimedia 7.3%, Self-Expression 5.3%, Other/Unknown 5.2%, Technical Help 5.1%. In the aggregated one-year breakdown: Practical Guidance 28.3%, Writing 28.1%, Seeking Information 21.3%, Technical Help 7.5%, Multimedia 6.0%, Self-Expression 4.3%, Other/Unknown 4.0%. Within Writing, two-thirds of messages are modifications of user-provided text (editing, critiquing, translating, summarizing). Education-related use is substantial: Tutoring or Teaching constitutes 10.2% of all messages and 36% of Practical Guidance. Computer programming accounts for 4.2% of messages, Relationships and Personal Reflection 1.9%, and Games and Role Play 0.4%. Among work messages, Writing dominates (41.8%), followed by Practical Guidance 24.0%, Seeking Information 12.7%, Technical Help 12.7%, Multimedia 5.4%, Other/Unknown 3.9%, and Self-Expression 0.8%.

Intent: Overall, Asking ~49%, Doing ~40%, Expressing ~11%; over time Asking increased and Doing decreased (by late June 2025: Asking 51.6%, Doing 34.6%, Expressing 13.8%). In work messages, Doing ~56% vs. Asking ~35% and Expressing ~9%; nearly 35% of all work-related queries are Doing-Writing.

O*NET GWAs (all messages): Getting Information 19.3%, Interpreting Meaning 13.1%, Documenting/Recording 12.8%, Providing Consultation 9.2%, Thinking Creatively 9.1%, Making Decisions and Solving Problems 8.5%, Working with Computers 4.9%; the seven combined cover 76.9%.

O*NET GWAs (work messages): Documenting/Recording 18.4%, Making Decisions and Solving Problems 14.9%, Thinking Creatively 13.0%, Working with Computers 10.8%, Interpreting Meaning 10.1%, Getting Information 9.3%, Providing Consultation 4.4%; the seven combined cover ~81%.

Interaction quality: Good interactions grew faster than Bad; by mid-2025, Good ~57.8%, Unknown ~28.7%, Bad ~13.5%. Good-to-bad ratios by topic: Self-Expression 7.86, Seeking Information 4.75, Practical Guidance 4.37, Other/Unknown 4.04, Writing 3.11, Multimedia 2.80, Technical Help 1.95; by intent: Asking 4.45, Expressing 3.67, Doing 2.76.

Demographics and diffusion: Early users skewed male (~80% masculine first names), declining to ~48% by June 2025 (parity or a slight female majority); nearly half of adult messages come from users aged 18–25. The work-related share increases with age (approx.: 18–25 22.5%; 26–35 29.1%; 36–45 31.4%; 46–55 30.2%; 56–65 27.1%; 66+ 16.1%). Adoption grew fastest in low- and middle-income countries (GDP per capita $10k–$40k).

Education and occupation: Work-related shares are <Bachelor 37%, Bachelor 46%, >Bachelor 48% (differences remain significant after controls). The Writing share rises with education (adjusted ~31% for Bachelor and >Bachelor).
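The "adjusted" shares here and in the occupational breakdown below come from regressions with controls. The sketch that follows shows one standard way to compute covariate-adjusted shares (logistic regression plus standardization); it runs on synthetic data, and the variable names and controls are illustrative stand-ins, not the paper's exact specification.

```python
# Hedged sketch: covariate-adjusted shares via logit + standardization
# (g-computation). Synthetic data only; controls are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "occupation": rng.choice(["computer", "management", "non_professional"], n),
    "age_band": rng.choice(["18-25", "26-35", "36-45", "46+"], n),
})
# Synthetic work-message indicator, seeded with the unadjusted shares below.
base = {"computer": 0.57, "management": 0.50, "non_professional": 0.40}
df["work"] = (rng.random(n) < df["occupation"].map(base)).astype(int)

# Logit of work-related use on occupation plus controls.
model = smf.logit("work ~ C(occupation) + C(age_band)", data=df).fit(disp=0)

def adjusted_share(occ: str) -> float:
    """Mean predicted P(work) with every row's occupation set to `occ`."""
    return float(model.predict(df.assign(occupation=occ)).mean())

for occ in base:
    print(f"{occ}: adjusted work share {adjusted_share(occ):.2f}")
```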
In occupations, work-related shares are higher in computer-related (57% unadjusted; 53% adjusted), management/business (50% unadjusted; 47% adjusted), engineering/science (48% unadjusted; ~47% adjusted) vs non-professional (40% unadjusted; ~43% adjusted). Among work messages, Asking is relatively more common in technical/professional roles; Writing dominates management/business (52%) and is high in non-professional and other professional roles (~50% and 49%); Technical Help is concentrated in computer-related (37%).
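As a quick consistency check, the non-work shares reported at the top of this section follow directly from the daily volume figures; a few lines of illustrative arithmetic reproduce them:

```python
# Consistency check on the work vs. non-work volumes reported above
# (7-day-average daily messages, in millions).
volumes = {
    "June 2024": {"non_work": 238, "work": 213},
    "June 2025": {"non_work": 1911, "work": 716},
}
for month, v in volumes.items():
    total = v["non_work"] + v["work"]
    share = v["non_work"] / total
    print(f"{month}: total {total}M/day, non-work share {share:.0%}")
# June 2024: total 451M/day, non-work share 53%
# June 2025: total 2627M/day, non-work share 73%
```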
Discussion
Findings indicate that ChatGPT’s primary economic value lies in decision support—helping users obtain, interpret, and apply information to make better choices. This aligns with the high prevalence of Asking intents and O*NET activities focused on information and decision-making across diverse occupations. Writing’s dominance in work-related use underscores generative AI’s comparative advantage in creating and transforming the digital outputs common to white-collar work. The relative scarcity of programming and companionship topics, compared to some studies of other chatbots, suggests platform differences in user bases and use cases. Non-work usage growing faster than work usage suggests large consumer surplus, consistent with willingness-to-pay estimates (e.g., Collis and Brynjolfsson, 2025). Differences by education and occupation—especially higher shares of Asking in professional roles—support models where AI acts as a co-pilot that augments human problem-solving rather than solely a task-performing co-worker. Demographic trends show narrowing gender and age gaps and diffusion into low- and middle-income countries, pointing to broadening access and use. Overall, ChatGPT appears to complement knowledge work by improving information access, guidance, and communication quality, potentially increasing productivity through better decision-making.
Conclusion
The paper provides the first economics-based analysis using internal ChatGPT message data with a novel privacy-preserving approach. Eight main facts are documented: (1) non-work usage now constitutes about 70% and is growing faster than work usage; (2) Practical Guidance, Writing, and Seeking Information comprise roughly 78% of all messages; (3) Writing is the dominant work-related activity (about 42%), largely modifying user-provided text; (4) Asking, Doing, Expressing shares are about 49%, 40%, and 11% respectively, with Asking rising over time and receiving higher satisfaction ratings; (5) gender gaps have narrowed substantially, reaching near parity; (6) nearly half of adult messages are from users under 26; (7) adoption has grown especially in low- and middle-income countries; (8) educated users and those in professional occupations use ChatGPT more for work and more for Asking at work. The results suggest substantial welfare gains from non-work usage and productivity improvements via decision support in knowledge-intensive roles. Future research could examine causal impacts on job performance across occupations, long-term effects on skill development (e.g., writing and analytical skills), interactions with enterprise deployments, and cross-platform comparative studies to understand differences in use patterns and outcomes.
Limitations
Analyses are restricted to consumer ChatGPT plans (Free, Plus, Pro) and exclude Business/Enterprise/Education plans; logged-out users are excluded due to inconsistent availability over the sample period. Users who opted out of training, those self-reporting under 18, deleted accounts, and banned users are excluded. Employment and education linkages are available only for a subset of users via a data clean room with strict aggregation thresholds, potentially limiting representativeness. Automated LLM-based classifiers infer intent and topic without ground truth, introducing classification error; messages are truncated (5,000 characters) and interpreted with limited context (up to 10 prior messages). Some categories are suppressed when fewer than 100 users meet thresholds, and counts of distinct accounts may exceed distinct individuals. Comparisons to other platforms may reflect differences in user populations and features. Overall estimates rely on sampling and reweighting, which may introduce measurement error.