Estimating digital product trade through corporate revenue data
V. Stojkoski, P. Koch, et al.
The study addresses the challenge of defining and measuring digital trade, particularly for digital goods, productized services, and digital intermediation fees, collectively termed digital products. Existing official statistics struggle to capture the breadth and delivery modes of digital trade, and there are discrepancies between statistical conventions (which often treat goods and services differently) and trade agreement terminology on digital products. The authors motivate the need for improved measurement by highlighting implications for trade balances (e.g., the U.S. being a net exporter of digital products and net importer of physical goods), for understanding the decoupling of economic growth from greenhouse gas emissions, and for refining measures of economic complexity that typically rely on physical trade data. The purpose of the paper is to develop a bottom-up, firm-revenue-based method to estimate bilateral trade in digital products across multiple sectors and countries, thereby providing granular, sector-aligned insights into the scale, geography, growth, and economic implications of digital product trade.
Data and scope: The authors construct bilateral trade estimates for 189 countries and 31 digital product sectors for 2016–2021 by combining (i) corporate revenue data for large digital firms (revenues ≥ USD 1B) and key app/game developers, and (ii) consumption data for mobile applications and games. Digital products are grouped into digital goods (e.g., software, games), productized services (e.g., cloud computing, digital advertising, streaming), and intermediation fees (e.g., marketplace commissions).
Revenue data: Starting from 187 parent companies and their 2,502 subsidiaries (15,515 total firms including app developers), revenues are collected primarily from Orbis, with Statista and other public sources (e.g., annual reports) used to fill gaps. Parent-company consolidated revenues are adjusted by subtracting subsidiary revenues (except when subsidiary totals exceed parent reported values). Firms’ revenues are decomposed into 29 digital sectors using Statista’s Digital & Technology Market definitions; combined with two AppMagic categories (apps and games) yields 31 sectors. Subsidiaries are generally assumed to mirror parents’ revenue structures unless more specific information is available.
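The revenue adjustment described above can be illustrated with a minimal sketch. The function names are hypothetical, and the handling of the exception (keeping the consolidated figure when subsidiaries report more than the parent) is an assumed reading of the text, not the authors' documented procedure.

```python
def standalone_parent_revenue(parent_consolidated, subsidiary_revenues):
    """Return the parent's revenue net of its subsidiaries' revenues.

    Assumption: when the subsidiaries together report more than the parent's
    consolidated figure, the subtraction is skipped and the consolidated
    figure is kept unchanged.
    """
    subsidiary_total = sum(subsidiary_revenues)
    if subsidiary_total > parent_consolidated:
        return parent_consolidated
    return parent_consolidated - subsidiary_total


def split_by_sector(revenue, sector_shares):
    """Decompose a firm's revenue into sectors.

    `sector_shares` maps sector name -> weight; weights are renormalized so
    the sector revenues always sum to `revenue`.
    """
    total_weight = sum(sector_shares.values())
    return {
        sector: revenue * weight / total_weight
        for sector, weight in sector_shares.items()
    }


# Illustrative numbers only (USD billions), not taken from the paper.
parent_only = standalone_parent_revenue(100.0, [30.0, 20.0])
by_sector = split_by_sector(parent_only, {"cloud": 0.6, "advertising": 0.4})
```

Under the stated mirroring assumption, a subsidiary without more specific information would reuse its parent's `sector_shares`.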
Consumption data: AppMagic provides consumption (USD) for mobile apps and games across 60 countries (2016–2021), covering 13,013 unique firms/developers. Where a firm’s origin is unknown, origin is set to the country generating the majority of its revenues.
Machine learning extrapolation: A gradient-boosted regression tree is trained to predict country-year consumption for each firm–sector pair, extending from the two observed sectors (apps/games) to 29 additional sectors and from 60 to 189 countries. Features are motivated by gravity models and include: (1) firm–sector global revenues, (2) total revenues of firms headquartered in the same origin country (across all digital sectors), (3) global consumption of the sector, (4) bilateral cultural/historical/geographic ties (language, borders, colonizer, distance, region), (5) GDP of origin and destination, (6) ICT penetration (fixed/mobile broadband, internet use) for origin and destination, (7) predicted probability of non-zero consumption (from a logistic model), and (8) year dummies. All features except the dummies are transformed as log(x + 1) to handle zeros. Training uses firm–sector pairs with yearly revenues > USD 10M. Group K-fold cross-validation (5 folds, each holding out 20% of firm–sector pairs) yields an MSE of 23.14, improving upon a baseline linear model (MSE 24.44). An independent validation against firms’ reported regional shares gives an MSE of 0.048 for the ML model (vs. 0.126 for the linear model).
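Two mechanics in this step, the log(x + 1) feature transform and the grouped cross-validation split, can be sketched without any modelling library. The sketch below does not train the gradient-boosted model itself. The function names and the round-robin assignment of groups to folds are illustrative assumptions; the point is only that every firm–sector pair lands entirely in one fold, so the model is always evaluated on pairs it has not seen.

```python
import math


def log1p_features(row, numeric_keys):
    """Apply log(x + 1) to the numeric features and leave dummies untouched."""
    return {
        key: math.log1p(value) if key in numeric_keys else value
        for key, value in row.items()
    }


def group_k_fold(groups, n_splits=5):
    """Yield (train_indices, test_indices) with each group kept in one fold.

    `groups[i]` is the (firm, sector) pair that observation i belongs to.
    Groups are assigned to folds round-robin (an assumption of this sketch).
    """
    unique_groups = sorted(set(groups))
    fold_of = {group: i % n_splits for i, group in enumerate(unique_groups)}
    for fold in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


# Ten hypothetical firm-sector pairs with three country-year rows each.
groups = [("firm%d" % (i // 3), "games") for i in range(30)]
for train_idx, test_idx in group_k_fold(groups):
    held_out = {groups[i] for i in test_idx}
    assert held_out.isdisjoint(groups[i] for i in train_idx)
```

In practice a library splitter such as scikit-learn's `GroupKFold` serves the same purpose; the hand-rolled version is used here only to keep the example dependency-free.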
Post-processing and harmonization: Predictions are exponentiated to USD values (estimates < USD 1,000 are set to 0). Aggregates are constrained to match input firm–sector revenues, and normalized to align with known regional consumption shares from firms’ annual reports, assuming consistent regional shares across a firm’s categories.
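The first two harmonization steps can be sketched as follows. This assumes the model predicts the natural log of USD consumption (if the target were log(1 + x), `math.expm1` would be the inverse instead), and it omits the regional-share normalization. The function name and inputs are hypothetical.

```python
import math


def postprocess(log_predictions, firm_sector_revenue, floor_usd=1_000.0):
    """Turn log-scale predictions into USD values that sum to known revenue.

    log_predictions: destination country -> predicted log(USD consumption)
    firm_sector_revenue: observed global revenue for this firm-sector pair
    """
    # Step 1: back to USD; estimates below the floor are treated as zero.
    usd = {}
    for country, log_value in log_predictions.items():
        value = math.exp(log_value)
        usd[country] = value if value >= floor_usd else 0.0

    # Step 2: rescale so the country estimates add up to the input revenue.
    total = sum(usd.values())
    if total == 0.0:
        return usd
    scale = firm_sector_revenue / total
    return {country: value * scale for country, value in usd.items()}


# Illustrative inputs only.
estimates = postprocess(
    {"US": math.log(8_000), "DE": math.log(2_000), "FJ": math.log(500)},
    firm_sector_revenue=20_000.0,
)
```

Here the Fiji estimate falls below the USD 1,000 floor and is zeroed, and the remaining two estimates are scaled up proportionally to match the USD 20,000 total.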
Optimal transport allocation: To assign consumption to export origins, an optimal transport framework minimizes geographic distance between consuming country and the nearest subsidiary (or parent), subject to not exceeding the subsidiary’s revenue capacity. Excess demand is allocated to the next nearest subsidiary with remaining revenue. Cost weights are inversely proportional to great-circle distance. This method prioritizes domestic allocation, producing conservative (lower-bound) cross-border trade estimates. The study provides two assignment rules for export origin: (a) by headquarters location and (b) by fiscal residence of subsidiaries (the latter used in main figures), acknowledging tax-related location choices can affect geography of reported exports.
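The allocation logic described above (nearest entity first, capped by its revenue, overflow to the next nearest) can be approximated with a greedy pass. This is a simplified stand-in, not an optimal transport solver: it processes destinations in input order, so results can depend on that order, and the function names and inputs are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(a, b):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def allocate(demand, capacity, coords):
    """Greedily assign each destination's consumption to the nearest origins.

    demand:   destination country -> consumption (USD)
    capacity: origin country (subsidiary or parent location) -> revenue (USD)
    coords:   country -> (latitude, longitude)
    Returns {(origin, destination): USD flow}.
    """
    remaining = dict(capacity)
    flows = {}
    for dest, need in demand.items():
        nearest_first = sorted(
            remaining, key=lambda o: great_circle_km(coords[o], coords[dest])
        )
        for origin in nearest_first:
            if need <= 0:
                break
            shipped = min(need, remaining[origin])
            if shipped > 0:
                flows[(origin, dest)] = shipped
                remaining[origin] -= shipped
                need -= shipped
    return flows


# Illustrative inputs only: a firm booking revenue in the US and Ireland.
coords = {"US": (38.0, -97.0), "IE": (53.0, -8.0), "FR": (46.0, 2.0)}
flows = allocate({"US": 80.0, "FR": 60.0}, {"US": 100.0, "IE": 50.0}, coords)
cross_border = sum(v for (o, d), v in flows.items() if o != d)
```

Because a domestic entity sits at distance zero, domestic consumption is served first, which is why this kind of rule yields conservative cross-border totals.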
Uncertainty quantification: For each non-zero estimated bilateral revenue X_odp, the authors estimate 95% confidence intervals via year-specific linear regressions of log bilateral revenues on firm–sector total revenues and origin/destination fixed effects, with origin categories grouped to reduce standard errors, and normalized so predicted means match estimated revenues.
Comparative datasets: For context and validation, the study compares its aggregates and bilateral patterns with UNCTAD/WTO data on digitally delivered services (DDS) and on services overall (using Eurostat weights to convert digitally deliverable services into digitally delivered ones), WTO’s BATIS for bilateral services, and the OEC for physical goods.
- Scale and growth: Estimated trade in digital products rose from about USD 320–328B in 2016 to USD 956–958B in 2021 (95% CI for 2021: USD 835B–1.10T), growing at an annualized rate of ~24.5%. This outpaced DDS (~8%), services (~3.7%), and goods (~6.3%). During 2020, while services and goods contracted (−17% and −7%), digital products grew ~28% year-on-year.
- Share of world trade and composition: Digital products represented ~3.5% of global trade in 2021. The largest components were digital advertising (~29.9%), online marketplaces n.e.s. (~19.8%), and cloud computing (~16.2%), together accounting for ~65% of digital product trade.
- Geography and concentration: Exports of digital products are highly concentrated: about 80% originate from the top ~3% of countries; imports are more dispersed (80% go to <20% of countries), with concentration levels similar to physical goods imports. Export concentration exceeds that of DDS, services, and goods, and robustness checks show this is not merely due to fewer sectors. Export geography is dominated by the United States and, under subsidiary assignment, tax havens such as Ireland and Luxembourg.
- Correlations with other trade aggregates: Country-level exports (and imports) of digital products are strongly correlated with DDS (exports: r≈0.82; imports: r≈0.78), with lower correlations for services and goods.
- Network structure: Digital product trade networks resemble DDS more than services or goods, centered on the US, with tax havens more central than in other networks. The goods network centers around regional hubs (US, Germany, China), with China the most central.
- Trade balances: Comparing total (goods+services) balances with digital product balances sorts countries into four quadrants. Under subsidiary assignment, countries with surpluses in both include Sweden, Ireland, Luxembourg, and Singapore; under HQ assignment, they include Sweden, China, and Singapore (suggesting that Ireland and Luxembourg act as pass-throughs). Countries with a goods/services surplus but a digital product deficit include natural resource exporters (e.g., Saudi Arabia) and manufacturing hubs (e.g., Mexico). The US (and, in some cases, India, Uruguay, Netherlands, UK) shows a digital products surplus offsetting physical deficits.
- Decoupling and exports: Among high-income countries (>1.5M population), economies that decoupled GDP per capita growth from emissions per capita (2016–2019) tend to have larger per-capita exports of digital products (and DDS) than non-decoupled peers; for services and goods, the opposite tendency is observed.
- Economic complexity impacts: Incorporating digital product exports into HS4-based ECI/PCI calculations raises complexity for digital exporters (e.g., US, Ireland, Australia) and lowers it for manufacturing hubs (e.g., Mexico, Slovakia). Digital sectors exhibit higher average product complexity than physical goods; the most complex include Digital Advertising and eBooks, and the least complex include Online Food Ordering. The adjusted ECI performs similarly to the traditional ECI in explaining future growth and emission intensity, despite the limited time series.
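Two of the summary statistics in the findings above, the annualized growth rate and the "80% of exports from x% of countries" concentration measure, follow from simple arithmetic. The sketch below shows one way to compute them; the function names are hypothetical and the concentration example uses made-up numbers, not the paper's data.

```python
def annualized_growth(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def share_of_countries_for(exports, target=0.8):
    """Smallest fraction of countries whose combined exports reach `target`
    of the total, taking the largest exporters first."""
    values = sorted(exports.values(), reverse=True)
    total = sum(values)
    running = 0.0
    for count, value in enumerate(values, start=1):
        running += value
        if running >= target * total:
            return count / len(values)
    return 1.0


# Endpoints quoted in the findings (USD billions), 2016 to 2021.
growth = annualized_growth(320, 958, years=5)

# Made-up export values for four hypothetical countries.
concentration = share_of_countries_for({"A": 50, "B": 25, "C": 15, "D": 10})
```

With the low 2016 and high 2021 endpoints the growth rate comes out at roughly 24 to 25% per year, in line with the reported figure; other endpoint choices within the quoted ranges shift it slightly.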
The findings demonstrate that a bottom-up, firm-revenue-based approach can quantify bilateral trade in digital products at sectoral detail, revealing dynamics not captured by traditional services and goods statistics. Digital product trade is large, fast-growing, and highly concentrated geographically on the supply side but broadly distributed on the demand side. This distinct geography and network structure underscore the importance of digital capabilities, corporate structures, and tax residence in shaping observed trade patterns. The results suggest digital exports can materially influence countries’ overall trade balances, helping explain cases like the United States, which runs large physical goods deficits but substantial digital product surpluses. The observed association between decoupling and higher digital product exports supports the notion that digitization can align with sustainability goals (twin transition), potentially by shifting activity toward higher value-added, lower-emission-intensity sectors and enabling efficiencies across the economy. Incorporating digital product exports into complexity metrics improves the representation of advanced economies’ productive structures and better captures knowledge-intensive activities. Overall, the study provides a more nuanced and policy-relevant view of digital trade, including insights relevant for measuring GATS Mode 3, informing debates on digital product classification and potential tariffs, and refining national accounts and sustainability assessments.
This paper introduces and validates a granular, bottom-up methodology to estimate bilateral trade in digital products using corporate revenue and consumption data augmented with machine learning and optimal transport. It delivers the first global, multi-sector estimates for 189 countries and 31 digital sectors (2016–2021), documenting rapid growth, distinct geography, and meaningful implications for trade balances, sustainability (via links to decoupling), and economic complexity. Contributions include: (i) a comprehensive dataset enabling sector-aligned analyses (e.g., cloud computing, digital advertising), (ii) methodological advances in combining firm revenues, consumption extrapolation, and conservative transport-based allocation, and (iii) empirical insights on concentration, network structure, and macroeconomic links. Future research should expand coverage to smaller firms and additional digital sectors (e.g., new AI services), develop longer time series, incorporate more sector-specific bilateral consumption data (especially B2B categories), refine origin assignment and parent–subsidiary flow tracking, and investigate the environmental impacts of digital infrastructure and services using broader system boundaries and counterfactual comparisons.
- Coverage bias and lower-bound estimates: The dataset focuses on large firms (revenues ≥ USD 1B) and selected sectors, excluding many SMEs and some emerging digital categories (e.g., AI chatbots), leading to conservative (lower-bound) trade volume estimates and potentially overstated growth/concentration due to frontier-firm dynamics.
- Extrapolation from limited consumption data: Bilateral consumption observations exist only for apps and games; these consumer-oriented patterns are used to extrapolate to 29 additional sectors, including B2B areas (e.g., cloud computing), which may have different demand drivers.
- Allocation assumptions: Optimal transport prioritizes domestic allocation and assigns remaining flows to the geographically closest subsidiaries, which may not reflect actual digital delivery frictions; because digital delivery faces few physical constraints, actual flows may not follow gravity-like patterns.
- Origin assignment and tax residence: Two origin rules (headquarters vs. fiscal residence of subsidiaries) yield different geographies, and neither fully captures true production locations; tax planning and pass-through jurisdictions (e.g., Ireland, Luxembourg, Cayman Islands) can distort apparent export origins.
- Parent–subsidiary flows unobserved: Lack of transactional data between parents and subsidiaries prevents precise attribution of value creation and revenue booking across entities.
- Limited time span: Data cover only 2016–2021, constraining longitudinal analyses and assessments of structural change over longer horizons.
- National accounts comparability: EBOPS/ISIC do not distinguish digital vs. physical delivery channels, limiting direct comparability and mapping across statistical frameworks.