How to improve SME performance using iterative random forest in the empirical analysis of institutional complementarity

A. Sannabe

This research by Atsushi Sannabe examines how institutional complementarity relates to the performance of SMEs. It applies machine learning to management quality indicators to identify which practices, and which combinations of practices, are most strongly associated with profitability, growth, and viability.

Introduction
The study addresses the difficulty of empirically identifying institutional complementarities (interactions among multiple organizational practices that jointly influence performance) with traditional econometric methods. Prior theory suggests that complementarities can involve higher-order interactions among several practices. These are hard to pre-specify and estimate in a regression because the number of candidate interaction terms grows combinatorially and the terms are highly collinear. Decision-tree-based machine learning, particularly iterative random forests (iRF), can recover stable, higher-order interactions without pre-specifying them. Using World Management Survey (WMS) data, the study aims to visualize and empirically uncover how combinations of management practices relate to superior firm outcomes (profitability, growth, and viability) in SMEs, going beyond what conventional regression can capture.
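To make the combinatorial problem concrete, the short sketch below counts how many candidate interaction terms exist among the 18 practice indicators used in the study. The function names are illustrative and are not taken from the paper; only the figure of 18 indicators comes from the source.

```python
from math import comb

N_PRACTICES = 18  # number of WMS practice indicators used in the study


def n_interactions(n: int, order: int) -> int:
    """Number of distinct candidate interactions of a given order among n practices."""
    return comb(n, order)


def n_interactions_total(n: int) -> int:
    """All candidate interactions of order 2 or higher.

    Every subset of the n practices with at least two members is a candidate,
    so we subtract the empty set and the n singletons from the 2**n subsets.
    """
    return 2 ** n - n - 1


for order in (2, 3, 4):
    print(f"order {order}: {n_interactions(N_PRACTICES, order)} candidate terms")
print(f"order >= 2: {n_interactions_total(N_PRACTICES)} candidate terms")
```

With 18 indicators there are already 153 pairwise and 816 three-way candidates, which illustrates why specifying every term in advance in a regression quickly becomes impractical.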
Literature Review
The paper discusses the theoretical foundations and empirical challenges of institutional complementarity, drawing on economics and management literature. Brynjolfsson and Milgrom (2013) and Roberts (2007) emphasize that complementarities often involve multi-factor interactions, complicating empirical tests. Boon et al. (2019) review HRM systems and their measurement. Athey and Stern (1998) detail challenges in testing complementarity. Basu et al. (2018) introduce iRF to identify stable high-order interactions in complex systems (e.g., gene regulation), illustrating its applicability beyond simple pairwise effects. Subsequent discussion connects findings to the literature on High Involvement Work Systems (HIWS), with Burdin and Kato (2021) arguing that high-performing firms cluster multiple complementary practices around opportunities, incentives, and abilities/skills. The paper positions iRF as a tool to detect such clusters in organizational settings.
Methodology
Data: World Management Survey (WMS) as in Bloom et al. (2012), focusing on SMEs. After excluding observations with missing values, N = 6,339 firms.
Target variables: three binary targets were constructed.
(1) Profitability: ROCE_1 = 1 if Return on Capital Employed (ROCE) is ≥ mean + 1 SD, else 0.
(2) Growth: D5SALES_1 = 1 if 5-year sales growth is ≥ mean + 1 SD, else 0.
(3) Viability: DEAD = 1 if the firm was liquidated/bankrupt, else 0.
Predictors: 18 management practice dimensions (lean1–lean2; perf1–perf10; talent1–talent6) covering operations/lean, performance monitoring and targets, and talent management.
Analysis pipeline:
- Random Forest (RF): Implemented in R using randomForest v4.6-14. For each target (ROCE_1, D5SALES_1, DEAD), RF models were trained on the 18 indicators. The mtry parameter was tuned via tuneRF to minimize the out-of-bag (OOB) error. Estimates were stable with 100 or more trees, and the default number of trees was used. Feature importances (Gini importance) were computed and ranked.
- Iterative Random Forest (iRF): Implemented in R using iRF v2.0.0. Parameters: cutoff.unimp.feature = 0.3 (retain features in the top 70% by importance for interaction discovery) and n.bootstrap = 20 (bootstrap replicates used to compute stability scores of interactions). Other settings were left at their defaults, given tuning outcomes similar to those of RF. iRF repeats weighted RF training, then applies a random intersection trees step to identify stably co-occurring feature sets (interactions). The top 63 interactions by stability were reported for each target.
Variables (abbreviated descriptions):
- lean1–lean2: modern manufacturing techniques, rationale.
- perf1–perf10: process documentation, performance tracking/review/dialogue, consequence management, target balance/interconnection/time horizon/stretching, performance clarity.
- talent1–talent6: managing, rewarding, removing poor performers, promoting, attracting, retaining human capital.
Descriptives: ROCE mean 15.52 (SD 15.54); D5SALES mean 0.39 (SD 0.68); DEAD mean 0.02.
Firm size measures (log sales, log employees) and country/industry distributions are provided for context.
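The "mean + 1 SD" rule used for ROCE_1 and D5SALES_1 can be sketched as follows. This is an illustration in Python, not the author's R code. The helper name and the toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, stdev


def high_performer_flags(values):
    """Return (flags, threshold), where flag = 1 if value >= mean + 1 SD, else 0.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    threshold = mean(values) + stdev(values)
    flags = [1 if v >= threshold else 0 for v in values]
    return flags, threshold


# Hypothetical ROCE values for six firms (made up, not WMS data).
toy_roce = [5.0, 10.0, 12.0, 15.0, 18.0, 40.0]
flags, threshold = high_performer_flags(toy_roce)
print(f"threshold = {threshold:.2f}")
print("flags =", flags)
```

If the reported descriptives refer to the analysis sample, the same rule would put the ROCE cut-off at roughly 15.52 + 15.54 = 31.06 and the 5-year sales growth cut-off at roughly 0.39 + 0.68 = 1.07.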
Key Findings
Random Forest results (feature importance, top items recurrent across targets):
- Across ROCE_1, D5SALES_1, and DEAD, the top indicators were notably talent3 (removing poor performers), perf10 (performance clarity), talent6 (retaining human capital), perf8 (target time horizon), and talent2 (rewarding high performance).
- OOB error rates: ROCE_1 = 15.54%; D5SALES_1 = 9.88%; DEAD = 1.88%.
Iterative Random Forest (iRF) interactions (stability scores indicating frequency of stable co-occurrence across bootstraps):
- ROCE_1: Highest-stability pairs included perf8_talent2 (1.00), perf8_talent3 (1.00), and talent2_talent3 (1.00), as well as perf8_perf9 (0.95), perf8_perf10 (0.95), and perf9_talent3 (0.95), along with broader clusters linking perf8 (long-term perspective) with performance measurement and talent practices.
- D5SALES_1: Strong complementarities centered on talent2 (rewarding high performance) with other practices: perf9_talent2 (1.00), talent2_talent3 (1.00), perf8_talent2 (1.00), talent2_talent6 (1.00), and talent2_talent4 (1.00), plus interactions tying performance measurement (e.g., perf5/perf10) with talent policies.
- DEAD: Stability was highest for combinations integrating long-term target orientation and performance clarity: perf8_perf10 (1.00), with additional complementarities involving perf8 and talent/performance tracking (e.g., perf8_talent3 = 0.90; perf8_talent6 = 0.80; perf2_perf8 = 0.75).
Synthesis: The ability to set short-term goals aligned with long-term objectives (perf8) is a central complementary node, especially for profitability (ROCE_1) and viability (DEAD). Rewarding high performers (talent2), removing/retraining poor performers (talent3), retaining human capital (talent6), and clear performance measures (perf10) are consistently important and often appear in stable interactions. For growth (D5SALES_1), rewarding high performers (talent2) is particularly pivotal and complementary to multiple performance management dimensions.
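The stability scores reported above can be read as the share of bootstrap replicates in which an interaction was recovered. With n.bootstrap = 20, scores move in steps of 0.05, which matches values such as 0.95, 0.90, and 0.75. The sketch below shows only this counting step, in Python. It is not the iRF package's implementation, the function name is hypothetical, and the replicate data are made up for illustration rather than taken from the study.

```python
from collections import Counter


def stability_scores(bootstrap_interactions):
    """Fraction of bootstrap replicates in which each interaction was recovered.

    bootstrap_interactions: one iterable per replicate, each holding the
    interactions (frozensets of feature names) found in that replicate.
    """
    n_replicates = len(bootstrap_interactions)
    counts = Counter()
    for found in bootstrap_interactions:
        counts.update(set(found))  # count each interaction at most once per replicate
    return {interaction: c / n_replicates for interaction, c in counts.items()}


# Made-up replicate results (NOT the study's data), 20 replicates in total.
A = frozenset({"perf8", "talent2"})  # recovered in all 20 replicates
B = frozenset({"perf8", "perf9"})    # recovered in 19 of 20
C = frozenset({"perf2", "perf8"})    # recovered in 15 of 20
replicates = []
for i in range(20):
    found = [A]
    if i < 19:
        found.append(B)
    if i < 15:
        found.append(C)
    replicates.append(found)

scores = stability_scores(replicates)
for interaction, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print("_".join(sorted(interaction)), f"{score:.2f}")
```

Representing an interaction as a frozenset makes the order of features irrelevant, so perf8_talent2 and talent2_perf8 count as the same interaction.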
Discussion
The findings substantiate the presence of institutional complementarities among management practices in SMEs. Machine learning uncovers higher-order interactions that align with theory: performance management systems (clear metrics, target setting with long-term orientation) complement talent management practices (rewarding high performers, addressing poor performance, retention). These clusters mirror High Involvement Work Systems characterized by opportunities, incentives, and abilities/skills (Burdin & Kato, 2021). The centrality of perf8 indicates that designing short-term targets within a long-term framework enhances the effectiveness of incentive systems and performance measurement, with implications for profitability and survivability. For growth, incentive alignment (talent2) appears especially consequential, working in tandem with performance tracking and clarity. Overall, iRF provides an interpretable map of stable, multi-practice complementarities, advancing empirical analysis beyond traditional regression approaches.
Conclusion
This study demonstrates that iterative random forests can effectively detect and visualize stable, higher-order complementarities among management practices using WMS data. Key practices associated with superior performance include: rewarding high performers (talent2), reassignment/retraining or removal of poor performers (talent3), clear performance criteria (perf10), retention of human capital (talent6), and especially setting short-term goals grounded in a long-term vision (perf8), which exhibits broad complementarities. These insights support theory on complementary organizational systems and provide practical guidance for prioritizing improvement efforts. Future research should address endogeneity and further validate whether the selected 18 indicators comprehensively capture institutional interdependencies, potentially extending to causal designs and broader organizational contexts.
Limitations
The study notes two main limitations: (1) Endogeneity: while field experiments (Bloom et al., 2013) support causal effects of management practices, endogeneity remains a concern in observational analyses. (2) Coverage of practices: although the 18 WMS indicators span diverse management areas, it remains to be assessed whether they fully capture the complex interrelationships across organizational institutions. Addressing these issues and expanding the scope of indicators are priorities for future work.