Comparing content marketing strategies of digital brands using machine learning

Y. Chen

This study, conducted by Yulin Chen, uses machine learning to examine the content marketing strategies of major digital brands such as Disney and Netflix. It offers insights into public participation dynamics, including likes and shares, that can inform a brand's marketing approach.
Introduction

The study examines how digital brands can leverage social media content to enhance engagement and competitive intelligence. Motivated by the growth and complexity of social media data, it proposes an integrated framework that combines unstructured text analysis and ensemble machine learning to extract key cues from brand fan page posts. The research focuses on six globally known digital brands (Disney, Netflix, Microsoft, Google, YouTube, LinkedIn) and addresses two questions: (A) Do fan pages of digital product versus digital platform brands contain key cues in posts that improve brand image and marketing promotions? (B) Can public preferences for brand content be identified via interactive data to analyze and predict competitor behaviors? The study positions likes, comments, and shares as indicators of popularity, comment engagement, and virality, respectively, to quantify public behavioral responses.

Literature Review

The literature highlights that social media text carries rich, unstructured signals (including images, videos, and emoticons) that add value for text mining. Compared with structured data mining, text mining requires preprocessing to convert irregular, unstructured text into analyzable forms. Prior competitive analyses on social media have used statistical analysis, content analysis, text mining, sentiment analysis, clustering, and association analysis to identify trends and inform decision-making. Engagement on platforms like Facebook and Instagram is commonly operationalized via likes (popularity), comments (comment engagement), and shares (virality). Content typologies often distinguish rational/informational content, interactive/community content, and marketing/reward content, each affecting engagement differently. Research also differentiates passive (low) versus active (high) engagement, noting that likes can reflect emotional responses, comments indicate deliberation, and shares reflect recognition or diffusion. Studies find that entertainment and visually appealing content often boost likes and shares, while the effects on comments can be mixed. Competitive monitoring of both own and rival brand pages can yield insights into strengths, weaknesses, public attitudes, and market trends, supporting more effective social media strategies.

Methodology

Design: An integrated framework for social media monitoring and data exploration that combines unstructured text analysis, text mining, and ensemble learning (Random Forest, AdaBoost, XGBoost). The study maps engagement on Facebook via likes (popularity), comments (comment engagement), and shares (virality).
Sample: Six global digital brands, split into digital product brands (Disney, Netflix, Microsoft) and digital platform brands (Google, YouTube, LinkedIn).
Data Collection: The Facebook Graph API was used to collect official fan page posts and interaction metrics (type, time, likes, comments, shares).
Timeframe: January 1, 2011 to December 31, 2020.
Total posts: 29,343 (digital products: 14,028; digital platforms: 15,315).
Preprocessing: Removal of links, usernames, and special characters; tokenization; elimination of stop words and repetitions; semantic similarity analysis to cluster topics and discard unrelated words; selection and verification of the first 50 high-importance keywords per brand.
Labels: Interactions were dichotomized into active (above-average interaction) vs. passive (below-average interaction); tokens were labeled as positive (1) or passive (0).
Models: Three classifiers (Random Forest, XGBoost, AdaBoost) were trained on the tokenized datasets; cross-validation was performed individually and in ensemble to avoid overfitting and optimize parameters; evaluation used TP/TN/FP/FN counts and statistical validation.
Reliability/Validity: Exploratory factor analysis was run on the keywords. KMO and Cronbach’s alpha are reported for digital product keywords (KMO=0.849; total variance explained 80.292%; α=0.883) and digital platform keywords (KMO=0.884; total variance explained 77.545%; α=0.881), indicating good reliability and convergent/discriminant validity.
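
The preprocessing, labeling, and evaluation steps listed above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the stop-word list, and the sample inputs are hypothetical, and "above-average" is assumed to mean strictly greater than the mean. The sketch also drops links entirely, whereas the reported cues include "http", so the study's own link handling evidently differed.

```python
import re
from statistics import mean

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in", "on", "for"}


def clean_and_tokenize(post):
    """Strip links, usernames and special characters, then tokenize."""
    text = re.sub(r"https?://\S+", " ", post)   # remove links
    text = re.sub(r"@\w+", " ", text)           # remove usernames
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # remove special characters and digits
    tokens = text.lower().split()
    seen = set()
    result = []
    for tok in tokens:
        # Drop stop words and repeated tokens, keeping first occurrences in order.
        if tok in STOP_WORDS or tok in seen:
            continue
        seen.add(tok)
        result.append(tok)
    return result


def label_active(counts):
    """Label each interaction count 1 (active) if above the mean, else 0 (passive)."""
    avg = mean(counts)
    return [1 if c > avg else 0 for c in counts]


def confusion_counts(y_true, y_pred):
    """Count TP/TN/FP/FN for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return {
        "TP": sum(1 for t, p in pairs if t == 1 and p == 1),
        "TN": sum(1 for t, p in pairs if t == 0 and p == 0),
        "FP": sum(1 for t, p in pairs if t == 0 and p == 1),
        "FN": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


if __name__ == "__main__":
    # Illustrative inputs only; these are not posts or counts from the dataset.
    print(clean_and_tokenize("Watch the NEW trailer now! https://example.com @Disney #trailer"))
    print(label_active([10, 200, 35, 500]))
    print(confusion_counts([0, 1, 0, 1], [0, 1, 1, 0]))
```

In a fuller pipeline the token lists would be vectorized and passed to the three classifiers, with the confusion counts computed per cross-validation fold.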

Key Findings

Hypotheses outcomes (likes=popularity; comments=comment engagement; shares=virality):
• H1a (platforms, popularity): Partially established.
• H1b (products, popularity): Established.
• H2a (platforms, comment engagement): Partially established.
• H2b (products, comment engagement): Partially established.
• H3a (platforms, virality): Not established.
• H3b (products, virality): Established.

Statistical highlights:
Digital product brands:
• Likes: significant for Random Forest (R=0.017; F-change=4.423; β=0.017; t=−2.103; p=0.035).
• Comments: significant for Random Forest (R=0.020; F-change=6.206; β=0.020; t=−2.491; p=0.013).
• Shares: not significant across models.
Digital platform brands:
• Likes: significant across RF (R=0.161; F-change=372.927; β=0.161; t=−19.311; p<0.001), XGBoost (R=0.156; F-change=348.462; β=0.156; t=−18.667; p<0.001), and AdaBoost (R=0.099; F-change=138.411; β=0.099; t=−11.765; p<0.001).
• Comments: significant for XGBoost (R=0.021; F-change=60.038; β=0.021; t=−2.457; p=0.014) and AdaBoost (R=0.025; F-change=8.641; β=0.025; t=−2.940; p=0.003); not for RF.
• Shares: significant for RF (R=0.040; F-change=22.791; β=0.040; t=−4.774; p<0.001), XGBoost (R=0.055; F-change=42.639; β=0.055; t=−6.530; p<0.001), and AdaBoost (R=0.047; F-change=31.517; β=0.047; t=−5.614; p<0.001).

Key cue insights by brand:
• Disney (product): Cues influencing likes/shares across models include Disney, trailer, new/now, http, theatre; AdaBoost highlights Disneyland, Pixar, story, history. Emphasize updates on movies and park-related content.
• Netflix (product): “Netflix has it,” here, today, available, watch; AdaBoost: photo, http, October. Real-time and original content cues boost engagement and sharing.
• Microsoft (product): XGBoost: http, Windows, Lumia, new, camera; AdaBoost: RSVP, party, Store, live, capture/smartphone. Exclusive product info and practical/functional cues increase shares.
• YouTube (platform): MV, video, http, YouTube; AdaBoost: dance, music, best. Media/video-centric and culturally resonant keywords drive participation.
• LinkedIn (platform): headlines, http, LinkedIn, work, career/job, opportunity, week/most. Work/opportunity-oriented cues stimulate comments/shares.
• Google (platform): No significant results reported.

Strategic modules identified:
• Notification and diversion: Emphasizes updates, offers/discounts to attract attention (exemplified by Disney).
• Interaction and diversion: Encourages comments/shares for self-expression and diffusion (exemplified by YouTube).
• Notification, interaction, and diversion: Combines likes, comments, and shares to build consistent brand awareness (exemplified by Netflix, Microsoft, LinkedIn).
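
A quick arithmetic check can be run on the reported statistics. When a regression step adds a single predictor, the F-change statistic equals the square of that predictor's t value. Whether the study's tests have this form is an assumption, and the threshold and helper name below are illustrative. The sketch recomputes t² for a handful of the reported (t, F-change) pairs and flags any pair that disagrees by more than 1%.

```python
# (label, t, F-change) pairs copied from the statistical highlights above.
reported = [
    ("likes, XGBoost", -18.667, 348.462),
    ("likes, AdaBoost", -11.765, 138.411),
    ("comments, XGBoost", -2.457, 60.038),
    ("comments, AdaBoost", -2.940, 8.641),
    ("shares, RF", -4.774, 22.791),
]


def relative_gap(t, f_change):
    """Relative difference between t squared and the reported F-change."""
    return abs(t * t - f_change) / f_change


# Collect the labels whose t squared and F-change disagree by more than 1%.
flagged = [label for label, t, f in reported if relative_gap(t, f) > 0.01]

for label, t, f in reported:
    status = "check" if label in flagged else "ok"
    print(f"{label}: t^2={t * t:.3f}, F-change={f:.3f} -> {status}")
```

Under that assumption, only the XGBoost comments pair is flagged: t=−2.457 gives t²≈6.04, not 60.038, so that value may be a transcription slip and is worth checking against the original paper.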

Discussion

Findings demonstrate that machine-recommended, image/keyword cues within official brand posts can measurably influence public engagement behaviors across likes, comments, and shares. Comparing digital product and platform brands reveals nuanced differences: product brands more consistently benefit in popularity and virality when employing such cues, while platforms show mixed results, particularly for virality. The identification of three operational content modules provides a practical lens for aligning content strategy with desired engagement outcomes. By continuously mining and benchmarking own and competitors’ posts, brands can detect public preferences, refine content blueprints, and respond promptly to trends, strengthening competitive positioning and brand image on social media.

Conclusion

The study contributes an integrated, AI-driven framework combining text mining and ensemble learning to extract key engagement cues from large-scale Facebook fan page data. It validates that specific image/keyword cues can enhance popularity, comment engagement, and virality for many digital brands and delineates three strategic content modules guiding brand page operations. The approach enables ongoing social monitoring, competitor analysis, and content optimization. Future research should extend datasets temporally and across regions/languages, include additional platforms, and enhance sampling strategies. Strengthening unstructured information processing and establishing standardized, culture-sensitive engagement indicators will further improve the generalizability and practical utility of social media content analyses.

Limitations

Data access is constrained by Facebook API time windows and may be influenced by popular topics during the sampled period. The sample focuses on well-known brands with predominantly English content, limiting generalizability across regions/languages. Additional platforms beyond Facebook should be analyzed for broader applicability. Data imbalance may persist; applying oversampling/undersampling and alternative sampling strategies can improve model robustness. Manual validation steps, while valuable, may introduce subjectivity and scalability challenges.
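
The oversampling remedy mentioned above can be sketched as follows. This is a minimal illustration of random oversampling, not a method the study applied; the function name, seed, and sample data are hypothetical. Minority-class items are duplicated at random until every class matches the size of the largest one.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    # Illustrative data: four passive posts (0) and one active post (1).
    posts = ["p1", "p2", "p3", "p4", "p5"]
    labels = [0, 0, 0, 0, 1]
    xs, ys = random_oversample(posts, labels)
    print(ys.count(0), ys.count(1))
```

Oversampling should be applied to training folds only, so that duplicated posts do not leak into the held-out data used for evaluation.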
