Exploring textual-visual strategies in internet-based light food advertising: a study of Taobao advertisements in China

Q. Yong and X. Rao

Explore the striking advertising strategies for light food products on Taobao, as revealed by researchers Qian Yong and Xiaoqin Rao. This study uncovers intriguing visual and textual techniques and highlights potential consumer misconceptions in the exciting world of digital marketing!
Introduction

The study situates light food advertising within the broader expansion of multimodal mass media and its influence on public perceptions of health and beauty. Prior research shows advertising affects perceptions of healthiness and diet in varied contexts. Against increasing attention to health (e.g., COVID-19, obesity) and sociocultural pressures tied to “thin culture,” China has seen a notable rise in “light food” marketed as healthy and slimming, especially on Taobao. The paper investigates how light food ads shape notions of health and desirability and whether advertised health claims align with nutritional content. Using a Textual-Visual Thematic Analysis (TVTA) framework, the study targets Taobao ads for meal-replacement-type light foods to examine both text and visuals. It is guided by three research questions: (RQ1) What salient visual features characterize Taobao light food ads? (RQ2) What predominant textual features appear in these ads? (RQ3) How do visual and textual elements interact to construct the image of light foods on Taobao?

Literature Review

The paper builds on multimodal discourse analysis (MDA) and its critical turn to Multimodal Critical Discourse Analysis (MCDA), emphasizing how semiotic resources convey ideology and power (Kress and van Leeuwen; Machin and Mayr; Hart; others). Visual Grammar (VG) by Kress and van Leeuwen offers tools to analyze representational, interactive, and compositional meanings but has been critiqued for limited insight into intentions. TVTA integrates textual and visual thematic analysis through systematic coding and theme development (Braun and Clarke), with applications across psychology, design feedback, and news analysis. While prior studies often examined text and visuals unevenly, TVTA seeks coordinated analysis. The authors note interpretive subjectivity as a limitation of TVTA and therefore combine qualitative and quantitative analyses (including social network analysis of co-occurrences) to enhance rigor.

Methodology

Targets: The study focuses on Chinese “light food” sold online, framed as low energy/calorie/fat, sugar-free, etc., promising satiety and nutrition while aiding weight control. Categories considered are meal-replacement products: chicken breast, whole wheat bread, meal-replacement shakes or porridge, and soba noodles.

Data collection and sampling: From Taobao, the top 10 selling items in each of five categories were collected using monthly sales rankings from June–September 2022. After removing brand duplicates across price variants and restricting to brands specializing in light food, 50 advertisements remained. The original 765 images were filtered to remove irrelevant content (e.g., shipping notices, nutrition facts tables), yielding 633 images, combined into 50 long images and numbered 1–50. OCR and manual transcription produced a 45,817-word Chinese corpus for textual analysis.

Analytic framework and tools: The study employs TVTA with Visual Grammar to code representational (narrative vs conceptual processes), interactive (contact, social distance, perspective, modality), and compositional (information value, salience, framing) meanings. Qualitative analysis interprets textual themes and visual arrangements; quantitative analysis provides descriptive statistics on frequencies/distributions. Social Network Analysis (SNA) visualizes keyword and visual-code co-occurrence matrices using Gephi 0.10.1. NVivo 12 Plus is used to code PDFs of the long images and to export co-occurrence data for Gephi visualization. Co-occurrence matrices depict relationships, where frequent co-appearances are represented by larger nodes and thicker edges.

Procedures: Steps included bulk downloading, selection and combination of images, OCR and standardized Chinese transcription, PDF conversion for VG coding, keyword extraction, computation of frequencies for VG categories, and SNA visualization of keyword and visual-code co-occurrence networks.
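
To make the co-occurrence step concrete, the following is a minimal Python sketch of how pairwise co-occurrence counts can be derived from coded units and written out as a weighted edge list. It is not the authors' workflow (the study exported co-occurrence data from NVivo 12 Plus); the function names `cooccurrence_counts` and `to_edge_csv` and the example units are hypothetical, and the `Source,Target,Weight` header is assumed here as the edge-list layout for a later import into Gephi.

```python
from collections import Counter
from itertools import combinations
import csv
import io


def cooccurrence_counts(coded_units):
    """Count how often each unordered pair of codes appears in the same unit."""
    pair_counts = Counter()
    for codes in coded_units:
        # Deduplicate within a unit so a pair is counted at most once per unit.
        for a, b in combinations(sorted(set(codes)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def to_edge_csv(pair_counts):
    """Serialize pair counts as a Source,Target,Weight edge list (CSV text)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(pair_counts.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()


# Hypothetical coded units (illustrative only, not data from the study).
example_units = [
    ["kcal", "calorie", "ingestion"],
    ["kcal", "calorie"],
    ["nutrition", "satiety", "protein"],
    ["kcal", "ingestion", "kcal"],
]

edges = cooccurrence_counts(example_units)
edge_csv = to_edge_csv(edges)
print(edge_csv)
```

Each unit here could stand for one long image or one text segment; what counts as a unit is a design choice that directly changes the edge weights, so it should be fixed before any network is drawn.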

Key Findings
  • Representational meaning: Conceptual representations dominate over narrative. Narrative processes: action (28.91% of 633 images), reactional (11.69%), verbal (2.21%). Conceptual processes: classification appears in 224 images (35.39%); analytical (83.25%) and symbolic (71.09%) are most frequent. Advertisements use classificational displays (e.g., flavors/options), analytical highlights (e.g., low GI, attributes), and symbolic credentials to objectify abstract qualities and promote sales.
  • Interactive meaning (Table 2): Offer act dominates (94.47%), with minimal demand (5.53%). Social distance favors medium shots (78.83%), then close-ups (17.69%), long shots (4.90%). Perspectives: frontal angle (63.35%) over oblique (37.28%); high angle (54.98%) and eye-to-eye (44.87%) are common; low angle is rare (0.95%). Modality: high (82.46%), middle (13.43%), low (3.32%). These choices create a sense of objective, detailed presentation while subtly involving viewers.
  • Compositional meaning (Table 3): Information value primarily top-bottom (90.21% of images), with left-right (23.70%) and center-margin (18.64%). Salience: color (95.42%) and size (91.31%) are principal emphasis devices; position (16.43%) and brightness (2.84%) are less used. Framing: color differentiation (88.31%), image outlines (52.29%), empty spaces (35.39%), lines (20.06%). A frequent left-right usage contrary to classic expectations (left=new/right=given) was observed; top-bottom and left-right align with portrait-screen reading habits.
  • Textual keywords (45,817-word corpus): Top items include kcal (222), calorie (210), nutrition (197), satiety (157), protein (136), content (134), diet (130), fiber (121), whole wheat (115), meal replacement (112), data (107), taste (98), 100g (95), ingestion (90), fat (88), addition (82), bread (82), China (77), minute (76), health (75).
  • Keyword co-occurrence clusters (Gephi SNA): Three clusters dominate: reducing calorie ingestion (e.g., kcal–calorie–ingestion triangle: co-occurrences 78, 42, 29 respectively; minute and apple used to concretize calorie narratives), nutrition and satiety (emphasis on content, dietary fiber, protein; satiety duration comparisons via hours), and citing data sources (authority references: Chinese Food Composition Table Standard Edition, 29 occurrences; Chinese Dietary Guidelines for Residents, 16 occurrences). Numbers and data convey scientific rigor even when references are small or caveated as “for reference only.”
  • Visual co-occurrence matrices: Analytical and symbolic processes co-occur densely with interactive and compositional cues (offer act, medium shot, frontal/high angle, high modality; top-bottom layout, size, color, outlines, color differentiation). This combination balances an ostensibly objective informational stance with persuasive salience and involvement.
  • Contradictions: Visual allure and portion depictions may conflict with textual health claims (e.g., dressings high in sugar/sodium not visually disclosed; exaggerated serving visuals vs small labeled servings; ambiguous terms like “low-fat,” “reduced-calorie”; nature/ethical imagery without concrete ingredient or sourcing details). Overall, there is a tension between claimed consumer freedom and subtle manipulation through visual-textual orchestration.
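
The percentages in the findings above are shares of the 633 coded images, so they can be converted back to approximate image counts. The short Python sketch below does this for the social-distance figures and checks the one pair the source states explicitly (224 images, 35.39%). The helper names are hypothetical, and the reading that a total above 100% reflects images carrying more than one code is an inference, not something the source states.

```python
TOTAL_IMAGES = 633  # image count reported in the source


def share_to_count(percent, total=TOTAL_IMAGES):
    """Convert a reported percentage of images into the nearest whole count."""
    return round(percent / 100 * total)


def count_to_share(count, total=TOTAL_IMAGES):
    """Convert an image count into a percentage rounded to two decimals."""
    return round(count / total * 100, 2)


# Social-distance shares as reported in the findings.
social_distance = {"medium shot": 78.83, "close-up": 17.69, "long shot": 4.90}

classification_share = count_to_share(224)
social_distance_counts = {k: share_to_count(v) for k, v in social_distance.items()}
social_distance_total = round(sum(social_distance.values()), 2)

print(classification_share)    # 35.39, matching the reported figure
print(social_distance_counts)  # approximate counts per category
print(social_distance_total)   # 101.42, i.e. slightly above 100%
```

Because the three shares add up to slightly more than 100%, the categories are probably not mutually exclusive per image, which is worth keeping in mind when comparing them.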
Discussion

Findings address RQ1 by showing dominant conceptual visuals (analytical and symbolic processes), medium shots, frontal/high angles, and high modality alongside salient compositional cues (top-bottom organization, bright colors, large sizes). For RQ2, textual content centers on calories, nutrition, protein, satiety, and data, with co-occurrence patterns that follow a persuasive logic: calorie reduction is possible, nutrition and satiety are the mechanisms, and authority citations supply credibility. For RQ3, the integrated TVTA and SNA reveal how visuals and text jointly construct an image of light foods as objective, data-driven, nutritious, and slimming, promising effortless weight loss. The visual grammar choices (offer act, comfortable social distance, engaging perspectives) and compositional salience (color, size, framing) subtly steer attention and interpretation while maintaining a veneer of neutrality. This interplay has implications for consumer perception: it may reinforce the equation of “thin culture” with health and normalize meticulous calorie counting. The results highlight the need for critical media literacy and more transparent nutritional communication in e-commerce food marketing.

Conclusion

The study contributes a combined textual-visual thematic analysis of Taobao light food advertising using Visual Grammar, corpus keywords, and co-occurrence SNA. It identifies a consistent multimodal strategy: advertisers foreground conceptual, data-tinged objectivity (analytical and symbolic processes, offer acts) while employing engaging camera work and strong salience (medium shots, frontal/high angles, high modality, bright colors, larger elements, structured layouts) to persuade. Textual narratives emphasize calorie reduction, nutrition, and satiety, often bolstered by selectively presented data and authority references. Together, these strategies idealize healthy slimming and subtly conflate thinness with health and self-improvement. Future research could broaden product scopes beyond meal replacements (e.g., beverages, snacks, energy products), incorporate larger and more representative brand samples, include more coders to enhance reliability, and examine additional semiotic modes (color dynamics, sound, video, interactivity) using interdisciplinary methods from psychology and communications.

Limitations

The study is limited by scope and potential subjectivity: only 50 advertisements (633 images) from meal-replacement categories were analyzed, drawn from a specific 2022 time window. Only one co-worker took part in the coding, which introduces interpretive subjectivity. The sampling may not capture all light food types or leading brands. Future work should expand to more categories (beverages, snacks, energy products), adopt more rigorous sampling (e.g., brands with large follower bases), involve more coders to improve reliability, and evaluate other semiotic modes (color, sound, video, interactivity) with tools and theories from related disciplines.
