Women's online opinions are still not as influential as those of their male peers in buying decisions

Business

O. Fan-Osuala

This research by Onochie Fan-Osuala asks whether women's online product reviews hold as much sway as men's in shaping consumer decisions. Across an experiment and two field datasets, the findings reveal a bias against women's opinions, especially for male-gender-typed products, with implications for how influence is allocated online.
Introduction

The paper investigates whether gender-based differences exist in how people evaluate product opinions (online reviews) and whether evaluations favor a particular gender for gender-typed products. Prior work shows persistent gender inequality in speaking and opinion sharing, with women being interrupted more, speaking less in legislatures, and feeling their opinions are diminished. People frequently rely on others' opinions when making buying decisions, but it is unclear whether gender still influences whose opinions are valued. Given that gender is often used as a judgment heuristic and past studies show male-authored content and male-voiced information are rated more highly, the authors examine: (1) whether people are less likely to value a woman's opinion on a product relative to a man's; (2) whether evaluations favor the gender stereotypically associated with the product; and (3) whether any differences are driven by in-group bias, testing these questions across both search and experience goods. Online reviews provide an ideal empirical context given their widespread use in purchase decisions and prevalence in online shopping. The paper reports three studies: Study 1 (experiment), and Studies 2 and 3 (field data from Yelp and Amazon).

Literature Review

The introduction situates the study within literature documenting gendered disparities in voice and evaluation. Women are interrupted more, speak less in academic and legislative contexts, and often believe their opinions are devalued. Studies indicate preferential evaluations for male-associated sources: male authors and professors receive higher ratings; identical entrepreneurial pitches and lectures are rated higher when attributed to men; and male-voiced computer outputs are judged more credible and influential. The paper also notes the use of gender as a heuristic and implicit biases that may shape evaluations, suggesting such biases could extend to online product opinions. This literature motivates testing for gender-based differences in perceived helpfulness and influence of reviews, including within gender-typed product categories.

Methodology

Study 1 (Experiment): Conducted on Amazon Mechanical Turk with IRB approval. N=216 adults (108 men, 108 women), Mean age=40.6 (SD=10.8), all US-based with online shopping experience (Mean=12.6 years). Stimuli were review-like displays with product image, a gendered avatar and first name (e.g., Mary/Grace vs. Richard/William) to signal contributor gender; pretest confirmed perfect gender inference (Fleiss Kappa=1.00). Products were selected via pretest (12 participants, 7-point scale) to represent gender-typed and neutral: toothbrush (neutral), baby care kit (female-typed), tool kit (male-typed). A 2 (valence: positive vs. negative) x 2 (contributor gender: women vs. men) design was used; valence included to rule out valence-driven effects. Two filler reviews (orange juice, flip-flops) with less-gendered names were included. Procedure: Each participant saw five reviews (positions 2 and 4 were fillers) all from one treatment group (e.g., positive-by-women), with varying names to avoid single-author impression. After each review, participants rated helpfulness and likelihood of influencing purchase on 9-point Likert scales; helpfulness scale (3 items) had α=0.96. Analysis: Repeated-measures ANOVA with Huynh-Feldt correction tested main effects and product-type interactions; data were split by participant gender to assess in-group bias.
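The within-subject design above (each participant rating reviews from one treatment group, analyzed with repeated-measures ANOVA) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' dataset: participant counts mirror Study 1, but the column names, effect size, and noise levels are invented for demonstration, and `AnovaRM` in statsmodels does not itself apply the Huynh-Feldt correction the paper reports (packages such as pingouin provide sphericity corrections).

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Synthetic stand-in for Study 1: 216 participants, each rating helpfulness
# of reviews attributed to a woman vs. a man (within-subject factor).
# The -0.5 offset loosely mimics the direction of the reported gap;
# all numbers here are illustrative, not from the paper's data.
rng = np.random.default_rng(0)
rows = []
for pid in range(216):
    base = rng.normal(5.8, 2.0)  # participant-level baseline
    rows.append({"participant": pid, "contributor": "woman",
                 "helpfulness": base - 0.5 + rng.normal(0.0, 0.5)})
    rows.append({"participant": pid, "contributor": "man",
                 "helpfulness": base + rng.normal(0.0, 0.5)})
df = pd.DataFrame(rows)

# One within-subject factor; balanced data is required by AnovaRM.
res = AnovaRM(df, depvar="helpfulness", subject="participant",
              within=["contributor"]).fit()
print(res.anova_table)
```

Because each participant contributes a rating in every cell, person-level differences in leniency are absorbed, which is why the repeated-measures design detects the contributor-gender effect efficiently.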

Study 2 (Field—Yelp, experience goods/services): Collected reviews in the Nightlife category from a major Southeastern US city, January–May 2015. Total 7,626 reviews by 3,854 unique users. Variables included contributor name, number of “useful” votes (DV), star rating, contributor status (elite), number of friends, number of reviews, presence of picture, check-in, review length, and time. Contributor gender inferred via first-name-based machine learning (genderizeR), retaining only names with ≥99% gender-probability; 2,399 reviews remained (men=902, women=1,497). A matched sample pairing each woman-authored review with a similar man-authored review was also created (N=1,804). Analysis: Negative binomial regression due to count DV and over-dispersion; models estimated without and with controls, and on full and matched samples; robust SEs.

Study 3 (Field—Amazon, search goods): Collected reviews from Amazon’s Beauty (women-typed) and Home-Improvement (men-typed) categories from 2014-01-01 to 2014-02-28. Initial N=15,948 (beauty=8,458; home-improvement=7,490). Gender inferred as in Study 2 (≥99% probability). Reviews with zero votes were dropped (as in prior work), yielding N=3,262 reviews (Beauty=1,759; Home-Improvement=1,503). Gender splits: total women=1,639; men=1,623; Beauty women=1,280, men=479; Home-Improvement women=359, men=1,144. DV helpfulness measured as proportion helpful votes/total votes. Controls included rating, title word count, review length, total votes, and time. Analysis: Binomial regression with logit link on full sample and split by product category; robust SEs and additional robustness checks reported in Supplementary Info.

Key Findings

Study 1 (Experiment): Reviews attributed to women were rated significantly lower in helpfulness than those attributed to men: mean_women=5.53 (SD=2.25) vs. mean_men=6.06 (SD=2.10), F(1,214)=5.14, p<0.05, ηp²=0.02. Likelihood to influence purchase was also lower for women-attributed reviews: mean_women=5.29 (SD=2.63) vs. mean_men=5.89 (SD=2.38), F(1,214)=4.23, p<0.05, ηp²=0.01. For male-typed product (tool kit): helpfulness higher for men vs. women, mean_men=6.98 (SD=1.89) vs. mean_women=6.39 (SD=1.71), F=5.78, p<0.05, ηp²=0.03; purchase likelihood marginally higher for men, mean_men=6.58 (SD=2.31) vs. mean_women=6.04 (SD=2.42), F=2.84, p=0.09. For female-typed product (baby care kit): helpfulness not significantly different, mean_men=5.40 (SD=2.35) vs. mean_women=4.88 (SD=2.32), F=2.72, p>0.1; purchase likelihood marginal, mean_men=5.52 (SD=2.39) vs. mean_women=4.93 (SD=2.65), F=3.01, p=0.08. Split by participant gender: Men showed no significant differences (helpfulness F=1.46, p=0.23; purchase F=0.24, p=0.62). Women rated women-attributed reviews lower than men-attributed for helpfulness (F=4.09, p<0.05, η²=0.03) and purchase likelihood (F=5.86, p<0.05, η²=0.04). Overall, no evidence of in-group bias; effects were driven by female participants.

Study 2 (Yelp): Women-authored reviews received significantly fewer “useful” votes. Coefficient on Women was negative and significant without controls (β=-0.2297, p<0.001), in matched sample (β=-0.1805, p<0.05), and with controls for full (β=-0.2040, p<0.001) and matched samples (β=-0.1506, p<0.01). On average, women-authored reviews received about 0.79 fewer useful votes than men-authored reviews.
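Because the negative binomial model uses a log link, the reported coefficients can be read as incidence-rate ratios after exponentiation. A quick check using the full-sample-with-controls estimate reported above (the interpretation step is standard; the specific percentage below is derived arithmetic, not a figure quoted from the paper):

```python
import math

# exp(beta) gives the incidence-rate ratio (IRR) for women- vs. men-authored
# reviews under a log-link count model.
beta = -0.2040  # full sample, with controls (reported above)
irr = math.exp(beta)
print(f"IRR = {irr:.3f}")
print(f"~{(1 - irr) * 100:.1f}% fewer useful votes, holding controls fixed")
```

This multiplicative reading (roughly 18% fewer votes per review) is consistent in direction with the paper's average gap of about 0.79 useful votes.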

Study 3 (Amazon): Helpfulness proportion (yes/total) regressions showed a negative effect of Women overall: without controls β=-0.3483 (p<0.001); with controls β=-0.3414 (p<0.001). By product category: Home-Improvement (men-typed) showed a significant negative Women coefficient (β=-0.4127, p<0.01); Beauty (women-typed) showed no significant difference (β=-0.1586, p>0.05). Controls behaved as expected (e.g., higher ratings and longer reviews associated with greater helpfulness). Results align with Study 1: men’s reviews are favored for male-typed products; no advantage for women in female-typed products.

Discussion

The findings directly address the research questions: across an experiment and two field datasets, women-attributed online product opinions are evaluated as less helpful and less influential than men’s, evidencing implicit gender bias in how opinions are weighed in buying decisions. The pattern extends to experience goods (Yelp) and search goods (Amazon). For gender-typed products, men’s reviews are favored for male-typed categories, while in female-typed categories women’s reviews are not rated higher than men’s, indicating that women’s opinions are not afforded greater weight even in domains stereotypically associated with women. The absence of in-group favoritism and the effect being more pronounced among female participants in Study 1 suggest complex internalized or societal bias mechanisms. Potential explanations discussed include persistent gender stereotypes about competence and analytic ability, subtle preexisting biases leading to discounting women’s opinions as more emotional, and underrepresentation of women in visible expert roles. These biases have practical implications for platforms and marketplaces where review visibility and rewards are tied to perceived value, potentially limiting women’s visibility and benefits.

Conclusion

Across three studies using experimental and archival methods, the paper documents implicit gender bias in evaluations of online product opinions: women's reviews are rated as less helpful and less influential, with the effect robust across contexts and particularly evident for male-typed products; no preferential valuation of women's reviews is observed for female-typed products. These results underscore that gender continues to predict how opinions are evaluated in buying decisions. The paper highlights implications for platforms that reward reviews based on perceived value and calls for interventions that counter gender stereotypes in order to reduce such biases.

Limitations
  • Geographic and platform scope: All studies are based in the United States; Study 2 uses Yelp reviews from a single major Southeastern city’s nightlife category; Study 3 focuses on Amazon’s Beauty and Home-Improvement categories within a two-month window in 2014, which may limit generalizability across regions, time periods, and product domains.
  • Gender inference: Contributor gender was inferred from first names using a machine-learning tool with a ≥99% probability threshold; while validated on a subsample, this approach may exclude ambiguous names and reduce sample size (Study 2 retained 31.5% of reviews), potentially introducing selection effects.
  • Voting-based outcomes: Field study dependent variables rely on platform voting (useful/helpful), which can be influenced by unobserved platform dynamics, audience composition, and exposure biases; reviews with zero votes were excluded in Study 3, potentially biasing toward more-visible reviews.
  • Experimental stimuli: In Study 1, gender was signaled via avatars and first names, which, while pretested, may carry additional cues beyond gender; only three products (one neutral, one female-typed, one male-typed) were used, limiting product generality.
  • Data availability: Study 1 data are unavailable due to participant privacy, constraining external replication with the exact dataset.