Abstract or concrete? The effects of language style and service context on continuous usage intention for AI voice assistants

H. Lan, X. Tang, et al.

This research by Hai Lan, Xiaofei Tang, Yong Ye, and Huiqin Zhang examines how tailoring a voice assistant's language style to the service context shapes user experience. Users favor abstract language in hedonic (enjoyment-oriented) contexts and concrete language in utilitarian (practical) contexts, and processing fluency mediates this effect.

Introduction
The rapid maturation and commercialization of intelligent voice technology has led to widespread adoption of AI-powered voice assistants (VAs) such as Siri, Alexa, and local Chinese offerings (e.g., TmallGenie, Xiaoai). Despite this growth, gaps remain between AI capabilities and human assistance, and repetitive, mechanical VA responses can frustrate users and reduce engagement. This study asks whether a simple, implementable change (altering the language style used by VAs) can improve user experience and continuous usage intention across different service contexts. Prior research distinguishes utilitarian-dominant from hedonic-dominant contexts and shows that users value accuracy and responsiveness in utilitarian contexts but anthropomorphism and affinity in hedonic contexts. Building on this work, the authors investigate whether matching VA language style (concrete vs. abstract) to service context (utilitarian vs. hedonic) produces a congruity effect that improves evaluations and continuous usage intention. Grounded in construal-level theory and prior language style research, the authors propose the following hypotheses:
H1a: In hedonic-dominant contexts, users prefer abstract language (higher continuous usage intention).
H1b: In utilitarian-dominant contexts, users prefer concrete language (higher continuous usage intention).
H2: Processing fluency mediates the congruity effect between VA language style and service context on continuous usage intention.
A pilot study (to validate context classification) and three main studies test these hypotheses using online experiments and a field experiment with real VA voice/visual stimuli.
Literature Review
The literature review situates VAs as increasingly central in daily life for both hedonic (e.g., entertainment) and utilitarian (e.g., information search, transactions) purposes, with adoption driven by perceived usefulness, playfulness, convenience, and personification, but hindered by privacy concerns and behavioral barriers. Prior work emphasizes VA communication qualities (variety, tailoring) that heighten social presence and engagement, and examines language styles along warmth and competence dimensions. Drawing on the linguistic category model (LCM), the review highlights the abstract–concrete language continuum: concrete language conveys verifiable, specific, experiential details; abstract language conveys generalized, higher-level, sometimes affective or figurative descriptions. Marketing research shows that the effectiveness of abstract versus concrete language depends on receiver knowledge, goals, and context, and that congruence between construal level and frames (e.g., temporal, speed, psychological distance) enhances persuasion and evaluation. In AI-human interactions, context (hedonic vs. utilitarian) moderates preferences for AI versus human recommenders and for robot appearance (warmth vs. competence). The authors argue that because concrete language maps onto competence cues and abstract language onto warmth/anthropomorphism and higher-level construals, matching VA language style to the hedonic–utilitarian nature of the service context should increase processing fluency and, in turn, improve continuous usage intention.
Methodology
Overview: A pilot study and three experiments tested the congruity effect between VA language style (abstract vs. concrete) and service context (hedonic-dominant vs. utilitarian-dominant) on continuous usage intention, and examined processing fluency as a mediator.
Pilot Study: Purpose: Validate classification of four VA service contexts as hedonic-dominant or utilitarian-dominant. Design: 240 Credamo participants (40.8% male; M_age = 28.61) randomly assigned to music recommendation, movie recommendation, online shopping, or financial investment. Measure: Hedonic–utilitarian ratings (adapted from Liu et al., 2022; Voss et al., 2003). Results: Significant differences (F(3,236) = 149.75, p < 0.001). Music (M = 1.66) and movie (M = 0.47) were more hedonic than online shopping (M = −2.07) and financial investment (M = −2.45).
Study 1: Design: 2 (VA language style: abstract vs. concrete) × 2 (service context: hedonic vs. utilitarian) between-subjects. N = 380 Credamo participants (56.3% male; M_age = 30.24). Contexts: Hedonic: music recommendation; Utilitarian: online shopping. Manipulations: Scenario prompts (Appendix A) and VA dialog with abstract vs. concrete responses (Appendix B). Checks: Perceived service context and perceived language style (single item, 1 = more abstract, 7 = more concrete; Packard & Berger, 2021). Measures: Continuous usage intention (3-item, 7-point; Bhattacherjee, 2001). Analyses: 2×2 ANOVA; simple effects.
Study 2: Design: Same 2×2 structure with different contexts. N = 348 Credamo participants (50.9% male; M_age = 27.51). Contexts: Hedonic: movie recommendation; Utilitarian: financial investment. Measures: Continuous usage intention (as in Study 1); processing fluency (3 items, 7-point; e.g., ease of processing; Lee & Aaker, 2004). Analyses: 2×2 ANOVAs; moderated mediation with PROCESS Model 8 (Hayes, 2018), 5000 bootstraps. Coding: language (concrete = 0, abstract = 1), context (utilitarian = 0, hedonic = 1).
Study 3: Field experiment with real VA stimuli. Design: 2×2 between-subjects. N = 165 university students in Chengdu; after attention check, N = 161 (42.9% male; M_age = 19.37). Contexts: Music recommendation (hedonic) vs. online shopping (utilitarian). Stimuli: Videos of VA appearance/voice engineered to be more robotic to reduce anthropomorphism (Appendix D). Procedure: Participants issued verbal commands, then viewed pre-recorded VA responses in abstract vs. concrete language (Appendix B). Measures: Continuous usage intention; processing fluency (as in Study 2); alternative explanations (perceived accuracy, usefulness; adapted from Yuan et al., 2022; Davis, 1989). Analyses: 2×2 ANOVAs; moderated mediation (PROCESS Model 8, 5000 bootstraps); tests of alternative mediators.
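To make the 2×2 analysis concrete, the sketch below computes the language style × service context interaction F statistic for a balanced between-subjects design. It is a minimal illustration, not the authors' analysis: the function name `interaction_f`, the toy ratings, and the equal-cell-size requirement are all assumptions introduced here, and the real studies were run on the authors' own data.

```python
from statistics import mean


def interaction_f(cells):
    """Interaction F for a balanced 2x2 between-subjects design.

    `cells` maps (language, context) -> list of ratings. Every cell must
    hold the same number of ratings (a simplifying assumption of this sketch).
    Returns (F, error degrees of freedom); the interaction has 1 df.
    """
    languages = sorted({lang for lang, _ in cells})
    contexts = sorted({ctx for _, ctx in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("this sketch requires equal cell sizes")

    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean(cell_mean.values())
    lang_mean = {l: mean(cell_mean[(l, c)] for c in contexts) for l in languages}
    ctx_mean = {c: mean(cell_mean[(l, c)] for l in languages) for c in contexts}

    # Interaction sum of squares: what the cell means show beyond both main effects.
    ss_inter = n * sum(
        (cell_mean[(l, c)] - lang_mean[l] - ctx_mean[c] + grand) ** 2
        for l in languages
        for c in contexts
    )
    # Within-cell (error) sum of squares and its degrees of freedom.
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_error = n * len(cells) - len(cells)
    return ss_inter / (ss_error / df_error), df_error


# Toy ratings (invented for illustration) with a crossover pattern.
toy = {
    ("abstract", "hedonic"): [5, 6, 7],
    ("concrete", "hedonic"): [3, 4, 5],
    ("abstract", "utilitarian"): [3, 4, 5],
    ("concrete", "utilitarian"): [5, 6, 7],
}
f_value, df_error = interaction_f(toy)
print(f"F(1,{df_error}) = {f_value:.2f}")
```

In the toy data both main effects are zero and the whole pattern sits in the interaction, which is the shape a congruity effect takes in this design.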
Key Findings
Pilot: Validated context classification; music and movie recommendations are hedonic-dominant; online shopping and financial investment are utilitarian-dominant (F(3,236) = 149.75, p < 0.001).
Study 1 (music vs. shopping): Manipulation checks passed (hedonic vs. utilitarian: M_hedonic = 5.12 vs. M_utilitarian = 1.92, t(323) = 19.55, p < 0.001; language: M_concrete = 6.25 vs. M_abstract = 4.10, t(251) = 13.24, p < 0.001). Continuous usage intention showed a significant interaction between language style and context (F(1,376) = 26.49, p < 0.001) and a main effect of language (F(1,376) = 3.94, p < 0.05). Simple effects: Utilitarian: concrete > abstract (M = 5.81 vs. 5.01; F(1,376) = 25.42, p < 0.001). Hedonic: abstract > concrete (M = 5.56 vs. 5.20; F(1,376) = 5.00, p < 0.05).
Study 2 (movie vs. investment): Manipulations successful (context t(313) = 14.57, p < 0.001; language t(286) = 10.64, p < 0.001). Processing fluency: significant interaction only (F(1,344) = 19.83, p < 0.001); utilitarian: concrete > abstract (M = 5.83 vs. 5.27; F = 11.19, p < 0.01); hedonic: abstract > concrete (M = 5.78 vs. 5.27; F = 8.73, p < 0.01). Continuous usage intention: interaction (F(1,344) = 20.97, p < 0.001) and main effect of context (F(1,344) = 5.11, p < 0.05; M_hedonic = 5.54 > M_utilitarian = 5.28). Simple effects: Utilitarian: concrete > abstract (M = 5.53 vs. 5.03; F = 9.68, p < 0.01); Hedonic: abstract > concrete (M = 5.81 vs. 5.26; F = 11.31, p < 0.01). Moderated mediation (PROCESS Model 8): Index = 0.71, SE = 0.19, 95% CI [0.3688, 1.1008]; conditional indirects: hedonic β = 0.34, 95% CI [0.1164, 0.5855]; utilitarian β = −0.37, 95% CI [−0.6357, −0.1401].
Study 3 (field, music vs. shopping with real VA stimuli): Manipulation checks passed (context t(159) = 13.76, p < 0.001; language t(152) = 4.74, p < 0.001). Processing fluency: interaction (F(1,157) = 43.45, p < 0.001) and main effect of context (F(1,157) = 21.98, p < 0.05); utilitarian: concrete > abstract (M = 5.20 vs. 4.25; F = 17.12, p < 0.001); hedonic: abstract > concrete (M = 6.04 vs. 4.90; F = 27.23, p < 0.001). Continuous usage intention: interaction (F(1,157) = 75.27, p < 0.001) and main effect of context (F(1,157) = 26.81, p < 0.001); utilitarian: concrete > abstract (M = 5.04 vs. 3.95; F = 31.51, p < 0.001); hedonic: abstract > concrete (M = 5.80 vs. 4.57; F = 44.77, p < 0.01). Moderated mediation: Index = 0.76, SE = 0.21, 95% CI [0.37, 1.24]; conditional indirects: hedonic β = 0.42, 95% CI [0.20, 0.67]; utilitarian β = −0.34, 95% CI [−0.62, −0.12]. Alternative explanations: No condition differences in perceived response accuracy (M_concrete = 5.13 vs. M_abstract = 5.10, t(159) = 0.19, p = 0.48). Neither perceived accuracy nor usefulness mediated the language × context effect on continuous usage intention (accuracy indirect effect 95% CI includes 0; usefulness indirect effect 95% CI includes 0).
Overall: Across studies, a robust congruity effect emerges: abstract language is preferred in hedonic-dominant contexts, concrete language in utilitarian-dominant contexts; processing fluency mediates this effect.
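The reported moderated-mediation numbers can be cross-checked with simple arithmetic. With a dichotomous moderator coded 0/1 (here utilitarian = 0, hedonic = 1, as stated for Study 2 and assumed for Study 3), the index of moderated mediation equals the conditional indirect effect at 1 minus the conditional indirect effect at 0. The sketch below applies that identity to the reported values; the function name is hypothetical and the check says nothing about the bootstrap confidence intervals.

```python
def index_of_moderated_mediation(indirect_w1, indirect_w0):
    """Difference between the conditional indirect effects at W = 1 and W = 0."""
    return indirect_w1 - indirect_w0


# Values copied from the reported results (hedonic = 1, utilitarian = 0).
reported = {
    "Study 2": {"hedonic": 0.34, "utilitarian": -0.37, "index": 0.71},
    "Study 3": {"hedonic": 0.42, "utilitarian": -0.34, "index": 0.76},
}

for study, r in reported.items():
    implied = round(index_of_moderated_mediation(r["hedonic"], r["utilitarian"]), 2)
    status = "matches" if implied == r["index"] else "differs from"
    print(f"{study}: implied index {implied:.2f} {status} reported {r['index']:.2f}")
```

In both studies the conditional indirect effects have opposite signs, so the index is larger in magnitude than either effect alone: abstract language raises fluency-driven intention in hedonic contexts and lowers it in utilitarian ones.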
Discussion
The findings support the central hypothesis that matching VA language style to the hedonic or utilitarian nature of the service context enhances users’ continuous usage intention. This congruity increases processing fluency, which serves as the mechanism linking fit to favorable evaluations. By demonstrating the effect across multiple scenarios and in a field setting with real VA audio-visual stimuli, the results extend AI-human interaction research beyond broad warmth/competence or social/task-oriented language frames to a manipulable abstract–concrete continuum grounded in construal-level theory. The work clarifies that users facing utilitarian tasks value concrete, specific, numeric, and action-oriented VA responses (competence/low-level construal), whereas users engaging in hedonic tasks respond better to abstract, high-level, and affectively enriched language (warmth/high-level construal). The absence of mediation by perceived accuracy or usefulness and the robustness of the moderated mediation via processing fluency underscore the psychological-fit mechanism rather than simple performance perceptions. Practically, tailoring VA language dynamically to context can improve user experience and encourage continued use, offering a low-cost strategy to enhance engagement with current AI capabilities.
Conclusion
This research shows a consistent mindset-congruency effect in AI voice assistant interactions: abstract language enhances evaluations and continuous usage intention in hedonic-dominant contexts, while concrete language does so in utilitarian-dominant contexts. Processing fluency mediates this effect. Contributions include extending AI-human interaction literature with a construal-level language strategy, integrating service context characteristics with language style, and identifying processing fluency as the underlying mechanism. For practitioners, designing VA systems to detect or infer service context and dynamically switch between abstract and concrete language packages can enhance user engagement. Future research should test these effects with physical VAs in real-world settings, explore additional language styles (e.g., humor types), examine different voice characteristics (e.g., gender, celebrity), validate across broader linguistic/cultural contexts beyond China, and assess moderators such as social class or preferences for human versus AI recommenders.
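The practitioner recommendation (infer the service context, then switch language packages) could be sketched roughly as follows. Everything here is hypothetical: the enum names, the `TASK_CONTEXT` table, the `choose_style` function, and the concrete-language fallback for unknown tasks are illustrative assumptions, not part of the study. The task-to-context mapping simply mirrors the pilot study's classification.

```python
from enum import Enum


class ServiceContext(Enum):
    HEDONIC = "hedonic"
    UTILITARIAN = "utilitarian"


class LanguageStyle(Enum):
    ABSTRACT = "abstract"
    CONCRETE = "concrete"


# Hypothetical lookup table; the four tasks follow the pilot study's classification.
TASK_CONTEXT = {
    "music_recommendation": ServiceContext.HEDONIC,
    "movie_recommendation": ServiceContext.HEDONIC,
    "online_shopping": ServiceContext.UTILITARIAN,
    "financial_investment": ServiceContext.UTILITARIAN,
}


def choose_style(task, default=LanguageStyle.CONCRETE):
    """Pick the language style that matches the task's service context.

    Unknown tasks fall back to `default`; that fallback is an assumption
    of this sketch, not something the study tested.
    """
    context = TASK_CONTEXT.get(task)
    if context is None:
        return default
    if context is ServiceContext.HEDONIC:
        return LanguageStyle.ABSTRACT
    return LanguageStyle.CONCRETE


for task in ("music_recommendation", "financial_investment", "weather_report"):
    print(task, "->", choose_style(task).value)
```

A production system would more likely infer context from the user's request than from a fixed table, but the matching rule itself stays this simple.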
Limitations
The experiments primarily used nonphysical VAs (textual scenarios and videos), which may limit ecological validity relative to real, embodied interactions; future work should test with physical VAs in situ. The studies were conducted in the Chinese context, limiting generalizability across languages and cultures. Only abstract vs. concrete language styles were examined; other styles (e.g., humor types) and voice attributes (gender, celebrity) were not tested. The congruity effect may vary by user characteristics (e.g., social class differences in efficiency vs. fun orientation). Although alternative explanations (perceived accuracy, usefulness) were tested and ruled out as mediators, other unmeasured factors could contribute.