Understanding customer experience with Vietnamese hotels by analyzing online reviews

H. T. T. Nguyen and T. X. Nguyen

In an insightful analysis of over 20,000 online reviews from TripAdvisor, researchers Ha Thi Thu Nguyen and Trung Xuan Nguyen uncover how Vietnamese hotel customers perceive service quality. The study reveals that while the 'place' aspect garners high satisfaction, the 'room' aspect falls short. This research emphasizes the importance of understanding customer sentiments for hotel managers aiming to enhance service offerings.

Introduction

Customer experience has been a central focus for over two decades across multiple perspectives. Managing it effectively requires understanding customer emotions and implementing strategies that enhance experiences. In Vietnam, customer experience is a core pillar of digital transformation. The growth of the Internet and online booking sites has made data from customer-generated reviews readily available, enabling data-driven decision-making in hospitality. However, existing studies lack clear formulas and processes for measuring customer experience, and Vietnamese hotel managers often rely on interviews or questionnaires that are hard to scale. This study asks: Can standard formulas combined with lexical rules be developed to measure customer satisfaction based on data analysis? The authors introduce definitions, formulas, and algorithms to measure overall and aspect-level satisfaction using NLP on TripAdvisor reviews of Vietnamese hotels.

Literature Review

The review covers NLP and sentiment analysis applications for understanding customer opinions, emotions, and attitudes in text, drawing on methods such as topic modeling (LDA/STM), lexicon-based sentiment scoring, and text-mining techniques. Prior hospitality studies transform unstructured review text into quantitative measures to assess satisfaction and identify salient aspects (e.g., room, staff, location, cleanliness, food). Many works extract frequent words and topics or use readability/subjectivity metrics to predict ratings. However, gaps include limited use of lexical rules for noun–polarity combinations, lack of standardized formulas, small samples in some studies, and insufficient handling of how adjectives/adverbs interact with aspect nouns. Python’s NLP ecosystem (NLTK, VADER) is highlighted for practical text processing, with VADER suitable for social-media-like text without training data.

Methodology

Data: 20,551 TripAdvisor reviews for 12 Vietnamese hotels (3–5 star) across Hanoi, Ho Chi Minh City, Danang, Hue, Quy Nhon, and Nha Trang, collected via WebHarvy and stored as CSV. The processed dataset contains 2,268,646 words and a vocabulary of 32,687; review lengths range from 1 to 10,335 (units as reported).

Tools: Python with NLP libraries (NLTK, VADER).

Fundamental definitions:
  • (1) Set of experienced guests G = {g_i} with reviews r_i.
  • (2) Satisfaction Sas(g_i), computed from VADER sentiment scores (neg, pos, neu) via a combination function λ.
  • (3) Aspects A = {a_k}, treated as nouns grouped into predefined aspect sets (e.g., Location, Room, Staff, Services, Meal, Surroundings, Value) with synonym lists (Table 2).

Process: Step 1, data collection; Step 2, storage and preprocessing (cleaning, tokenization, stop-word removal where needed); Step 3, analysis.

Overall satisfaction: Use VADER to compute sentiment for each review; classify reviews as positive/negative/neutral via the compound score; compute the overall satisfaction rate as the number of positive reviews divided by total reviews.

Aspect extraction and satisfaction: Identify occurrences of aspect nouns using the predefined aspect groups; compute Ext(A) as aspect frequencies across reviews; for each aspect a_i, compute Sas(a_i) = positive reviews mentioning a_i divided by total reviews mentioning a_i.

Polarity–aspect pairing: For deeper analysis, segment positive reviews into sentences; for each sentence containing an aspect, locate the aspect index; search up to four tokens to the left and right for the first polarity word (adjectives/adverbs) from a curated list (e.g., good, great, nice, excellent, friendly, clean, helpful, beautiful); record co-occurrence frequencies per aspect–polarity pair.

Negative/complaint analysis: Similarly evaluate dissatisfaction rates per aspect by considering negative reviews and computing the ratio of negative reviews mentioning each aspect to total reviews mentioning that aspect.

Outputs include overall satisfaction percentages, aspect-level satisfaction rates, aspect frequencies, and polarity–aspect co-occurrence tables.
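
The overall and aspect-level satisfaction rates described in the Methodology can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes the VADER compound score for each review has already been computed, and it uses the ±0.05 cut-offs conventionally applied to VADER compound scores; the summary only says that reviews are classified "via compound score", so the thresholds are an assumption. The aspect groups, review texts, and scores below are hypothetical stand-ins for the paper's Table 2 and dataset.

```python
import re
from collections import Counter

# Assumed cut-offs: the source does not state them; +/-0.05 is the
# convention commonly used with VADER compound scores.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05

# Hypothetical, abbreviated aspect groups (the paper's Table 2 holds the real lists).
ASPECT_GROUPS = {
    "room": {"room", "rooms", "bedroom"},
    "staff": {"staff", "receptionist"},
    "place": {"place", "location"},
}


def classify(compound):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def overall_satisfaction(compounds):
    """Overall rate = number of positive reviews / total reviews."""
    if not compounds:
        raise ValueError("at least one review is required")
    labels = [classify(c) for c in compounds]
    return labels.count("positive") / len(labels)


def aspect_satisfaction(reviews, compounds, aspect_groups=ASPECT_GROUPS):
    """Sas(a) = positive reviews mentioning a / all reviews mentioning a."""
    mentions, positives = Counter(), Counter()
    for text, compound in zip(reviews, compounds):
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for aspect, nouns in aspect_groups.items():
            if tokens & nouns:  # the review mentions a noun from this group
                mentions[aspect] += 1
                if classify(compound) == "positive":
                    positives[aspect] += 1
    return {aspect: positives[aspect] / mentions[aspect] for aspect in mentions}


# Made-up reviews and compound scores, for illustration only.
reviews = [
    "The room was clean and the staff were helpful.",
    "Terrible room, noisy all night.",
    "Great location near the beach.",
    "We arrived on Monday.",
]
compounds = [0.82, -0.61, 0.66, 0.0]

print(overall_satisfaction(compounds))            # 0.5
print(aspect_satisfaction(reviews, compounds))    # {'room': 0.5, 'staff': 1.0, 'place': 1.0}
```

The same ratio reproduces the headline figure reported in Table 6: 19,504 positive reviews out of 20,551 gives 19,504 / 20,551 ≈ 0.949, or 94.9%. Swapping "positive" for "negative" in `aspect_satisfaction` would give the per-aspect dissatisfaction ratio described for the complaint analysis.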

Key Findings
  • Dataset: 20,551 reviews; 2,268,646 words; vocabulary 32,687; review length min 1, max 10,335 (as reported).
  • Overall sentiment classification (Table 6): Satisfaction 19,504 reviews (94.9%), Dissatisfaction 976 (4.75%), Neutral 71 (0.35%).
  • Correlation: Positive association between review length and satisfaction (reported correlation ~0.124).
  • Aspect interest/frequency: “Room” mentioned 15,362 times; “Staff” 11,937 times in positive reviews; top aspects include room, staff, service, place, restaurant, food.
  • Aspect-level satisfaction rates (Fig. 6): Place 78.0% (highest), Bathroom 73.2%, Service 72.6%, Restaurant 68.7%, Staff 64.1%, Food 63.5%, Room 61.3% (lowest).
  • Polarity–aspect pairings (examples, Table 8):
      Room: clean (749), good (428), comfortable (371), large (323), great (300), beautiful (223).
      Staff: helpful (902), friendly (613), good (315), nice (288), attentive (210).
      Food: good (584), great (398), excellent (290), delicious (194).
      Restaurant: nice (264), great (228), good (192), excellent (183).
  • Common complaint aspects (negatives): room, staff (~30.3% noted for staff in complaints), and price among the highest dissatisfaction mentions.

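
The polarity–aspect counts reported in Table 8 come from the windowed search described in the Methodology: for each aspect noun in a sentence, look up to four tokens to either side for a polarity word. The sketch below is one possible reading of that rule, not the authors' implementation. It assumes "first" means the nearest match, checking the left neighbour before the right one at equal distance, and it uses a simple regular-expression sentence splitter as a stand-in for a proper sentence tokenizer. The aspect set and the sample reviews are hypothetical; the polarity words are the examples listed in the summary.

```python
import re
from collections import Counter

ASPECTS = {"room", "staff", "food", "restaurant"}  # illustrative subset
POLARITY_WORDS = {
    "good", "great", "nice", "excellent",
    "friendly", "clean", "helpful", "beautiful",
}
WINDOW = 4  # tokens searched on each side of the aspect noun


def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]


def tokenize(sentence):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z']+", sentence.lower())


def nearest_polarity(tokens, idx, window=WINDOW):
    """Return the closest polarity word within `window` tokens of idx, or None.

    Assumption: nearest match wins, left side checked first on ties.
    """
    for offset in range(1, window + 1):
        for pos in (idx - offset, idx + offset):
            if 0 <= pos < len(tokens) and tokens[pos] in POLARITY_WORDS:
                return tokens[pos]
    return None


def count_pairs(positive_reviews):
    """Count (aspect, polarity word) co-occurrences across all sentences."""
    pairs = Counter()
    for review in positive_reviews:
        for sentence in split_sentences(review):
            tokens = tokenize(sentence)
            for idx, token in enumerate(tokens):
                if token in ASPECTS:
                    word = nearest_polarity(tokens, idx)
                    if word is not None:
                        pairs[(token, word)] += 1
    return pairs


# Made-up positive reviews, for illustration only.
sample = [
    "The room was clean. Staff were friendly and helpful!",
    "Great food, and the restaurant had a nice view.",
]
pairs = count_pairs(sample)
for (aspect, word), n in sorted(pairs.items()):
    print(aspect, word, n)
```

Because only the nearest polarity word is recorded, "Staff were friendly and helpful" contributes one pair (staff, friendly) rather than two. Counting every polarity word inside the window would be an equally plausible reading of the rule and would produce larger totals.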
Discussion

The study demonstrates that standardized formulas combined with lexical rules and VADER-based sentiment can measure customer satisfaction at both overall and aspect levels from large-scale online reviews, addressing the research question. The very high overall satisfaction rate (94.9%) contrasts with lower aspect-specific satisfaction, revealing critical improvement areas. The highest satisfaction for “place” (78%) suggests location advantages, while the lowest for “room” (61.3%) pinpoints the most pressing service gap. Polarity–aspect pairings provide interpretable signals for operational focus (e.g., cleanliness and comfort drive positive room evaluations). The positive relationship between review length and satisfaction offers an additional behavioral indicator. For managers, these results enable prioritization of upgrades (notably room quality) and attention to price-related concerns, informing targeted quality improvements and customer experience strategies.

Conclusion

This paper presents a clear, replicable method for analyzing online hotel reviews using NLP, VADER sentiment scoring, and lexical rules to quantify overall and aspect-level customer satisfaction. Applying the method to 20,551 TripAdvisor reviews of Vietnamese hotels shows overall satisfaction of 94.9%, with aspect disparities: location (place) performs best and room performs worst. The approach yields actionable insights for hotel managers to prioritize improvements, particularly in room quality and price perceptions. Future work will deepen linguistic rule use, expand negative polarity analysis to better characterize complaints, and refine multi-level satisfaction measurement (e.g., finer-grained scales) to mitigate semantic ambiguity and improve accuracy.

Limitations
  • The analysis focuses primarily on 3–5 star hotels and 12 properties, which may limit generalizability across all hotel segments.
  • Aspect satisfaction relies on predefined aspect groups and lexical rules; unlisted synonyms or complex linguistic structures may be under-captured.
  • Negative/aspect dissatisfaction analysis is acknowledged as not yet in-depth; future work is planned to expand negative polarity and linguistic rule analysis.
  • Reliance on TripAdvisor reviews may introduce platform and self-selection biases.