BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English

Computer Science

D. Srirag, A. Joshi, et al.

BESSTIE introduces the first labelled benchmark for sentiment and sarcasm across three English varieties (en-AU, en-IN, en-UK), built from Google Places reviews and Reddit comments with manual and automatic validation. Nine large language models were fine-tuned and evaluated, revealing consistent advantages on inner-circle varieties (en-AU, en-UK) and challenges in cross-variety generalisation, especially for sarcasm. Research conducted by Dipankar Srirag, Aditya Joshi, Jordan Painter, and Diptesh Kanojia. Dataset available on Hugging Face.
Abstract
Despite large language models (LLMs) being known to exhibit bias against non-standard language varieties, there are no known labelled datasets for sentiment analysis of varieties of English. To address this gap, we introduce BESSTIE, a benchmark for sentiment and sarcasm classification for three varieties of English: Australian (en-AU), Indian (en-IN), and British (en-UK). We collect datasets for these language varieties using two methods: location-based filtering for Google Places reviews, and topic-based filtering for Reddit comments. To assess whether the dataset accurately represents these varieties, we conduct two validation steps: (a) manual annotation of language varieties and (b) automatic language variety prediction. Native speakers of the language varieties manually annotate the datasets with sentiment and sarcasm labels. We perform an additional annotation exercise to validate the reliability of the annotated labels. Subsequently, we fine-tune nine LLMs (representing a range of encoder/decoder and mono/multilingual models) on these datasets, and evaluate their performance on the two tasks. Our results show that the models consistently perform better on inner-circle varieties (i.e., en-AU and en-UK) than on en-IN, particularly for sarcasm classification. We also report challenges in cross-variety generalisation, highlighting the need for language variety-specific datasets such as ours. BESSTIE promises to be a useful evaluative benchmark for future research in equitable LLMs, specifically in terms of language varieties. The BESSTIE dataset is publicly available at: https://huggingface.co/datasets/unswn1porg/BESSTIE.
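The cross-variety evaluation described in the abstract can be pictured as a train-variety by test-variety grid of scores. The following is a minimal sketch only, not the authors' evaluation code: the choice of macro-averaged F1 as the metric, the binary label encoding (0/1), and the helper names `macro_f1` and `cross_variety_scores` are all assumptions made for illustration.

```python
def macro_f1(gold, pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 (metric choice is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no gold or predicted instances contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cross_variety_scores(predict_fns, test_sets):
    """Score every model on every test variety.

    predict_fns: {train_variety: callable(list[str]) -> list[int]}
                 (hypothetical stand-ins for fine-tuned models)
    test_sets:   {test_variety: (texts, gold_labels)}
    Returns {(train_variety, test_variety): macro-F1}.
    """
    results = {}
    for train_var, predict in predict_fns.items():
        for test_var, (texts, gold) in test_sets.items():
            results[(train_var, test_var)] = macro_f1(gold, predict(texts))
    return results
```

In such a grid, entries where the training and test varieties match give the in-variety scores, and the remaining entries show how well a model trained on one variety (say, en-AU) transfers to another (say, en-IN).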
Publisher
Findings of the Association for Computational Linguistics: ACL 2025
Published On
Jul 27, 2025
Authors
Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia
Tags
BESSTIE
language varieties
sentiment analysis
sarcasm detection
large language models
cross-variety generalisation
English (en-AU, en-IN, en-UK)