Abstract
Large language models (LLMs) have shown promise in text classification, but their effectiveness depends on the availability of large labeled datasets. This paper reviews the potential and limitations of using LLMs for text classification through synthetic data generation. It explores methodologies such as data augmentation, adversarial training, and transfer learning to address data scarcity and domain adaptation. The paper examines how effectively these approaches enhance classification performance, discusses challenges (data privacy, bias amplification, model fairness), and analyzes the impact of model size, pretraining data, and fine-tuning strategies on classification outcomes. Finally, it identifies key research gaps and proposes future directions, including developing better evaluation metrics for synthetic data and investigating long-term societal impacts.
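To make the core augmentation idea concrete, the following is a minimal illustrative sketch, not taken from the paper, of LLM-driven synthetic data generation for text classification: a handful of labeled seed examples prompt a causal language model to produce new labeled texts for each class. The model choice (gpt2), the seed reviews, and the generate_synthetic_examples helper are assumptions made for illustration.

```python
# Minimal sketch of LLM-based synthetic data augmentation for text
# classification. Model name, seed data, and prompt format are
# illustrative assumptions, not the paper's method.
from transformers import pipeline

# Hypothetical seed examples for a sentiment classification task.
seed_examples = [
    ("The battery died after two days.", "negative"),
    ("Setup took five minutes and it just works.", "positive"),
]

generator = pipeline("text-generation", model="gpt2")  # placeholder model

def generate_synthetic_examples(label: str, n: int = 3) -> list[tuple[str, str]]:
    """Prompt the LLM with in-context seed examples to produce
    new labeled reviews for the given class."""
    demos = "\n".join(f"Review: {t}\nLabel: {l}" for t, l in seed_examples)
    prompt = f"{demos}\nWrite a new product review that is {label}.\nReview:"
    outputs = generator(
        prompt,
        max_new_tokens=40,
        num_return_sequences=n,
        do_sample=True,
        temperature=0.9,
    )
    # Keep only the generated continuation and pair it with the target label.
    return [
        (o["generated_text"][len(prompt):].strip().split("\n")[0], label)
        for o in outputs
    ]

# Augment the training set with synthetic examples for each class.
augmented = (
    seed_examples
    + generate_synthetic_examples("negative")
    + generate_synthetic_examples("positive")
)
```

In practice, synthetic examples are usually filtered for quality and label consistency before being added to the training set, which connects directly to the evaluation-metric gap the abstract highlights.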
Publisher
International Research Journal of Engineering & Applied Sciences (IRJEAS)
Published On
Apr 01, 2024
Authors
Ashok Kumar Pamidi Venkata, Leeladhar Gudala
Tags
large language models
text classification
synthetic data generation
data scarcity
adversarial training
transfer learning
model fairness