Evaluating large language models in analysing classroom dialogue

Education

Y. Long, H. Luo, et al.

This study by Yun Long, Haifeng Luo, and Yu Zhang examines how well GPT-4 can analyse classroom dialogue, reporting substantial time savings and high coding consistency with expert human coders, and pointing to how Large Language Models could support teaching evaluation.
Abstract
This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue—a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using datasets from middle school mathematics and Chinese classes, classroom dialogues were manually coded by experts and then analysed with a customised GPT-4 model. The study compares manual annotations with GPT-4 outputs to evaluate efficacy. Metrics include time efficiency, inter-coder agreement, and reliability between human coders and GPT-4. Results show significant time savings and high coding consistency between the model and human coders, with minor discrepancies. These findings highlight the strong potential of LLMs in teaching evaluation and facilitation.
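Inter-coder agreement of the kind described here is commonly quantified with chance-corrected statistics such as Cohen's kappa. The abstract does not name the specific statistic used, so the sketch below is only illustrative: the category labels and coded turns are hypothetical placeholders, and scikit-learn's cohen_kappa_score stands in for whatever agreement measure the authors applied when comparing human annotations with GPT-4 outputs.

```python
# Minimal sketch (not the authors' code): chance-corrected agreement between
# human and GPT-4 codes assigned to the same classroom-dialogue turns.
# All labels below are hypothetical placeholders.
from sklearn.metrics import cohen_kappa_score

human_codes = ["question", "explanation", "feedback", "question", "other"]
gpt4_codes  = ["question", "explanation", "feedback", "explanation", "other"]

# Cohen's kappa: 1.0 = perfect agreement, 0.0 = agreement expected by chance
kappa = cohen_kappa_score(human_codes, gpt4_codes)
print(f"Cohen's kappa (human vs. GPT-4): {kappa:.2f}")
```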
Publisher
npj Science of Learning
Published On
Oct 03, 2024
Authors
Yun Long, Haifeng Luo, Yu Zhang
Tags
Large Language Models
GPT-4
classroom dialogue
qualitative methods
coding efficiency
education evaluation
inter-coder agreement