Evaluating large language models in analysing classroom dialogue

Education

Y. Long, H. Luo, et al.

This study by Yun Long, Haifeng Luo, and Yu Zhang examines how well GPT-4 can analyse classroom dialogue, reporting substantial time savings and high coding consistency with expert human coders, and pointing to how Large Language Models could support teaching evaluation.
Abstract
This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue—a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using datasets from middle school mathematics and Chinese classes, classroom dialogues were manually coded by experts and then analysed with a customised GPT-4 model. The study compares manual annotations with GPT-4 outputs to evaluate efficacy. Metrics include time efficiency, inter-coder agreement, and reliability between human coders and GPT-4. Results show significant time savings and high coding consistency between the model and human coders, with minor discrepancies. These findings highlight the strong potential of LLMs in teaching evaluation and facilitation.
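Inter-coder agreement of the kind described here is commonly quantified with chance-corrected statistics such as Cohen's kappa. The abstract does not name the specific statistic used, so the sketch below is only illustrative: the category labels and coded turns are hypothetical placeholders, and scikit-learn's cohen_kappa_score stands in for whatever agreement measure the authors applied when comparing human annotations with GPT-4 outputs.

```python
# Minimal sketch (not the authors' code): chance-corrected agreement between
# human and GPT-4 codes assigned to the same classroom-dialogue turns.
# All labels below are hypothetical placeholders.
from sklearn.metrics import cohen_kappa_score

human_codes = ["question", "explanation", "feedback", "question", "other"]
gpt4_codes  = ["question", "explanation", "feedback", "explanation", "other"]

# Cohen's kappa: 1.0 = perfect agreement, 0.0 = agreement expected by chance
kappa = cohen_kappa_score(human_codes, gpt4_codes)
print(f"Cohen's kappa (human vs. GPT-4): {kappa:.2f}")
```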
Publisher
npj Science of Learning
Published On
Oct 03, 2024
Authors
Yun Long, Haifeng Luo, Yu Zhang
Tags
Large Language Models
GPT-4
classroom dialogue
qualitative methods
coding efficiency
education evaluation
inter-coder agreement