Education
Research on the development of principles for designing elementary English speaking lessons using artificial intelligence chatbots
J. Han and D. Lee
Amid rapid advances in AI, big data, and IoT, schools are increasingly integrating educational technologies. In South Korea’s EFL context, elementary students have limited exposure to English (2–3 hours per week; average class size ~23), making it difficult to provide sufficient speaking practice and individualized feedback. Wide proficiency gaps and uniform assignments further hinder engagement: proficient students may be under-challenged, while struggling students may avoid participation. AI chatbots, enabled by machine learning and deep learning, are viewed as promising tools to expand speaking opportunities, deliver timely feedback, and personalize learning. However, research on chatbot-assisted lesson design, teacher roles, and especially applications for elementary learners remains scarce. This study aims to develop and validate principles for designing elementary English speaking lessons using AI chatbots to guide teachers in effectively incorporating chatbots so students can meet cognitive and affective goals. Research questions: (1) What are the principles for designing elementary English speaking lessons using AI chatbots? (2) Are the principles valid?
Theoretical background emphasized the evolution of South Korea's elementary English curriculum toward communicative competence and the relevance of leveraging ICT in line with the 2022 curriculum revision. Within communicative language teaching, task-based learning (TBL) is highlighted as a suitable approach for designing chatbot-supported speaking tasks that promote interaction and natural language use. Studies since 2019 report positive cognitive and affective effects of AI chatbots in EFL: increased exposure and opportunities for language use, improved fluency through immediate feedback, and authentic language that strengthens conversational skills. Proficiency differences affect interactions: advanced learners engage more and report higher satisfaction, while beginners may discontinue without support. Teacher scaffolding is beneficial for novices but can impede advanced learners, underscoring the need for level-sensitive design. Chatbots can reduce speaking anxiety and raise engagement, but motivation may decline over time, requiring task-driven designs. Design considerations distilled from prior work include:
- Media selection suited to learner levels and enabling meaningful interaction (e.g., selecting appropriate chatbot builders).
- Content/task design aligned to proficiency, coherence across units, and learners' contexts, with systematic, accessible tasks.
- Optimized learning environments and technical infrastructure (noise reduction, device/app readiness, contingency planning).
- Explicit usage guidance and task instructions.
- Pre-learning of vocabulary and sentences to reduce cognitive load.
- Strategies to spark and sustain interest (quizzes, graphics, animations).
- Tailored scaffolding and cues to support meaning negotiation.
- Immediate individualized feedback (by chatbot and/or teacher).
- Learning management enabling reflection, review, and progress tracking.
Design and development research (the model-development type) was employed to create and validate instructional design principles for elementary English speaking lessons using AI chatbots. The procedure comprised three stages:
- Model development: a comprehensive literature review of AI chatbots, English speaking instruction, and design principles/models to derive initial principles and components.
- Expert validation: two rounds of review by five experts (specialists in educational technology, AI-based English education, and elementary English education; profiles included professors and experienced teachers holding a Ph.D. or Ed.D.). A 4-point Likert questionnaire (4 = strongly agree to 1 = strongly disagree) with open-ended comments assessed validity, clarity/explanatory power, usefulness, universality, and comprehensibility; the Content Validity Index (CVI) and Inter-Rater Agreement (IRA) were computed.
- Usability evaluation: three elementary school teachers (7–20 years of experience) with interest in and experience with AI chatbots held one-on-one discussions with the researchers, designed lessons based on the principles, and completed a 4-point usability questionnaire on helpfulness for lesson planning and clarity; feedback on strengths, weaknesses, and areas for improvement was collected.
CVI and IRA analyses informed refinements, and the outputs evolved through initial, second, and third iterations, culminating in the final principles and guidelines.
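The CVI and IRA figures reported below can be reproduced with simple arithmetic. The following is a minimal Python sketch, assuming the common operationalization in which an item's CVI is the proportion of experts rating it 3 or 4 on the 4-point scale and IRA is the share of items on which all experts give the same rating; the ratings matrix, function names, and threshold are illustrative assumptions, not the study's data or code.

```python
from typing import List

def item_cvi(ratings: List[int], threshold: int = 3) -> float:
    """Proportion of experts rating the item at or above the threshold (3 = agree)."""
    return sum(r >= threshold for r in ratings) / len(ratings)

def inter_rater_agreement(matrix: List[List[int]]) -> float:
    """Proportion of items on which every expert gave an identical rating."""
    agreed = sum(1 for item in matrix if len(set(item)) == 1)
    return agreed / len(matrix)

# Hypothetical Round 1 ratings: rows = questionnaire items, columns = five experts (1-4 scale).
round1 = [
    [4, 3, 4, 3, 4],
    [3, 3, 2, 4, 3],
    [4, 4, 4, 4, 4],
]

item_cvis = [item_cvi(item) for item in round1]
print("Item CVIs:", item_cvis)                        # [1.0, 0.8, 1.0]
print("Scale CVI:", sum(item_cvis) / len(item_cvis))  # ≈ 0.93 (average item CVI)
print("IRA:", inter_rater_agreement(round1))          # ≈ 0.33 (1 of 3 items fully agreed)
```

Other IRA variants exist (e.g., counting items that every expert rated 3 or 4); the study does not specify which was used, so the sketch shows only the simplest form.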
- Initial model components (from literature): AI chatbot learning tool, AI chatbot utilization curriculum, AI chatbot learning support, AI chatbot utilization activities, and AI chatbot learning outcomes and evaluation.
- Expert validation of components: Round 1 means ranged from 3.00 to 3.60, and an IRA of 0.11 indicated the need for revisions. In Round 2, the revised components achieved CVI = 1.00 and IRA = 1.00, indicating full agreement and validity.
- Expert validation of overall design principles: Round 1 category means were ≥ 3.60, with CVI ≥ 0.80 and IRA = 0.80. Experts requested clearer explanations and examples and better differentiation among principles. In Round 2, after revisions, all categories scored 4.00 with CVI = 1.00 and IRA = 1.00.
- Restructuring yielded four final components: Creating AI Chatbot Learning Environment; AI Chatbot Utilization Curriculum; AI Chatbot Teaching and Learning Activities; Evaluation of AI Chatbot Learning.
- Final outputs: 10 principles and 24 detailed guidelines. The principles are: (1) Media selection; (2) Creating a learning environment; (3) Content restructuring; (4) Stimulating and sustaining interest and motivation; (5) Providing guidance; (6) Scaffolded learning support; (7) Individualized feedback provision; (8) Fostering a learning environment that supports growth and development; (9) Communication and collaboration; (10) Learning management. The detailed guidelines include actionable measures (e.g., selecting user-friendly media such as Dialogflow; ensuring Wi-Fi and headsets; noise mitigation; pre-instruction on chatbot use; task design by proficiency; cues for meaning negotiation; immediate feedback; reflection artifacts; LMS-based tracking); a minimal example of querying a Dialogflow agent is sketched after this results list.
- Usability evaluation (n = 3 teachers): mean = 4.00 for both the helpfulness and clarity items; CVI = 1.00; IRA = 1.00. Teachers found the principles useful and the examples clear, while requesting even more concrete, teacher-friendly exemplars and a tabular presentation.
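To make the media-selection guideline above more concrete, here is a minimal sketch of sending one learner utterance to a teacher-built Dialogflow ES agent using the standard google-cloud-dialogflow Python client. The project ID, session ID, and example utterance are placeholders, and the study does not describe its chatbot configuration at this level of detail, so this is an illustrative assumption rather than the authors' implementation.

```python
# pip install google-cloud-dialogflow  (requires Google Cloud credentials for the agent's project)
from google.cloud import dialogflow

def ask_chatbot(project_id: str, session_id: str, text: str, language_code: str = "en") -> str:
    """Send one learner utterance to a Dialogflow ES agent and return its text reply."""
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)

    text_input = dialogflow.TextInput(text=text, language_code=language_code)
    query_input = dialogflow.QueryInput(text=text_input)

    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result.fulfillment_text

# Hypothetical usage: one session per pupil keeps each learner's dialogue context separate.
# print(ask_chatbot("my-efl-agent", "student-17", "What is your favorite food?"))
```

One session per pupil is one simple way to keep interactions individual, which aligns with the guidelines on individualized feedback and progress tracking.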
The study provides a systematic, validated set of instructional design principles and guidelines enabling teachers to design elementary English speaking lessons with AI chatbots, moving beyond prior work focused solely on measuring effects or offering generic models. By grounding the model in theory (CLT/TBL) and extensive literature on chatbot affordances and challenges, and by validating through expert panels and field usability, the principles address practical classroom constraints (limited hours, large classes, proficiency gaps) and support personalization, scaffolding, and ongoing motivation. The approach can generalize beyond English to other languages and to secondary and higher education with appropriate adaptation to learner proficiency and curricular standards. In the South Korean context with increasing Edu-tech adoption post-COVID-19, these principles can reduce teacher trial-and-error and guide systematic implementation. The model also suggests potential to mitigate inequities by extending speaking practice beyond classroom time, provided access to devices and connectivity is ensured.
The research developed and validated a structured set of 10 instructional design principles and 24 guidelines for elementary English speaking lessons using AI chatbots. Instruction centers on AI chatbot teaching and learning activities, concludes with reflection and evaluation, and requires alignment with available tools (e.g., Dialogflow) and infrastructure. The principles enable personalized instruction (e.g., repetitive versus Q&A chatbot modes) and may help address the constraints of limited instructional time and large classes; chatbot-mediated practice could complement or partially substitute for native-speaker support. Broader impacts include a potential reduction of proficiency gaps through extended practice opportunities. Future work should adapt and evaluate the model at middle school, high school, and university levels; address the time, effort, and infrastructure demands placed on teachers and schools; and expand the design models to cover listening, reading, and writing so that all four skills are supported.
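To illustrate the "repetitive versus Q&A chatbot modes" mentioned above, the following is a minimal, hypothetical sketch of routing learners to a drill-style or open question-and-answer flow by proficiency band. The bands, function names, and prompts are illustrative assumptions and do not come from the validated principles themselves.

```python
from dataclasses import dataclass

@dataclass
class Learner:
    name: str
    proficiency: str  # hypothetical bands: "beginner", "intermediate", "advanced"

def repetitive_mode(target_sentence: str) -> str:
    """Drill mode: the chatbot models a target sentence and asks the learner to repeat it."""
    return f"Listen and repeat: '{target_sentence}'"

def qa_mode(topic: str) -> str:
    """Q&A mode: the chatbot asks an open question to elicit freer speaking."""
    return f"Let's talk about {topic}. What do you think about it?"

def choose_turn(learner: Learner, target_sentence: str, topic: str) -> str:
    """Route the learner to the mode suited to their proficiency band."""
    if learner.proficiency == "beginner":
        return repetitive_mode(target_sentence)
    return qa_mode(topic)

print(choose_turn(Learner("Mina", "beginner"), "I like playing soccer.", "hobbies"))
print(choose_turn(Learner("Jun", "advanced"), "I like playing soccer.", "hobbies"))
```

In a real lesson this routing would live inside the chatbot builder (for example, as separate Dialogflow intents or flows per level) rather than in standalone code; the sketch only shows the decision logic.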
- Scope limited to elementary school context in South Korea; generalizability to other levels and contexts requires further study and adaptation.
- Implementation requires substantial teacher time/effort and skills in tools like Dialogflow, as well as development of chatbot content and materials.
- Significant infrastructure prerequisites (devices for each student, reliable Wi-Fi, app configuration) may burden teachers and schools; institutional support is necessary.
- The current principles focus on speaking; additional models are needed for listening, reading, and writing.