The dilemma and countermeasures of educational data ethics in the age of intelligence

X. Guan, X. Feng, et al.

This summary covers research on educational data ethics in intelligent education by Xiu Guan, Xiang Feng, and A.Y.M. Atiquil Islam. The study identifies critical problems such as privacy violations and argues for learner-centered solutions that draw on technologies such as blockchain and 5G.

Introduction

The paper addresses the rapid integration of AI, learning analytics, data mining, and cloud computing into education and the accompanying challenges of educational data ethics: how data are collected, managed, shared, and used safely and fairly. International bodies (UN, UNESCO, EDUCAUSE) and national frameworks (e.g., China’s Data Security Law and Personal Information Protection Law) underscore the urgency of robust data ethics. Yet practical guidance tailored to educational contexts, especially for vulnerable learners, remains inadequate and culturally contingent. The study aims to clarify international research hotspots, trends, and dilemmas in educational data ethics and to derive learner-centered strategies suitable for China. It poses two research questions: (a) What are the dilemmas and solutions of educational data ethics in the international context? (b) What do these tell us about the development of educational data ethics in China?

Literature Review

Existing definitions frame data ethics as principles for safe, fair data practices that avoid harm and maximize public interest (CDT, GSA, ODI). International initiatives (UN proposals; UNESCO AI ethics recommendations; EDUCAUSE security report) and national efforts (China’s ethical governance guidance, Data Security Law, Personal Information Protection Law, draft Internet Data Security regulations) recognize risks such as surveillance, privacy violations, and inequities. Prior reviews largely focus on specific technologies or settings (e.g., learning analytics ethics covering privacy, consent, and bias; challenges of AI in education; big data ethics in social platforms) and often yield fragmented, one-sided solutions. What is missing is a systematic, comprehensive synthesis that can inform contextualized, learner-centered responses in China, taking its cultural, political, and institutional specificities into account.

Methodology

Design: Bibliometric analysis complemented by in-depth reading of key literature to identify dilemmas and derive solution strategies.

Data source and search: Web of Science Core Collection. Query combined themes for data/analytics/AI, ethics/privacy/fairness/governance, and education contexts, limited to 2019–2023 and literature types (theses, meetings, published online, review papers, books), following Hakimi et al. (2021) search formulation.

Screening: 88,764 records retrieved (2019–2023); 3,096 duplicates removed (85,668 remaining). ASReview (active learning) used to prioritize relevance; screening stopped when 20 consecutive records were labeled not relevant. Final included: 385 papers.

Tools and analyses: CiteSpace (6.1.R4) to visualize and quantify keyword co-occurrence networks (hotspots), timelines and bursts (evolution/trends), and collaboration networks (authors and institutions). Clustering assessed by size, silhouette, and average year; network metrics included frequency, degree, and betweenness centrality. Subsequent deep reading of salient works informed synthesis of dilemmas and strategies, with emphasis on applicability in China.
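
The screening stopping rule can be illustrated with a short sketch. It is not ASReview's API, and it leaves out the active-learning step in which the tool re-ranks unlabeled records after each decision. It only shows, under the assumption that reviewer decisions arrive as an ordered sequence of booleans, how a "20 consecutive not-relevant records" rule ends a screening pass. The function name `screen_until_streak` and its parameters are hypothetical.

```python
from typing import Iterable, List, Tuple


def screen_until_streak(
    labels: Iterable[bool], stop_after: int = 20
) -> Tuple[List[int], int]:
    """Walk through reviewer decisions in presentation order.

    labels: True means the record was labeled relevant.
    Returns (positions of relevant records, number of records screened).
    Screening stops once `stop_after` consecutive records are not relevant.
    """
    relevant_positions: List[int] = []
    streak = 0      # consecutive records labeled not relevant
    screened = 0
    for position, is_relevant in enumerate(labels):
        screened += 1
        if is_relevant:
            relevant_positions.append(position)
            streak = 0  # a relevant record resets the run
        else:
            streak += 1
            if streak >= stop_after:
                break
    return relevant_positions, screened


# Record counts reported in the summary: retrieved minus duplicates.
retrieved, duplicates = 88_764, 3_096
after_dedup = retrieved - duplicates
```

With this rule, most of the 85,668 deduplicated records are never read by a human, so the final set of included papers depends on how well the ranking surfaces relevant records early.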

Key Findings

Hotspots (keyword co-occurrence clusters):

  • #0 Learning analytics (size 31; avg year 2020) – dominant hotspot linking many clusters; frequent keyword: learning analytics (Freq 76; Degree 44; Centrality 0.24).
  • #1 Data science (size 28; avg year 2020) – foundational methods and ethics literacy.
  • #2 Systematic review (size 28; avg year 2020) – indicates maturation of the field.
  • #3 Artificial intelligence (size 28; avg year 2020) – ethics in AI-enabled education (bias, fairness, transparency).
  • #4 Information (size 21; avg year 2020).
  • #5 Big data (size 19; avg year 2019) – big data ethics relevant to education; keyword Big data (Freq 41; Degree 56; Centrality 0.41).
  • #6 Artificial intelligence literacy (size 13; avg year 2020) – building literacy to mitigate ethical risks.
  • #7 Gender bias (size 13; silhouette 0.995; avg year 2022) – emerging focus on algorithmic gender bias.
  • #8 Attitude (size 10; avg year 2019).
  • #9 Educational data analytics (size 3; avg year 2021).

Trend evolution (keyword bursts):

  • 2019–2020: Focus on information extraction and infrastructures: “information,” “big data analytics,” “architecture,” “academic library.”
  • 2020–2021: Emphasis on principles and settings: “privacy principle,” “university.”
  • 2021–2023: Practice-oriented themes: “systematic review,” “user acceptance,” “educational data mining,” “teacher,” “decision making,” signaling attention to classroom use, stakeholder acceptance, and decision-support.

Influential scholars (past 5 years): Gasevic, Dragan (15); Tsai, Yi-Shan (11); Drachsler, Hendrik (7); Jones, Kyle ML (6); Prinsloo, Paul (5); others focusing on higher education, learning analytics ethics, and stakeholder perspectives.

Influential institutions: Monash University (18) leads, followed by University of Edinburgh (9), The Open University (8), University of South Australia (6), King Abdulaziz University (6), MIT (5), University of Eastern Finland (5), Beijing Normal University (5), Indiana University (5), University of Michigan (5), Carnegie Mellon University (5).

Dilemmas identified:

  • Privacy violations during data collection, storage, and sharing (informed consent issues; insufficient anonymization; re-identification risks across datasets).
  • Predictive analytics can constrain autonomy and agency of learners/teachers, induce filter bubbles, increase workload/skill demands on teachers, and introduce bias in decisions.
  • Data-centric evaluation without a “forgetting” mechanism risks identity fixation, discrimination, and chilling effects, and undermines developmental views of learners.

Proposed strategies (learner-centered, China-focused):

  • Macro-level: Establish systematic educational data standards; build national/regional platforms (digital bases, data centers, education cloud) with strong governance, supervision, and 5G+ infrastructure.
  • Research–practice dual channel: Ethics training for data managers and vendors; privacy-by-design; transparent consent; empirical research on standards–utilization links; deploy privacy-preserving technologies (blockchain, federated learning, secure multi-party computation (SMC), zero-knowledge proofs (ZKP), ring signatures, homomorphic encryption, trusted execution environments (TEE)) to address data silos and protect privacy.
  • Moderate application and forgetting: Ethics education for teachers/learners; balanced, non-reductive use of data in evaluation; develop algorithmic forgetting mechanisms; cultivate community norms to mitigate long-term harms.
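
To make one of the privacy-preserving technologies listed above more concrete, here is a toy sketch of additive secret sharing, a basic building block of secure multi-party computation. It is an illustration under stated assumptions, not a method described in the paper: parties are honest-but-curious, values are non-negative integers whose total stays below the modulus, and everything runs in one process with no network layer. The names `make_shares`, `secure_sum`, and `PRIME` are hypothetical.

```python
import secrets
from typing import List

PRIME = 2**61 - 1  # modulus for share arithmetic (an illustrative choice)


def make_shares(value: int, n_parties: int) -> List[int]:
    """Split `value` into n_parties random shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values: List[int]) -> int:
    """Compute the total of all parties' values without pooling raw values.

    Each party splits its value into shares and hands one share to every
    party. Each party then publishes only the sum of the shares it holds.
    """
    n = len(private_values)
    # all_shares[i][j] is the share that party i sends to party j
    all_shares = [make_shares(v, n) for v in private_values]
    # party j sees one random-looking share from every other party
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return sum(partial_sums) % PRIME
```

For example, three schools could each hold a count of at-risk learners and learn only the combined count, which is the kind of cross-silo aggregate the strategy above is aiming for. A real deployment would need authenticated channels and protection against colluding or malicious parties.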

Discussion

The bibliometric results show educational data ethics research clustered around learning analytics, AI, and big data, with a shift from infrastructure and principles to user-centric, practice-oriented concerns (acceptance, teachers’ roles, decision-making). These findings answer RQ1 by mapping dilemmas (privacy, autonomy, forgetting) and existing solution avenues (principles, governance, literacy, privacy-preserving tech). For RQ2, the synthesis informs a China-specific pathway: government-led standardization and platform-building to integrate disparate systems; a coordinated research–practice ecosystem to align policy, technology, and pedagogy; and learner-centered ethics education plus technical mechanisms (blockchain, federated learning) to ensure privacy, fairness, and explainability. Emphasizing moderate forgetting counters identity fixation and chilling effects, aligning data use with developmental, humane education. The approach advances actionable guidance for government agencies, institutions, teachers, and EdTech providers in China while remaining relevant internationally.
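
The summary does not specify how a forgetting mechanism would work. One possible reading, offered here only as a hypothetical sketch, is to let older assessment records count for less over time and to drop them entirely after a retention window. The function `decayed_average`, the exponential half-life weighting, and the default values of 180 and 730 days are all assumptions made for illustration.

```python
from typing import List, Tuple


def decayed_average(
    records: List[Tuple[float, float]],
    half_life_days: float = 180.0,
    cutoff_days: float = 730.0,
) -> float:
    """Average learner scores so that old records fade and then vanish.

    records: (score, age_in_days) pairs.
    A record's weight halves every `half_life_days`; records older than
    `cutoff_days` are ignored altogether, i.e. "forgotten".
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for score, age_days in records:
        if age_days > cutoff_days:
            continue  # outside the retention window
        weight = 0.5 ** (age_days / half_life_days)
        weighted_sum += weight * score
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no records within the retention window")
    return weighted_sum / weight_total
```

Under these assumptions, a weak score from 180 days ago carries half the weight of a score recorded today, so an early setback does not fix a learner's profile indefinitely.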

Conclusion

Educational data ethics is a critical constraint on intelligent education. Through bibliometric mapping (2019–2023) and deep reading, the study identifies three core dilemmas: privacy breaches across the data lifecycle, loss of autonomy due to predictive analytics, and data-centric evaluation without forgetting. It proposes three learner-centered strategies: (1) macro-level standard systems and platforms (data standards, data centers, digital bases, education cloud, 5G+); (2) a research–practice dual path to co-construct an educational data ecology with privacy-by-design and privacy-preserving technologies; (3) ethics education and moderate application of data in evaluation, supported by forgetting mechanisms. These contributions provide a foundation for policy and practice, especially in China. Future work should translate these recommendations into empirical implementations and evaluations to overcome ethical barriers in real educational settings.

Limitations

The study relies on bibliometric analysis and literature review without empirical intervention or case implementation. Consequently, conclusions and proposed strategies require validation through practical deployments and evaluative studies in diverse educational contexts.
