Deep learning-based multi-criteria recommender system for technology-enhanced learning

Computer Science


L. Salau, H. Mohamed, et al.

A hybrid DeepFM-SVD++ model fuses factorization machines and deep neural networks to tackle sparsity, over-specialization, and cold-start in multi-criteria recommender systems, improving personalization in Technology-Enhanced Learning and beyond. This research was conducted by Latifat Salau, Hamada Mohamed, Yunusa Simpa Abdulsalam, and Hassan Mohammed.

Introduction
The paper addresses the need for improved personalization in recommendation systems by modeling multi-dimensional user preferences, particularly in Technology-Enhanced Learning (TEL). Traditional recommender systems (RSs), initially focused on single-criterion ratings, evolved through collaborative filtering and matrix factorization but still face data sparsity, cold-start, and over-specialization issues. Multi-Criteria Recommender Systems (MCRSs) offer richer personalization by considering multiple aspects per item; however, their use in TEL is limited and often relies on matrix factorization alone, inadequately capturing complex non-linear interactions among criteria. The study proposes a hybrid DeepFM-SVD++ approach to jointly learn low-order (factorization) and high-order (deep neural) interactions across multi-criteria data, aiming to improve prediction accuracy, ranking performance, and robustness in TEL settings and beyond.
Literature Review
The related work is organized into two areas: (1) Deep learning in MCRSs: Prior studies demonstrate that deep architectures can learn non-linear aggregation functions and improve accuracy in multi-criteria contexts (e.g., deep neural aggregation for context-aware MCRS; multi-criterion ranking with neural matrix factorization; deep autoencoder-based multi-criteria recommendations; fusion of deep neural networks with matrix factorization). These works show consistent gains over baselines but have limited focus on TEL. (2) MCRSs in education: Existing TEL approaches often rely on single-criterion RSs or traditional collaborative filtering for multi-criteria datasets, addressing educational contexts like learning path recommendations, context-aware learning, and clustering-based personalization. While promising, these approaches struggle to capture complex feature interactions and face scalability and sparsity challenges. The gap identified is the lack of deep learning-based MCRSs tailored to TEL that jointly model multi-criteria dependencies.
Methodology
The study introduces a hybrid DeepFM-SVD++ framework to learn an adaptive aggregation function for multi-criteria ratings and to improve overall recommendation accuracy.

Preliminaries: SVD models users and items in latent spaces, predicting ratings via user/item biases and latent factor dot products; SVD++ extends SVD by incorporating implicit feedback through additional item factors, improving accuracy in sparse settings. DeepFM combines Factorization Machines (FM) for explicit low-order pairwise feature interactions with a Deep Neural Network (DNN/MLP) for high-order non-linear dependencies; both components share embeddings, and their outputs are summed and passed through a sigmoid for the final prediction.

Proposed framework and phases: (1) Decompose the n-criteria rating problem into n single-criterion prediction tasks and estimate missing criterion ratings using SVD or SVD++. (2) Train DeepFM as an aggregation function on known overall ratings, treating the overall rating as another criterion. (3) Compute overall ratings for unrated items using the predicted individual criteria and the learned aggregation function. (4) Generate Top-N recommendations based on the highest predicted overall ratings.

Datasets: ITM-Rec (TEL domain) with 5,230 ratings from 476 users over 70 items; criteria include App (c₁), Data (c₂), Ease (c₃), and Overall (c₄). Data were collected via questionnaires (2017–2022) across courses (DA, DS, DB). To ensure unique interactions, a UID was constructed as UserID-ItemID-Class; data cleaning addressed inconsistencies. Average ratings: App 3.421, Data 3.390, Ease 3.177, Overall 3.374; Pearson correlations with Overall: App 0.799, Data 0.725, Ease 0.582. Yahoo Movies (entertainment domain) with 62,156 ratings from 6,078 users over 976 movies (sparsity 99.0%); criteria: Direction (cr1), Action (cr2), Story (cr3), Visuals (cr4), Overall (cr0). Letter grades (A+–F) were mapped to a numeric scale (1–13).

Implementation details: DeepFM was built in Python using TensorFlow/Keras/Scikit-learn with embedding dimension 10, L2 regularization 0.0001, a Dense layer of 128 units, the Adam optimizer with learning rate 0.01, batch size 32, up to 45 epochs, and ReduceLROnPlateau and EarlyStopping callbacks; the sigmoid output is scaled to the dataset rating range via a lambda layer, and user and item biases are modeled as separate embedding layers. SVD and SVD++ were implemented with the Surprise library: Reader rating scale 1–5; hyperparameter tuning via GridSearchCV with n_epochs ∈ {20, 30, 40}, lr_all ∈ {0.005, 0.01, 0.05}, reg_all ∈ {0.06, 0.1, 0.2, 0.5}; the best estimators were trained and evaluated on test folds.

Evaluation protocol and metrics: K-fold cross-validation (K = 5 and 10) with Top-N = 10 and 20; metrics include MAE and RMSE for rating accuracy; Precision, Recall, F1-score, and MAP for usage prediction; and AUC and FCP for ranking accuracy.
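The per-criterion prediction step (phase 1) and the reported Surprise grid search can be sketched roughly as follows. This is a minimal illustration, not the authors' released code: the CSV file name, DataFrame column names, and the example UID are assumptions, while the rating scale and hyperparameter grid follow the values reported above.

```python
# Minimal sketch of phase 1: fit one SVD++ model per rating criterion with the
# Surprise library, using the hyperparameter grid reported above. The file name,
# column names, and the example UID below are illustrative assumptions.
import pandas as pd
from surprise import Dataset, Reader, SVDpp
from surprise.model_selection import GridSearchCV

ratings = pd.read_csv("itm_rec.csv")      # hypothetical file: one row per UID/item with criterion columns
reader = Reader(rating_scale=(1, 5))      # rating scale used for ITM-Rec

param_grid = {
    "n_epochs": [20, 30, 40],
    "lr_all": [0.005, 0.01, 0.05],
    "reg_all": [0.06, 0.1, 0.2, 0.5],
}

criterion_models = {}
for criterion in ["App", "Data", "Ease"]:                 # one single-criterion task per rating dimension
    data = Dataset.load_from_df(ratings[["uid", "item", criterion]], reader)
    gs = GridSearchCV(SVDpp, param_grid, measures=["rmse", "mae"], cv=5)
    gs.fit(data)
    best = gs.best_estimator["rmse"]                      # SVD++ instance with the best grid settings
    best.fit(data.build_full_trainset())                  # refit on all data before predicting
    criterion_models[criterion] = best

# Predict a missing criterion rating for a (user, item) pair; the UID follows UserID-ItemID-Class.
print(criterion_models["Ease"].predict("u12-i05-DA", "i05").est)
```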
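A minimal Keras sketch of the DeepFM-style aggregation model (phase 2) under the reported settings (embedding dimension 10, L2 0.0001, a 128-unit Dense layer, Adam with learning rate 0.01, sigmoid output rescaled by a Lambda layer, separate user/item bias embeddings, ReduceLROnPlateau and EarlyStopping) could look like the following. The exact layer wiring, loss, and input encoding are assumptions rather than the paper's published architecture.

```python
# Minimal sketch of the DeepFM-style aggregation model (phase 2) with the reported
# hyperparameters. The exact wiring (FM term as a user-item dot product, one deep
# branch over shared embeddings plus the criterion ratings) is an assumption.
import tensorflow as tf
from tensorflow.keras import Model, layers, regularizers

def build_aggregation_model(n_users, n_items, n_criteria=3, emb_dim=10, l2=1e-4, r_min=1.0, r_max=5.0):
    user_in = layers.Input(shape=(1,), name="user_id")
    item_in = layers.Input(shape=(1,), name="item_id")
    crit_in = layers.Input(shape=(n_criteria,), name="criterion_ratings")   # e.g. App, Data, Ease

    reg = regularizers.l2(l2)
    user_emb = layers.Flatten()(layers.Embedding(n_users, emb_dim, embeddings_regularizer=reg)(user_in))
    item_emb = layers.Flatten()(layers.Embedding(n_items, emb_dim, embeddings_regularizer=reg)(item_in))
    user_bias = layers.Flatten()(layers.Embedding(n_users, 1)(user_in))     # separate user bias embedding
    item_bias = layers.Flatten()(layers.Embedding(n_items, 1)(item_in))     # separate item bias embedding

    # Low-order (FM-like) interaction over the shared embeddings.
    fm_term = layers.Dot(axes=1)([user_emb, item_emb])

    # High-order (deep) interactions over the shared embeddings and the criterion ratings.
    deep = layers.Concatenate()([user_emb, item_emb, crit_in])
    deep = layers.Dense(128, activation="relu", kernel_regularizer=reg)(deep)
    deep_term = layers.Dense(1)(deep)

    # Sum both components plus biases, squash with a sigmoid, and rescale to the rating range.
    logit = layers.Add()([fm_term, deep_term, user_bias, item_bias])
    prob = layers.Activation("sigmoid")(logit)
    rating = layers.Lambda(lambda p: r_min + (r_max - r_min) * p, name="overall_rating")(prob)

    model = Model([user_in, item_in, crit_in], rating)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss="mse", metrics=["mae"])
    return model

callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
]
# model = build_aggregation_model(n_users=476, n_items=70)
# model.fit([user_ids, item_ids, criterion_ratings], overall_ratings,
#           epochs=45, batch_size=32, validation_split=0.1, callbacks=callbacks)
```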
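Phases 3 and 4 (aggregating predicted criterion ratings into overall ratings and ranking the Top-N) could then be wired together as in the sketch below, reusing the hypothetical criterion_models and aggregation model from the previous sketches; user_index and item_index are assumed string-ID-to-integer lookups.

```python
# Sketch of phases 3-4: score a user's unrated items with the learned aggregation
# function and return the Top-N. criterion_models and the Keras model come from the
# sketches above; user_index and item_index are assumed string-ID -> integer lookups.
import numpy as np

def top_n_for_user(uid, candidate_items, criterion_models, model,
                   user_index, item_index, n=10):
    criteria = ["App", "Data", "Ease"]
    crit_rows, user_rows, item_rows = [], [], []
    for iid in candidate_items:
        # Predicted individual criterion ratings for (uid, iid), from the phase-1 models.
        crit_rows.append([criterion_models[c].predict(uid, iid).est for c in criteria])
        user_rows.append([user_index[uid]])
        item_rows.append([item_index[iid]])
    # Phase 3: aggregate the predicted criteria into overall ratings with the DeepFM-style model.
    overall = model.predict(
        [np.array(user_rows), np.array(item_rows), np.array(crit_rows)], verbose=0
    ).ravel()
    # Phase 4: rank candidates by predicted overall rating and keep the Top-N.
    ranked = sorted(zip(candidate_items, overall.tolist()), key=lambda t: t[1], reverse=True)
    return ranked[:n]
```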
Key Findings
On ITM-Rec (TEL): Across four configurations (5/10 folds × Top-10/Top-20), DeepFM_SVD++_MCRS and DeepFM_SVD_MCRS outperform SVD++ and SVD on all metrics. Averaged results (Table 4): DeepFM_SVD++_MCRS RMSE 0.8950 vs SVD++ 1.2876 (difference 0.3926; ~30% improvement); MAE 0.7243 vs 1.0525 (0.3281; ~31%); Precision 0.7740 vs 0.6930 (+0.0811; ~12%); Recall 0.9550 vs 0.8472 (+0.1078; ~13%); F1 0.8549 vs 0.7624 (+0.0926; ~12%); MAP 0.9243 vs 0.7819 (+0.1425; ~18%); AUC 0.9159 vs 0.7445 (+0.1714; ~23%); FCP 0.8472 vs 0.6891 (+0.1581; ~23%). DeepFM_SVD_MCRS shows similar gains over SVD (e.g., RMSE 0.9146 vs 1.2906, ~29% improvement; MAE 0.7358 vs 1.0363, ~30%). Correlation analysis: DeepFM_SVD++_MCRS achieves a correlation of 0.777 and DeepFM_SVD_MCRS 0.769 between actual and predicted ratings, versus 0.458 for SVD++ and 0.465 for SVD, indicating substantially better predictive alignment for the proposed models.

On Yahoo Movies (entertainment): Averaged results (Table 8) show DeepFM_SVD++_MCRS RMSE 1.6513 vs SVD++ 2.7001 (~39% improvement), MAE 1.2086 vs 1.9973 (~39%), Precision 0.8795 vs 0.8007 (~10%), Recall 0.9646 vs 0.8789 (~10%), F1 0.9201 vs 0.8380 (~10%), MAP 0.9746 vs 0.8890 (~10%), AUC 0.9597 vs 0.8306 (~16%), and FCP 0.8964 vs 0.7475 (~20%). DeepFM_SVD_MCRS shows comparable improvements over SVD. Correlations: the DeepFM approaches achieve 0.912 and 0.908, versus 0.623 for SVD and SVD++, confirming stronger alignment with actual ratings.

Comparison with related works indicates larger percentage improvements across metrics for the proposed models (Tables 7 and 11).
Discussion
The findings demonstrate that integrating DeepFM with SVD/SVD++ effectively models both low-order and high-order interactions among multi-criteria ratings, addressing key TEL challenges like sparsity, over-specialization, and cold-start. The hybrid architecture’s shared embeddings and non-linear aggregation capabilities result in lower predictive errors (RMSE/MAE), improved classification metrics (Precision/Recall/F1), and stronger ranking performance (MAP/AUC/FCP), translating to more personalized and reliable Top-N recommendations. High correlation coefficients indicate that the learned aggregation function closely captures user preferences. Results on Yahoo Movies confirm generalizability beyond TEL, showing that the approach robustly handles highly sparse data (≈99% sparsity) and diverse criteria structures, suggesting broad applicability across domains where multi-criteria preferences are critical.
Conclusion
The study presents a DeepFM-SVD++ aggregation-function-based MCRS tailored for TEL, which consistently outperforms traditional SVD and SVD++ across rating accuracy, ranking quality, and correlation metrics on ITM-Rec and Yahoo Movies datasets. Contributions include: (1) advancing multi-criteria recommendations in TEL through a hybrid deep learning and factorization approach, (2) achieving significant accuracy and ranking improvements in sparse settings, and (3) demonstrating cross-domain generalization. Future work will extend evaluations to large-scale MOOC platforms (Coursera, edX, Udemy), incorporate context-aware mechanisms and advanced architectures (DCN, DIN, xDeepFM) for richer feature interaction modeling, and conduct ablation studies to quantify component-wise contributions.
Limitations
Direct comparison with prior studies is constrained by differences in datasets and model architectures, as many TEL works rely on single-rating techniques. The evaluation is limited to two datasets (ITM-Rec and Yahoo Movies), without reported ablation analyses to isolate component contributions. Context-aware factors were not integrated, and additional large-scale TEL datasets (e.g., MOOCs) were not explored, leaving scalability and robustness in broader educational settings for future investigation.