Medicine and Health
Towards fairness-aware and privacy-preserving enhanced collaborative learning for healthcare
F. Zhang, D. Zhai, et al.
Federated Learning can harness distributed patient data while preserving privacy, but disparities in computing resources risk unequal AI outcomes. We introduce a resource-adaptive collaborative learning framework that dynamically matches varying institutional capacities to improve model accuracy and fairness. This research was conducted by Feilong Zhang, Deming Zhai, Guo Bai, Junjun Jiang, Qixiang Ye, Xiangyang Ji, and Xianming Liu.
Introduction
AI’s increasing use in medicine and healthcare raises critical ethical concerns around privacy and fairness, as models trained on sensitive patient data risk breaches and biased decision-making. Federated Learning (FL) allows collaborative training without centralizing data, aligning with HIPAA and GDPR, and can promote fairness by learning from diverse, distributed datasets. Yet, computational resource heterogeneity across institutions introduces implicit fairness issues: traditional homogeneous FL with a single large global model excludes weaker clients or underutilizes stronger ones, while heterogeneous FL still leaves performance gaps between strong and weak clients. Additional challenges include large communication overhead and vulnerability to gradient inversion attacks under uniform architectures. The paper articulates five principles for a fair and sustainable FL system: equal opportunity, fair contribution, shared fruits, equal model test accuracy, and sustainability. Motivated by these challenges, the work seeks an FL framework that adapts to diverse local model capacities, enables effective lossless knowledge exchange across heterogeneous models, and ensures fairness, privacy, and efficiency.
Literature Review
Traditional FL such as FedAvg assumes uniform model capacity, creating participation and efficiency trade-offs. Heterogeneous FL strategies include knowledge distillation-based methods (DSFL, FedET, IncluFL, FedMD, FedDF) that emphasize communication efficiency and personalization, and network pruning-based techniques (HeteroFL, DepthFL, FedRolex) that adapt model width/depth for heterogeneous resources. Recent methods FCCL, pFedHR, and FedTGP further address heterogeneity. However, distillation generally cannot transfer knowledge losslessly across different architectures, leading to accuracy degradation, and pruning can cause feature-space mismatch between sub-networks and full networks. Privacy vulnerabilities also persist due to uniform architectures enabling gradient inversion attacks.
Methodology
The paper proposes Dynamic Federated Learning (DynamicFL), a unified framework enabling fairness-aware, privacy-preserving collaboration among clients with diverse computational resources through lossless re-parameterization.
- Overall workflow: A server initializes and broadcasts a plain global model (e.g., a VGG-style CNN). Each client adapts this model to its resource budget via structural re-parameterization, forming a multi-branch local architecture, trains locally, and then re-parameterizes back to the original global architecture for aggregation (a minimal sketch of one round appears after this list).
- Heterogeneous Local Training: Clients dynamically expand their local models into multi-branch structures based on available compute, selecting the operations that contribute most to performance so as to balance accuracy and efficiency. Gradients are recorded to quantify each branch's sensitivity to local and global knowledge: S_i^D (sensitivity on the local dataset) and S_i^E (sensitivity to global aggregation). Using these signals, clients modulate their local structures with DYMM (Dynamic Model Modulation, detailed below), removing redundant branches and expanding important ones to fully utilize available resources. Training proceeds for E epochs, with provisions for clients experiencing reduced resources to perform a simplified re-parameterization.
- Homogeneous Global Aggregation: After local training, each client transforms its local model back to the original global architecture via equivalent re-parameterization, ensuring all uploaded models share the same structure. The server performs simple averaging to update the global model, avoiding server-side distillation and preserving knowledge without transfer overhead.
- Dynamic Model Modulation (DYMM): The importance and redundancy of branches are measured via gradient-based salience metrics computed on the local and re-parameterized models. Redundant branches (low local and global salience) are merged; important branches (low local but high global salience) are expanded while preserving output equivalence (an illustrative salience sketch appears after this list). For transformers, a fixed local re-parameterization strategy is kept for stability unless resources change.
- Lossless Knowledge Transfer: Re-parameterization leverages the additivity of convolution, I ⊗ F^(1) + I ⊗ F^(2) = I ⊗ (F^(1) + F^(2)), and the ability to convert pooling and normalization layers into equivalent convolutions. For transformers, the additivity of linear layers and the absorption of normalization enable parallel branches to be expanded or merged while maintaining identical input-output mappings (see the convolution-merging check after this list). Outputs therefore remain unchanged during structural transformations, ensuring lossless knowledge transfer and identical test accuracy across clients.
- Privacy protection: Because local models are heterogeneous and their exact architectures are unknown to the server, gradient inversion attacks are impeded; the server cannot reconstruct client data from gradients as in vanilla uniform FL.
- Convergence analysis: Under standard FL assumptions (L-smoothness, bounded gradient variance), re-parameterization yields virtual sequences whose deviation from the global-architecture parameters is bounded. The theoretical results show that DynamicFL matches FedAvg's convergence rate of O(1/√(NT)) (a schematic form is given after this list), with empirical evidence of faster and more stable convergence because lossy cross-model knowledge transfer is avoided.
- Algorithm: The paper provides Algorithm 1 specifying communication rounds, local training steps with DYMM, re-parameterization procedures for local and global models, and averaging aggregation.
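To make the workflow concrete, here is a minimal, hypothetical sketch of one DynamicFL communication round in PyTorch. The helper names (expand_to_budget, train_local, reparam_to_global) are placeholders for the client-side procedures described above, not the paper's actual API; only the simple-averaging aggregation mirrors the description directly.

```python
# Hypothetical sketch of one DynamicFL communication round. Client methods
# (expand_to_budget, train_local, reparam_to_global) are placeholders.
import copy
import torch


def average_state_dicts(states):
    """Element-wise average of structurally identical state dicts (FedAvg-style)."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in states]).mean(dim=0)
    return avg


def dynamicfl_round(global_model, clients):
    uploaded = []
    for client in clients:
        # 1. Client receives the plain global architecture.
        local = copy.deepcopy(global_model)
        # 2. Expand into a multi-branch model matched to the client's budget
        #    via structural re-parameterization (placeholder call).
        local = client.expand_to_budget(local)
        # 3. Local training for E epochs, with DYMM adjusting branches.
        client.train_local(local)
        # 4. Losslessly re-parameterize back to the global architecture.
        uploaded.append(client.reparam_to_global(local).state_dict())
    # 5. Server aggregates by simple averaging; no server-side distillation.
    global_model.load_state_dict(average_state_dicts(uploaded))
    return global_model
```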
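The next sketch illustrates, under stated assumptions, how gradient-based branch salience could drive DYMM's merge/expand decisions. The metrics standing in for S_i^D and S_i^E and the thresholds used here are illustrative only; the paper's exact formulations may differ.

```python
# Illustrative gradient-based branch salience and DYMM-style decisions.
# Thresholds and the salience formula are arbitrary stand-ins.
import torch


def branch_salience(branch_params, loss):
    """Mean absolute gradient of a branch's parameters w.r.t. a loss."""
    grads = torch.autograd.grad(loss, branch_params, retain_graph=True,
                                allow_unused=True)
    grads = [g.abs().mean() for g in grads if g is not None]
    return torch.stack(grads).mean().item() if grads else 0.0


def dymm_decision(s_local, s_global, low=0.01, high=0.1):
    """Map local/global salience to a structural action (illustrative only)."""
    if s_local < low and s_global < low:
        return "merge"    # redundant branch: fold into the main path
    if s_local < low and s_global >= high:
        return "expand"   # important for global knowledge: add capacity
    return "keep"
```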
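The convolution additivity underlying lossless re-parameterization can be checked directly: summing two parallel convolution branches is equivalent to a single convolution whose kernel and bias are the sums of the branch kernels and biases. A minimal PyTorch check, with arbitrary layer sizes chosen for illustration:

```python
# Verify I ⊗ F1 + I ⊗ F2 == I ⊗ (F1 + F2) for parallel conv branches.
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)
conv_a = nn.Conv2d(8, 16, 3, padding=1)
conv_b = nn.Conv2d(8, 16, 3, padding=1)

# Multi-branch output: sum of the two parallel branches.
y_branches = conv_a(x) + conv_b(x)

# Merged single conv: add kernels and biases.
merged = nn.Conv2d(8, 16, 3, padding=1)
with torch.no_grad():
    merged.weight.copy_(conv_a.weight + conv_b.weight)
    merged.bias.copy_(conv_a.bias + conv_b.bias)

y_merged = merged(x)
print(torch.allclose(y_branches, y_merged, atol=1e-5))  # True up to float error
```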
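For reference, a schematic LaTeX statement of the stated convergence rate, where N is the number of clients and T the number of communication rounds; the exact bounded quantity and constants follow the paper's theorem and are not reproduced here.

```latex
% Schematic form of the stated rate: with N clients and T communication
% rounds, the optimality gap of the aggregated global model shrinks as
% O(1/sqrt(NT)), matching FedAvg.
\[
  \mathbb{E}\!\left[ F(\bar{w}_T) \right] - F^{\star}
  \;\le\; \mathcal{O}\!\left( \frac{1}{\sqrt{NT}} \right)
\]
```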
Key Findings
Across extensive experiments, DynamicFL demonstrates superior accuracy, fairness, privacy robustness, and scalability.
- Equal wall-clock time on medical datasets (Table 1):
• CancerSlides: DynamicFL 87.37 ± 0.86% (IID) vs best baseline FCCL 81.83% (+5.54); 80.26 ± 0.57% (Non-IID) vs FCCL 74.65% (+5.61).
• ChestXray: DynamicFL 87.92 ± 0.79% (IID) vs FCCL 82.52% (+5.40); 83.41 ± 0.92% (Non-IID) vs pFedHR 76.46% (+6.95).
• BloodCell: DynamicFL 88.64 ± 0.44% (IID) vs FedRolex 82.73% (+5.91); 81.76 ± 0.71% (Non-IID) vs FCCL 75.41% (+6.35).
- Performance under diverse client distributions (Table 2): DynamicFL consistently outperforms baselines across ResNet-18 and ViT-Base for ratios 7:2:1, 5:2:3, 4:1:5, 4:3:3, 3:6:1. For ResNet-18, DynamicFL achieves 82.12%, 83.34%, 80.12%, 83.45%, 81.23%, exceeding best baselines by large margins (e.g., +7.67% over pFedHR at 7:2:1). For ViT-Base, DynamicFL achieves 80.45%, 81.34%, 78.89%, 82.45%, 79.67%.
- Generalization across CNNs (Table 3): On LeNet-5, GoogLeNet, and ResNet-18, DynamicFL attains top accuracies across CIFAR-10/100 and BloodCell in IID and Non-IID, while using smaller global models (e.g., LeNet-5 global parameters 0.6M vs 1.8M for baselines; ResNet-18 global parameters 11.7M vs 35.1M), reducing communication overhead.
- Transformer scalability (Table 4): On BloodCell and CancerSlides, DynamicFL scales from ViT-Tiny to ViT-1B. BloodCell IID accuracies: 73.11 (Tiny), 82.52 (Small), 87.02 (Base), 94.77 (1B); Non-IID (Dirichlet 0.7): 65.39, 76.09, 85.09, 91.26. CancerSlides IID: 79.64 (Tiny), 83.21 (Small), 85.52 (Base), 92.64 (1B); Non-IID: 77.45, 81.34, 82.42, 89.53.
- Fairness among clients: DynamicFL yields identical test accuracy for strong, medium, and weak clients, unlike baselines (e.g., FedET shows 6.31% higher accuracy on strong clients than on weak ones), ensuring fair sharing of training outcomes.
- Robustness under varying Non-IID (Table 5): DynamicFL consistently achieves highest accuracy for ResNet-18 and ViT-Base across β = 0.1–0.9, with larger gains in highly heterogeneous settings (e.g., ResNet-18 +2.49% at β=0.1; +1.73% at β=0.5; ViT-Base +4.31% at β=0.1; +2.56% at β=0.5).
- Scalability with more clients (Table 6): With 60/120/180 clients, DynamicFL maintains superiority. Examples: ResNet-18 on BloodCell at 180 clients: 72.69% (DynamicFL), exceeding best baselines by 3.24–8.91%; CancerSlides at 180 clients: 76.59%. ViT-Base on BloodCell at 180 clients: 75.39% vs best baseline 68.34% (+7.05).
- Convergence: DynamicFL converges faster and more stably than distillation/pruning baselines due to lossless cross-model knowledge transfer, achieving higher final accuracy.
- Privacy: Against gradient inversion attacks, vanilla FL exhibits significant data leakage, whereas DynamicFL’s server-unaware local architectures hinder reconstruction, enhancing privacy.
Discussion
DynamicFL directly addresses fairness and privacy issues caused by resource heterogeneity in healthcare FL. By enabling clients to train at full capacity and then losslessly re-parameterize to a common global architecture, DynamicFL equalizes test accuracy across resource tiers, meeting the paper’s fairness principles (equal opportunity, fair contribution, shared fruits, equal accuracy, sustainability). The smaller global models reduce communication overhead while preserving or improving accuracy. Theoretical analysis shows convergence comparable to FedAvg, with empirical faster, more stable convergence. Privacy is strengthened via architectural heterogeneity and server-side ignorance of local models, impeding gradient inversion attacks without heavy cryptographic overhead. While a “no free lunch” perspective suggests trade-offs among utility, privacy, and efficiency, DynamicFL demonstrates simultaneous improvements across these objectives, positioning it as a robust solution for equitable, trustworthy AI in healthcare.
Conclusion
The paper introduces DynamicFL, a resource-adaptive, fairness-aware, and privacy-preserving federated learning framework that ensures lossless knowledge transfer across heterogeneous local models and homogeneous aggregation into a lightweight global model. Extensive experiments across medical and benchmark datasets and diverse architectures show superior accuracy, fairness among clients, robustness to Non-IID distributions, enhanced privacy, faster convergence, and reduced communication overhead. Future work will explore asynchronous, model-heterogeneous FL to alleviate synchronization constraints and adaptive scheduling mechanisms to further reduce communication overhead for real-world deployments.
Limitations
Limitations include the need for synchronized communication among participating institutions in current experiments, which may hinder real-world asynchronous settings. Frequent parameter transmission introduces communication overhead despite the lightweight global model. The absence of direct baseline comparisons for very large transformers (e.g., ViT-1B) may bias conclusions about scalability. Structural re-parameterization adds local computation costs. Privacy benefits rely in part on the server’s lack of knowledge of client architectures. Clients with dynamically reduced resources may require simplified steps, potentially affecting local optimization efficiency.