Invisible Alignments: Data Privacy and Federated Learning in Global Health Surveillance
- christopherstevens3
- Jun 18
- 40 min read
Updated: Jun 24

Introduction
In recent years, the rise of artificial intelligence (AI) has sparked a transformative shift in global health surveillance. From real-time disease monitoring and outbreak detection to improved resource allocation, AI-powered systems, such as the World Health Organization's (WHO) outbreak intelligence platforms and diagnostic tools, have become indispensable in today's public health arsenal (2025). For example, networks like HealthMap (2025) and Canada’s Global Public Health Intelligence Network (2025) now harness AI to sift through online reports, social media, and medical data, flagging emerging threats well before traditional mechanisms intervene.
However, these benefits come with significant challenges in data governance and data privacy. Centralized health surveillance systems often rely on aggregating sensitive, individual-level health data (WHO, 2017). This practice can pose privacy risks, data sovereignty concerns, and ethical dilemmas. Historically, digital surveillance tools have drawn scrutiny for their potential, absent proper regulation, to perpetuate inequities or infringe on civil liberties. Against this backdrop, federated learning (FL) has emerged as a compelling alternative.
FL enables model training across disparate, decentralized data sources, such as hospitals or national health databases, without requiring the transfer of raw data. This approach maintains data localization while still yielding a globally optimized model (Sadilek et al., 2021). Pioneering studies applying FL to clinical datasets have shown that privacy-sensitive research can be conducted effectively on disease patterns, imaging, and epidemiological surveillance without compromising individual privacy. However, while FL offers a promising remedy to centralization issues, it is not a panacea.
As federated systems expand across borders and institutions, they surface a new spectrum of legal, operational, and jurisdictional complexities. Different countries maintain varying data privacy and data protection frameworks, and the implementation requirements differ widely. Aspects like cryptographic aggregation and software governance can introduce vulnerabilities. In public health surveillance contexts, where FL might involve sharing updates among hospitals in multiple countries, these tensions are amplified.
This article guides readers through the legal, technical, and ethical dimensions of FL as applied to global health surveillance. It offers a structured pathway from foundational concepts to jurisdiction-specific compliance strategies and forward-looking policy solutions. It examines FL in the context of data privacy, data protection, and global health surveillance.
Additionally, it explores how FL’s decentralized design can mitigate some risks associated with centralized data, and it examines how FL can create complex institutional, legal, regulatory, and technical challenges. FL offers substantial advantages in terms of data privacy, data protection, and data sovereignty. It also requires careful navigation using effective legal and regulatory frameworks, best operational practices, and international coordination. Note: Please see Appendix 1’s “Glossary of Key Terms” for the terminology used throughout the article.
Why Federated Learning in Global Health Surveillance Requires Immediate Attention
Despite the absence of public enforcement actions, FL in global health surveillance is already under active scrutiny from institutional researchers, policymakers, and regulators. Its relevance lies not in retrospective penalties, but in the immediate challenges posed by real-world deployments. FL, as it applies to data privacy and global health surveillance, remains a topic of interest for the following reasons:
Anticipating Regulation Before Harm Emerges: Although FL is often lauded for minimizing raw data exposure, it does not eliminate privacy risks. Scholars have identified vulnerabilities, including gradient leakage, model inversion, and metadata exposure. These vulnerabilities have prompted calls for proactive legal and regulatory alignment even in the absence of formal enforcement mechanisms (Rieke et al., 2020; Kairouz et al., 2021). Waiting for legal penalties before addressing FL’s design limitations risks avoidable harm in sensitive health contexts.
Cross-Border Legal Tensions Already Surfacing: FL crosses national boundaries by design. Even though it limits direct data transfers, model updates may still constitute personal data under laws and regulations such as Brazil’s General Data Protection Law (LGPD), China’s Personal Information Protection Law (PIPL), and the European Union’s General Data Protection Regulation (EU GDPR), triggering localization rules, consent requirements, and ambiguity regarding data controller and data processor roles (Kaissis et al., 2020; Lieftink et al., 2024). These cross-border tensions underscore the need for clarified compliance pathways.
Regulatory Alignment with Federated Architecture: Despite these jurisdictional frictions, FL's technical architecture offers inherent benefits that align with the objectives of data privacy and data protection. Training models locally and exchanging only encrypted or aggregated updates supports compliance with data minimization mandates under Article 5(1)(c) of the EU GDPR, Section 8 of India’s Digital Personal Data Protection Act (DPDPA), and Article 6 of Brazil’s LGPD (IAPP, 2020; Intersoft Consulting, 2025a; Nishith Desai Associates, 2025).
FL’s design also reduces reliance on cross-border raw data transfers, aligning with localization rules under China’s PIPL and similar frameworks. This makes FL appealing to regulators seeking data sovereignty and legal compatibility in collaborative health initiatives.
Regulators Are Paying Attention: FL is no longer operating in a legal or regulatory vacuum.
The European Health Data Space (EHDS), effective March 26, 2025, codifies federated infrastructures for the secondary use of health data across borders, requiring secure, privacy-preserving systems by design (European Commission, 2025).
Likewise, the DARWIN EU platform, led by the European Medicines Agency since 2022, operationalizes FL for pharmacovigilance, retaining data locally and sharing only aggregated results in line with EU GDPR standards (DARWIN EU, 2025; European Medicines Agency, 2025).
In Canada, the Public Health Agency of Canada’s (PHAC) Blueprint and Vision 2030 endorse a federated, interoperable data system to support epidemic response and cross-jurisdictional surveillance (Government of Canada, 2025; Government of Canada, 2016).
However, researchers have shown that even without raw data transfers, FL systems can leak sensitive information through model gradients, labels, or metadata, highlighting that FL is not inherently privacy-safe and requires rigorous privacy engineering from the outset (Geiping et al., 2020; Zhu et al., 2019; Jiang et al., 2025).
Ethical and Societal Stakes Are Rising: As FL expands into genomics, mental health analytics, and global outbreak detection, the ethical stakes rise in parallel. Unintended misuse could compromise civil liberties, damage international cooperation, or erode public trust. These concerns require governance frameworks that extend beyond compliance, prioritizing fairness, explainability, and data sovereignty (Dayan et al., 2021; Secure Privacy, 2025).
In sum, while FL has yet to face formal enforcement actions in the context of global health surveillance, it is already under meaningful regulatory and institutional review. Proactive assessment and regulation are critical because the ethical, legal, and operational stakes of FL are not theoretical. They are already here.
Federated Learning in Health Surveillance: A Primer
Emerging at the intersection of cutting-edge AI and sensitive health data, FL offers a transformative approach for building collaborative surveillance models that theoretically preserve individual privacy. This primer explores its mechanics, real-world applications, and why FL inherently aligns with the goals of public health surveillance.
FL reimagines AI model training. Individual institutions — whether hospitals, public health agencies, or edge devices — train local model updates (weights or gradients) on-site, rather than pooling raw data centrally. These updates are then sent to a central aggregator, which synthesizes them into a global model and redistributes it for further refinement (NVIDIA, 2021).
FL supports two architectures:
Centralized FL: A coordinating server aggregates model updates and redistributes the updated global model (Li et al., 2025).
Decentralized FL: Peers collaborate via consensus protocols or multi-party computations without a single central authority (Li et al., 2025).
This architecture is resilient to data heterogeneity. Medical datasets vary widely across institutions, in both size and statistical distribution. FL's aggregation algorithms (e.g., Federated Averaging) are designed to handle such diversity (Hosseini, 2022; McMahan et al., 2017).
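To make the aggregation step concrete, below is a minimal, illustrative sketch of Federated Averaging in Python. The hospital payloads and the `federated_average` helper are hypothetical names for this example only; real systems add encryption, secure aggregation, and many training rounds. The sketch shows just the sample-weighted averaging logic described above.

```python
# A minimal sketch of one Federated Averaging (FedAvg) round.
# All names (hospital_updates, federated_average) are illustrative, not a real API.
import numpy as np

def federated_average(client_updates):
    """Combine local model weights into a global model, weighting each
    client by its number of training samples (per McMahan et al., 2017)."""
    total = sum(n for _, n in client_updates)
    # Weighted sum of each client's parameter vector.
    return sum(w * (n / total) for w, n in client_updates)

# Example round: three institutions report (weights, sample_count); raw data
# never leaves the institution -- only these parameter vectors are shared.
hospital_updates = [
    (np.array([0.20, -1.10, 0.50]), 1200),   # large urban hospital
    (np.array([0.25, -0.90, 0.40]),  300),   # regional clinic
    (np.array([0.18, -1.00, 0.55]),  500),   # research center
]
global_model = federated_average(hospital_updates)
print(global_model)  # new global weights, redistributed for the next round
```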
Figure 1 illustrates the FL workflow across decentralized health institutions. Each node trains a local model without sharing raw data, then transmits encrypted model updates to a central aggregator. The global model is created by combining these updates and is redistributed back to participants. This privacy-preserving cycle enables collaborative AI development without compromising data sovereignty or violating cross-border data privacy and data protection laws and regulations.
Figure 1. How FL Works in Health Surveillance

Public Health Use Cases: FL is no longer just theoretical. It has already shown real-world impact across a spectrum of public health initiatives:
COVID‑19 Prognosis with the EXAM Model: A landmark multicenter FL study developed the EXAM model (EMR + Chest X-ray AI Model), trained on health data from 20 hospitals worldwide using inputs such as vitals, lab results, and chest X-rays. The model predicted oxygen needs within 24 and 72 hours, achieving AUC > 0.92. Remarkably, it outperformed single-site models, boosting generalizability by ~38% (Dayan et al., 2021).
Privacy-Preserving Medical Imaging: Researchers applied differentially private FL to histopathology images using The Cancer Genome Atlas Program dataset. The collaborative model achieved parity with centralized models while implementing robust privacy protections (National Cancer Institute, 2025). Similar frameworks support segmentation of whole-slide images using secure multi-party computation (Hosseini, 2022).
Syndromic Surveillance Across Jurisdictions: FL enables syndromic surveillance, such as tracking emergency department admissions, over-the-counter medication purchases (Henning, 2004), and self-reported symptoms in a federated fashion. Regions contribute model updates, preserving sovereignty while fueling global trend analyses.
Why FL Is a Natural Fit for Global Health Surveillance: FL offers three synergistic benefits in public health surveillance:
Legal Compliance & Data Minimization: Since raw data remains local, FL aligns naturally with data privacy and data protection frameworks (e.g., EU GDPR, Health Insurance Portability and Accountability Act (HIPAA), etc.), upholding key mandates such as data minimization (Kairouz et al., 2021).
Reduced Ethical Burden: By limiting data exposure, FL diminishes reliance on elaborate sharing agreements and intensive consent protocols, though encryption and differential privacy are still vital to mitigate risks like gradient leakage (Hosseini, 2022).
Respecting Data Sovereignty: Institutions maintain control over their data, reducing concerns over cross-border data transfers (Choudhury, 2019).
Observation: While FL aligns naturally with legal and ethical priorities in surveillance, its operational and legal complexity grows as models scale, especially across jurisdictions. The following section examines key tensions arising from FL’s decentralized architectures, focusing on real-world implementation obstacles, cross-border compliance challenges, and the risk landscapes that emerge when local data infrastructures must collaborate at scale.
Misconceptions of Built-in Privacy in FL
FL is often pitched as a “privacy-first” solution simply because raw data remains local. However, framing FL as inherently privacy-safe is misleading. Below, we unpack the core misconceptions and reveal why FL still exposes sensitive information through gradients, models, and metadata.
Hidden Risks in FL: Although FL minimizes direct exposure of raw data, it does not eliminate privacy threats. Sensitive information can leak through indirect channels such as model updates, gradients, or metadata. Table 1 summarizes FL gradient leakage and privacy attack vectors:
Table 1. Gradient Leakage and Privacy Attack Vectors in Federated Learning
Attack Type | Method Description | Risk Level | Targeted Data | Common Mitigations |
DLG (Deep Leakage from Gradients) | Uses optimization and backpropagation reversal to reconstruct training samples from shared gradients. | High | Raw input images or text | Differential Privacy (DP), Secure Aggregation |
FGL (Faster Gradient Leakage) | Reconstructs private data in seconds using optimization (e.g., cosine similarity); more efficient than DLG. | Very High | Input features and labels | DP with tighter ε, gradient clipping |
Label Inference (LI) | Infers class labels from gradients, even when model updates are encrypted. | Medium–High | Class labels | DP, model sparsity, label obfuscation |
Metadata Leakage | Exploits timing, update frequency, or round differences to infer client identity or data properties. | Medium | Client identity, participation | Metadata padding, update masking |
Model Inversion | Uses access to the global model and gradients to synthesize or approximate training inputs. | High | Input features, visual patterns | Limited model access, DP, encrypted computation |
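To illustrate why Table 1 rates gradient leakage so highly, below is a simplified, self-contained sketch of a DLG-style gradient inversion attack (in the spirit of Zhu et al., 2019) against a toy linear classifier in PyTorch. It assumes the attacker observes a single record's gradients and a recent PyTorch version (CrossEntropyLoss with probability targets); real attacks extend this to deep models and batched updates.

```python
# Minimal sketch of gradient inversion (DLG-style) on a toy model.
# Illustrative only; real attacks target deeper models and larger batches.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)            # toy "shared" model
criterion = torch.nn.CrossEntropyLoss()

# The victim's private record produces the gradients a server would observe.
x_real = torch.randn(1, 4)
y_real = torch.tensor([1])
real_grads = torch.autograd.grad(criterion(model(x_real), y_real),
                                 model.parameters())

# The attacker optimizes a dummy record until its gradients match.
x_fake = torch.randn(1, 4, requires_grad=True)
y_fake = torch.randn(1, 2, requires_grad=True)   # soft label, also recovered
opt = torch.optim.LBFGS([x_fake, y_fake])

def closure():
    opt.zero_grad()
    fake_grads = torch.autograd.grad(
        criterion(model(x_fake), y_fake.softmax(dim=-1)),
        model.parameters(), create_graph=True)
    # Distance between observed and dummy gradients drives the reconstruction.
    loss = sum(((fg - rg) ** 2).sum() for fg, rg in zip(fake_grads, real_grads))
    loss.backward()
    return loss

for _ in range(50):
    opt.step(closure)
print(x_real)
print(x_fake.detach())   # reconstruction approximates the private record
```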
What Research Reveals - FL’s Privacy Vulnerabilities:
While FL offers a promising alternative to centralized data collection by keeping raw data on-device, it is not a panacea for privacy. An expanding body of empirical research has revealed critical vulnerabilities that challenge the assumption that FL inherently ensures data protection.
Numerous studies have shown that gradients, model updates, and auxiliary metadata exchanged during training can be reverse-engineered, potentially exposing sensitive features and class labels and, in some cases, re-identifying individuals (Geiping et al., 2020; Truex et al., 2019; Zhu et al., 2019).
These findings raise serious concerns, particularly in high-risk domains such as public health surveillance, where even minimal data leakage could compromise patient confidentiality or lead to public misinformation. Before deploying FL in such environments, it is essential to rigorously assess its privacy guarantees, threat models, and mitigation strategies (Kairouz et al., 2021).
Recent empirical studies have further demonstrated that model gradients, even when aggregated or encrypted, can still leak sensitive information. Table 2 below summarizes the key attack vectors that challenge the perceived privacy guarantees of FL systems, particularly in high-risk domains such as healthcare and critical infrastructure.
Table 2: FL Privacy Vulnerabilities
Vulnerability | Description | Research Source |
Gradient reconstruction attacks | Methods like DLG, iDLG, DLG-FB, SPEAR, and FedLeak recover raw training data by inverting shared gradients. They can even recover them from multi-image batches. | DLG, iDLG, DLG‑FB; SPEAR exact batch inversion; FedLeak on real-world protocols |
Rapid leakage on real-world data | Tools like FedLeak can reconstruct high-resolution images from large datasets in realistic FL setups. | FedLeak empirical evaluation |
Label inference under secure aggregation | Even encrypted gradients via Secure Aggregation (SA) can leak labels through methods like LIA-SA and batch label restoration. | LIA-SA attack; ICLR batch label restoration |
Metadata enables re‑identification | Metadata patterns across updates can expose individual clients or institutions, even under SA. | Category inference & pattern clustering |
These findings dismantle the myth that "local data = safe data." In truth, model updates leak far more than is often acknowledged. What follows is essential: strategies to safeguard against these threats, such as secure aggregation, differential privacy, trusted execution environments, and metadata obfuscation (a minimal secure-aggregation sketch appears below).
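As one example of these safeguards, the sketch below shows the pairwise-masking idea behind secure aggregation: each client hides its update under random masks that cancel only in the server's sum. This is a minimal illustration assuming every client pair has pre-shared a random seed; production protocols (e.g., Bonawitz et al.'s secure aggregation) add key agreement, authentication, and dropout recovery.

```python
# Minimal sketch of pairwise-masked secure aggregation: individual masked
# updates look like noise, but the pairwise masks cancel in the server's sum.
import numpy as np

def masked_update(update, client_id, all_ids, seeds, dim):
    masked = update.astype(np.float64)
    for other in all_ids:
        if other == client_id:
            continue
        # Both members of a pair derive the SAME mask from their shared seed.
        rng = np.random.default_rng(seeds[frozenset((client_id, other))])
        mask = rng.normal(size=dim)
        # Lower-id client adds the mask, higher-id subtracts it, so each
        # pairwise mask cancels exactly when the server sums all updates.
        masked += mask if client_id < other else -mask
    return masked

ids, dim = [0, 1, 2], 3
seeds = {frozenset((i, j)): 1000 + 10 * i + j
         for i in ids for j in ids if i < j}           # toy pre-shared seeds
updates = [np.array([0.1, 0.2, 0.3]),
           np.array([0.4, 0.5, 0.6]),
           np.array([0.7, 0.8, 0.9])]

masked = [masked_update(u, cid, ids, seeds, dim) for cid, u in zip(ids, updates)]
print(masked[0])        # looks like noise to the server
print(sum(masked))      # ~= [1.2, 1.5, 1.8], the true aggregate
print(sum(updates))     # matches: only the sum is revealed
```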
Table 3 maps known FL privacy and compliance risks, such as gradient leakage and label inference, to corresponding technical mitigations, including audit trails, blockchain logging, DP, and SMPC.
Table 3: FL Compliance Risks and Corresponding Technical Safeguards
Risk Type | Differential Privacy (DP) | Secure Multiparty Computation (SMPC) | Blockchain Logging | Audit Trails / Provenance |
Gradient Leakage (DLG, FGL) | ✅ Strong mitigation (noise added to updates) | ✅ Protects updates in transit | ⚠️ Indirect support via tamper-evident logs | ⚠️ Helpful post-event, not preventative |
Label Inference (LI) | ✅ Obscures training labels in gradients | ✅ Secures label-associated updates | ⚠️ Traceable update paths but no direct protection | ✅ Can document exposed labels if breached |
Metadata Leakage | ⚠️ Partially mitigated with obfuscation | ⚠️ Does not directly address timing leaks | ✅ Secures update timing/history | ✅ Critical for post-hoc analysis and node traceability |
Model Inversion | ✅ Limits data reconstruction from outputs | ⚠️ Offers no output-level protection | ⚠️ Secures lineage but not inversion directly | ✅ Enables tracing of impacted models/updates |
Cross-Client Re-identification | ⚠️ Only partial (not structural mitigation) | ✅ Secures node-specific update content | ✅ Records node update history | ✅ Tracks model lineage to source nodes |
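Table 3's strongest mitigation, differential privacy, typically follows a clip-and-noise pattern on each client's update before it is shared. The sketch below is illustrative only: `clip_norm` and `noise_multiplier` are placeholder values, and real deployments calibrate them to a formal (ε, δ) privacy budget.

```python
# Minimal sketch of client-side differential privacy for FL updates:
# clip the update's L2 norm, then add Gaussian noise before sharing
# (the clip-and-noise pattern behind DP-SGD / DP-FedAvg).
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or np.random.default_rng()
    # 1. Clip: bound any single client's influence on the aggregate.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    # 2. Noise: Gaussian noise scaled to the clipping bound masks what
    #    remains of the individual contribution.
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

raw_update = np.array([0.9, -2.4, 1.3])   # could leak via inversion attacks
safe_update = privatize_update(raw_update)
print(safe_update)   # this noisy, norm-bounded vector is all the server sees
```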
General Challenges and Risks in FL
Although FL offers a compelling alternative to centralized data processing, it introduces significant risks that span technical, ethical, and operational domains. The following subsections examine five critical challenge areas: security, bias, infrastructure, governance, and transparency. Practitioners must address these challenges to ensure FL’s safe and scalable adoption. Together, these challenges underscore that the success of FL relies not only on algorithmic innovation but also on robust governance, cross-disciplinary oversight, and resilient system design.
Bias - Unequal or Non-Representative Node Data: FL inherently faces the following bias risks due to the heterogeneity of data across nodes:
Data Bias: Local datasets may reflect a narrow subpopulation (e.g., one hospital treating a specific demographic), leading to global models that are misaligned with underserved groups (Mukhtiar et al., 2025).
Participation Bias: Nodes may drop out or selectively participate in training rounds; when underrepresented or minority nodes disengage, the global model may drift away from their data distribution, leading to performance degradation and representational bias (Li et al., 2020; Mohri et al., 2019).
Observation: Bias does not just reduce accuracy; it undermines fairness, equity, and trust. Addressing this requires careful weighting, fairness-aware aggregation, and regular audits to detect and mitigate bias.
Governance - Unclear Control and Audit Limitations: FL’s decentralized design presents real-world governance challenges related to accountability and auditability. These are reflected in the following key points:
Auditing Challenges: Without centralized datasets, verifying input provenance is challenging, yet compliance regimes such as the EU GDPR and HIPAA mandate audit trails and traceability (European Data Protection Supervisor, 2025; Secure Privacy, 2025).
Control Ambiguity: Responsibility for model behavior and outcomes in FL remains ambiguous, raising unresolved legal and ethical questions about whether accountability lies with the participating client organizations, the central aggregator, or the system designers (Woisetschläger et al., 2024).
Liability and Regulatory Alignment: Under legislation such as the EU Artificial Intelligence Act, FL arrangements require both data controllers and data processors to define their roles, responsibilities, and accountability frameworks explicitly. This challenge remains under active legal and technical scrutiny (Rieke et al., 2020; Woisetschläger et al., 2024).
Infrastructure - Edge Devices & Network Latencies: Deploying FL across a wide range of devices and network conditions presents significant infrastructure challenges that directly impact model performance, scalability, and system resilience.
Computational Constraints: Edge devices such as smartphones and Internet of Things (IoT) sensors typically have limited processing power, memory, and battery life, which restrict their ability to support resource-intensive model training tasks (Aldrees et al., 2025; Sah & Fotouhi, 2025). These constraints necessitate the use of lightweight models, quantization, or split learning to ensure local feasibility.
Communication Bottlenecks: Frequent model updates in FL can overwhelm low-bandwidth or unstable networks, resulting in latency, synchronization delays, and client dropout. These problems are most acute in mobile, industrial IoT, and multi-access edge computing (MEC) environments, where connectivity is often intermittent and infrastructure is limited (Kasarla et al., 2025; Sah & Fotouhi, 2025).
Dynamic Resource Management: Recent work has shown that incorporating mutable resource allocation and asynchronous scheduling in edge-adaptive FL frameworks effectively mitigates communication backlogs and improves convergence across heterogeneous nodes (Aldrees et al., 2025).
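As a concrete example of the lightweight remedies noted above, the sketch below quantizes a model update to 8 bits before transmission, cutting per-round bandwidth roughly fourfold at the cost of bounded rounding error. It is a minimal illustration, not a production codec.

```python
# Minimal sketch of 8-bit quantization for FL model updates on
# bandwidth-constrained edge devices.
import numpy as np

def quantize(update):
    """Map float32 values onto 256 evenly spaced levels (uint8)."""
    lo, hi = float(update.min()), float(update.max())
    scale = (hi - lo) / 255 or 1.0          # guard against constant updates
    q = np.round((update - lo) / scale).astype(np.uint8)
    return q, lo, scale                      # transmit: bytes + two floats

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

update = np.random.default_rng(7).normal(size=1000).astype(np.float32)
q, lo, scale = quantize(update)
restored = dequantize(q, lo, scale)
print(q.nbytes, "vs", update.nbytes)          # 1000 vs 4000 bytes per round
print(np.abs(update - restored).max())        # bounded rounding error
```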
Security - Model Poisoning & Adversarial Clients: FL environments are vulnerable to deliberate interference by malicious participants. Two prominent threats include:
Byzantine & Adversarial Attacks: Compromised clients can send inconsistent or malicious updates to degrade model accuracy, cause convergence failures, or inject targeted behavior even without altering training labels. These threats are particularly damaging in heterogeneous FL, where traditional aggregation methods, such as Krum and median, fail to perform reliably. Emerging techniques, such as gradient splitting, provide enhanced robustness in these settings (Liu et al., 2023).
Model Poisoning Attacks: Malicious clients may inject carefully crafted updates to corrupt the global model, degrading overall performance, embedding stealth backdoors, or using tactics like label flipping and false-gradient injection (Almutairi & Barnawi, 2024; Li et al., 2023).
Observation: These threats undermine the integrity of FL systems, where even a single compromised client can significantly bias outcomes. Mitigation requires Byzantine-resilient aggregation and client anomaly detection frameworks such as FLDetector and FedCC (Zhang et al., 2022).
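The coordinate-wise median mentioned above captures the basic idea of Byzantine-resilient aggregation. The toy sketch below shows how a single poisoned update skews a plain mean but barely moves the median; as noted, such classical rules can still fail under heterogeneity, which is why methods like gradient splitting, FLDetector, and FedCC exist.

```python
# Toy sketch: coordinate-wise median as a Byzantine-robust aggregator.
# One poisoned client drags the mean arbitrarily but barely moves the median.
import numpy as np

honest = [np.array([0.5, 1.0, -0.2]),
          np.array([0.4, 1.1, -0.1]),
          np.array([0.6, 0.9, -0.3])]
poisoned = np.array([50.0, -50.0, 50.0])     # malicious update
updates = np.stack(honest + [poisoned])

print(updates.mean(axis=0))        # badly skewed by the attacker
print(np.median(updates, axis=0))  # stays close to the honest consensus
```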
Transparency - Complex Traceability & Explainability: As FL models increase in size and complexity, it becomes more difficult to understand how they generate decisions. This growing opacity raises serious concerns about transparency, accountability, and oversight.
Explainability Limitations: With training distributed across opaque nodes, interpreting feature contributions or diagnosing the origins of bias is especially challenging. Recent surveys confirm that current FL frameworks struggle to provide robust guarantees of explainability (Li et al., 2023).
Traceability Gaps: When models produce unexpected outputs, tracing the lineage of their decisions becomes difficult. Identifying which client data influenced a particular result is often unclear, highlighting a significant gap in data provenance (Gu et al., 2024).
Design Imperatives: To address these issues, FL systems should integrate data-lineage logging and node-level metadata tracking. Additionally, provenance-aware explainability frameworks, such as Provenance-Enabled Explainable AI (Zhang & Yu, 2024), can embed auditability directly into model architectures.
Cross-Border Data Privacy and Data Protection Challenges
As FL systems scale across jurisdictions, they confront legal, operational, and policy mismatches. This section examines how key regions—spanning Africa, Asia, the EU, the Americas, and the UK—differ in their approaches. These approaches include data sovereignty, localization mandates, consent frameworks, and the roles of the data controller and the data processor. Understanding these legal and regulatory tensions is essential for designing lawful, interoperable, and scalable FL infrastructures in public health.
Global deployment of FL in health surveillance must navigate a complex web of national regulations. While FL’s decentralized design offers clear benefits, it also collides with diverse privacy and data sovereignty norms. Below, we explore key jurisdictional dynamics and tensions that illustrate why FL is not a one-size-fits-all solution for compliance.
Africa - Data Sovereignty Takes Center Stage: Across Africa, data sovereignty is not just a buzzword. It is a central legal and regulatory pillar. Dozens of nations mandate that health data remain within their borders, governed under national laws (European Data Protection Supervisor, 2025). This localization ethos risks conflicting with external FL aggregation processes, which often rely on centralized model servers. For pan-African health initiatives, sophisticated federated setups with aggregation nodes located within each country are essential, preserving compliance while enabling collaboration.
Brazil (LGPD) - Sensitive Data & Localization: Under Brazil’s LGPD, health data is classified as sensitive personal data, triggering stricter legal treatment than regular personal data (IAPP, 2020). Though FL works locally, transmitting model updates may still constitute “processing” under the law and require explicit consent or a legal basis. Furthermore, LGPD’s localization rules restrict transfers outside Brazil. This restriction can pose practical hurdles for cross-border federated training that lacks local aggregation nodes or consent for international transfer.
China (Cybersecurity Law (CSL), Data Security Law (DSL), PIPL) - Model Updates = Personal Data: China’s data regime under the CSL, DSL, and PIPL treats almost any personal information, including model parameters, as regulated data (DLA Piper, 2025). PIPL enforces strict localization and compliance for “important data,” which covers health sector information. The cross-border transmission of model updates triggers compliance burdens, including data localization, security assessments, explicit user consent, and potential blacklisting. FL architectures must accommodate these by isolating Chinese FL nodes or operating fully within China.
EU GDPR - Data Minimization vs. Control Ambiguities: The EU GDPR embraces data minimization, making FL structurally appealing (European Data Protection Supervisor, 2025). However, ambiguities remain around:
Data Controllers/Data Processors: Which party governs the federated process?
International Transfers: Even non-personal data may be considered “personal data” under specific interpretations when model updates can be traced back to individuals.
Observation: Thus, federated systems must craft precise control agreements, secure transfer mechanisms (e.g., standard contractual clauses, etc.), and maintain robust transparency to remain compliant with the EU GDPR and similar data privacy and data protection laws and regulations (Lieftink et al., 2024).
India – Digital Personal Data Protection Act (DPDPA) - Deemed Consent & Transparency Mandates: India’s DPDPA introduces lawful exemptions, functionally similar to “deemed consent,” for processing personal data in public health contexts, such as during epidemics or medical emergencies (see Section 7[f] and [g]) (Nishith Desai Associates, 2025). These exemptions may facilitate FL in health crises. However, the Act preserves transparency obligations: entities must communicate their processing purposes, individuals’ rights (including the right to opt out where applicable), and data safeguards. Additionally, systems must implement appropriate technical and organizational measures to ensure that output remains aggregated and secure, preventing the re-identification of personal data, in alignment with the Act’s overarching data protection standards (Nishith Desai Associates, 2025).
United Kingdom (UK) – UK GDPR and National Health Service (NHS) Federated Pilots: Following Brexit, the UK operates under the UK GDPR and the amended UK Data Protection Act (DPA), which mirrors the EU GDPR framework. The NHS is actively piloting FL through its Federated Data Platform (FDP), which is utilized in projects such as genomic surveillance and predictive analytics, by embedding privacy-by-design principles through transparency-enhancing technology, de-identification protocols, and strict governance frameworks (NHS England, 2025a). However, legal clarity around data controller roles (including joint controller arrangements between NHS England and local trusts), comprehensive data‑sharing agreements, Data Protection Impact Assessments (DPIAs), and transparent output practices remain critical to sustaining trust and ensuring compliance under the UK GDPR and UK DPA (NHS England, 2025b).
United States: In the U.S., the HIPAA Privacy Rule (45 CFR §164.512(b)) authorizes covered entities to disclose protected health information to public health authorities for surveillance, disease control, or public health investigations without individual authorization, which supports FL in health contexts (U.S. Department of Health and Human Services, 2003). However, emerging U.S. state-level privacy laws, notably the California Consumer Privacy Act (CCPA) as amended by the California Privacy Rights Act (CPRA), give individuals the right to limit the use of sensitive personal information for inference purposes (Bonta, 2024).
Washington’s My Health My Data Act, which prohibits the processing or sharing of consumer health data without clear, opt-in consent (Stoel Rives LLP, 2023), imposes stricter controls on health-related inferences and data uses. Consequently, federated models must ensure that their outputs do not inadvertently expose sensitive health insights that would contravene these U.S. state laws, even where HIPAA permits the underlying data exchange.
Observation: These jurisdictional snapshots reveal that FL, while technically promising, must be tailored to comply with a spectrum of national data privacy and data protection regimes. Universal FL deployment faces friction, from outright data localization to nuanced consent rules. By mapping FL challenges to real-world legal landscapes, this section invites readers to consider the practical steps necessary for lawful and ethical federated systems, particularly when public health relies on global data integration.
FL systems must comply with a diverse set of legal and regulatory frameworks governing data transfers, consent, and processing. Table 4 compares legal and regulatory conditions across key jurisdictions, highlighting implications for FL deployment in public health surveillance.
Table 4. Cross-Border Legal and Regulatory Requirements for Federated Learning
Country/Region | Key FL Legal Issues | Localization Rules | Consent Model | Relevant Law(s) |
Australia | FL systems must comply with Software-as-a-Medical-Device (SaMD) standards and privacy under the Australian Privacy Principles | No mandatory localization, but restrictions apply to sensitive data under APP 8 | Implied or explicit consent for health data | Australian Privacy Act; Digital Health Agency Standards |
Brazil | Model updates may qualify as processing of sensitive health data | Health data localization encouraged; cross-border transfer requires legal basis | Explicit consent unless legal exemption applies | LGPD (Lei Geral de Proteção de Dados) |
Canada | Federated surveillance architecture endorsed; privacy laws vary by province | No national localization law; some provincial health authorities restrict data export | Consent typically required; public health use may trigger exemptions | PHIPA (Ontario); PHAC Blueprint |
China | Model updates (gradients, weights) are considered personal data | Strict localization for “important data,” including health data | Explicit consent required; cross-border transfers need security assessments | PIPL, DSL, CSL |
European Union | Updates may be considered personal data; joint-controller roles must be defined | Localization not mandatory, but transfers require adequacy or SCCs | Consent or public health basis under GDPR Article 9(2)(i) | GDPR; EHDS; DARWIN EU |
India | Public health processing permitted under deemed consent | No general localization rule for health data, but public sector may restrict transfers | Deemed consent allowed for health crises; transparency required | DPDPA (2023) |
Japan | FL is aligned with national data-sharing goals under the 2024 health strategy. | Data anonymization required for secondary use; pseudonymization permitted | Opt-out permitted with proper notification | Next-Gen Medical Infrastructure Law; APPI |
United Kingdom | Federated pilots underway; data controller roles must be clearly defined | No localization requirement post-Brexit, but safeguards needed for transfers | Consent or legal basis; DPIAs strongly recommended | UK GDPR; DPA 2018 |
United States | HIPAA permits data use for public health; state laws restrict health inference | No federal localization rule: U.S. state laws vary | Opt-out under CCPA as amended by CPRA; opt-in under WA’s MHMD Act | HIPAA; CPRA (CA); MHMD Act (WA) |
Surveillance-Specific Regulatory Systems Supporting or Influencing Federated Learning
To succeed in real-world deployment, FL must align with existing digital health and AI oversight regimes. This section examines how national strategies and regulatory systems are integrating federated approaches into public health infrastructure.
FL’s viability in public health surveillance depends heavily on existing legal and regulatory frameworks that guide digital health, data sharing, and AI oversight. The following subsections examine how national and international systems both support and shape the adoption of FL. They highlight the intersection of compliance, innovation, and public health needs.
Australia – Therapeutic Goods Administration (TGA) & Digital Health Privacy Standards: Australia’s TGA and the Digital Health Agency enforce robust privacy and interoperability requirements for approved health tools, including software-as-a-medical-device (SaMD). Any federated system used in surveillance must adhere to medical device standards, data integrity protocols, and transparency principles regarding privacy, as outlined in the Australian Privacy Principles (Mothukuri et al., 2021).
Canada – Public Health Agency of Canada (PHAC) Federated Surveillance Blueprint: The Public Health Agency of Canada (PHAC), through the Pan‑Canadian Public Health Network, published the Blueprint for a Federated System for Public Health Surveillance (Government of Canada, 2016) to advance decentralized, interoperable analytics across jurisdictions for outbreak detection and pandemic readiness. While not regulatory, the Blueprint remains a foundational strategy, reaffirmed by PHAC’s ongoing Vision 2030: Moving Data to Public Health Action (Government of Canada, 2025) initiative, which emphasizes a shared “system-of-systems” architecture for near real-time surveillance and cross-jurisdictional collaboration. Together, these frameworks endorse federated public health models that preserve local data control while enabling secure, standards-based national coordination.
China – National Disease Reporting System (NDRS) and National Medical Products Administration (NMPA) AI Guidelines: China’s NDRS mandates real-time, centralized case reporting of notifiable diseases across all provinces and administrative levels, serving as a critical foundation for overseeing public health surveillance (Clark, 2023). Meanwhile, the NMPA has issued comprehensive regulations for AI-enabled medical devices, including guidelines for deep learning decision-support software, classification criteria, lifecycle management, and post-market surveillance (Han et al., 2024).
Observation: While FL is not explicitly mentioned, any FL-based health solution in China must integrate with centralized NDRS workflows and comply with the NMPA’s AI-device requirements, which emphasize alignment with national data pipelines and regulatory standards.
EU – European Health Data Space (EHDS) and Data Analysis and Real-World Interrogation Network EU (DARWIN EU): The EU’s EHDS (European Commission, 2025), enacted in March 2025, pioneers a federated legal framework for secondary health data uses. It mandates secure, interoperable systems that directly align with FL principles, as highlighted in ongoing pilots such as the Innovative Health Initiative’s Integration of Heterogeneous Data and Evidence Towards Regulatory and HTA Acceptance (Innovative Health Initiative, 2023). DARWIN EU, a European Medicines Agency-led federated surveillance network launched in 2022, applies FL to real-world drug safety insights (European Medicines Agency, 2025), demonstrating operational compliance under EU law.
Japan – Health Data and AI Integration Strategy: Japan is advancing a comprehensive health data integration strategy, reflected in its amended Next‑Generation Medical Infrastructure Law. Effective April 2024, the law formally introduces “pseudonymized medical information” for regulated secondary uses, provided patients are notified and given opt‑out options (Gardhouse, 2025). This strategy explicitly anticipates utilizing federated and privacy-preserving technologies, such as FL, for secure AI-enabled health surveillance and research, signaling strong governmental support for FL solutions aligned with national data protection and medical-device regulations (Cooper et al., 2025; Future of Privacy Forum, 2021; Keohane, 2025).
Observation: Collectively, these regulatory systems, ranging from medical oversight to data governance, create a mosaic of both opportunity and constraint for FL. Some frameworks explicitly support federated techniques, while others require FL to adapt to centralized pipelines and privacy mandates. In the next section, we will explore how FL can be designed to align with these guidance documents, laws, and regulations while maintaining technical robustness and public trust.
Accountability and Auditability in Federated Systems
While FL promotes decentralized and privacy-conscious collaboration, it introduces complex questions around enforcement, auditability, and responsibility. This section examines how current FL systems handle, or fail to handle, accountability across technical, legal, and operational dimensions.
FL supports data privacy-aware collaboration; however, it raises significant governance issues around liability, auditability, and jurisdictional compliance. The following sections outline these challenges and are supported by current, validated research insights:
Enforcement Gaps Across Node Jurisdictions: Cross-border FL systems often lack a precise enforcement mechanism when nodes violate privacy or data standards. There is no unified authority to compel action across different legal regimes, and the standardization of log admissibility is inconsistent, resulting in enforcement blind spots (European Data Protection Supervisor, 2025).
Inconsistent Logging and Versioning: Robust audit trails in FL systems depend on synchronized, tamper-evident logs that capture client participation, model version histories, and data provenance. However, many current implementations lack these standardized logging protocols—undermining compliance with the EU GDPR’s Article 30 (mandating records of processing activities) (Intersoft Consulting, 2025b), HIPAA Security Rule audit controls (45 CFR §164.312(b) and six-year log retention) (National Archives, 2025), and medical-device traceability requirements under frameworks like the EU Medical Device Regulation (European Medicines Agency, 2021).
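A minimal sketch of the tamper-evident logging idea appears below: each audit entry's hash covers its predecessor, so any retroactive edit breaks the chain. This is a toy illustration; production systems add digital signatures, synchronized clocks, and replication (or full blockchains, as discussed next).

```python
# Minimal sketch of a tamper-evident, hash-chained FL audit log.
import hashlib, json, time

def append_entry(log, record):
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = {"prev_hash": prev_hash, "ts": time.time(), **record}
    # The entry's hash covers the previous hash, chaining the log together.
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    log.append(body)

def verify(log):
    prev = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry["hash"] != hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

audit_log = []
append_entry(audit_log, {"round": 1, "client": "hospital_a", "model_v": "1.0.3"})
append_entry(audit_log, {"round": 1, "client": "hospital_b", "model_v": "1.0.3"})
print(verify(audit_log))             # True
audit_log[0]["client"] = "tampered"
print(verify(audit_log))             # False -- the chain detects the edit
```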
Research & Proposals on Accountability Mechanisms: Recent innovations aim to integrate auditability, transparency, and traceability directly into FL architectures:
Blockchain-Enabled Logging:
VerifBFL (2025): Implements zk-SNARK-based proofs to verify the correctness of local training and aggregation without revealing private data, offering integrity and auditability in a decentralized FL environment (Wibmann & Milius, 2024). Other frameworks use smart contracts and blockchain to maintain immutable training logs and secure audit trails (e.g., zk-SNARK ledger enforcement) (Ribeiro et al., 2022).
FederatedTrust: The diagram below illustrates the FederatedTrust governance framework proposed by Sanchez-Sanchez et al. (2024). It is composed of six foundational pillars: Privacy, Robustness, Fairness, Explainability, Accountability, and Federation. Each pillar represents a core governance principle essential for the responsible and trustworthy deployment of FL systems. The framework is operationalized within FL infrastructures such as FederatedScope, where it enables real-time trust scoring across these dimensions (Sanchez-Sanchez et al., 2024).
Figure 2: FederatedTrust Governance Pillars

Note: The FederatedTrust framework was developed by Sanchez-Sanchez et al. (2024) and published in Future Generation Computer Systems.
FederatedScope is an open-source federated learning framework developed by Alibaba that supports customizable, event-driven workflows for diverse FL scenarios (FederatedScope, 2023).
It enables integration of privacy-preserving techniques, personalization, and accountability-enhancing tools such as FederatedTrust, which provides real-time trust scoring across governance pillars like fairness, explainability, and robustness (FederatedScope, 2023).
Federated Unlearning: Federated unlearning refers to techniques that allow the removal of specific clients' contributions from an FL model.
It supports compliance with data deletion rights such as the EU GDPR’s Article 17 (“right to be forgotten”) (Intersoft Consulting, 2025c).
Leading methods include “Forgetting Any Data at Any Time” (Wang & Wang, 2025), which offers a certified unlearning framework for vertical FL, and BlockFUL, which uses a blockchain-enabled dual-chain structure to enable retroactive deletion while preserving provenance (Liu et al., 2024).
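For intuition only, the sketch below shows a naive "replay without the client" approach to unlearning, assuming the server logged every round's per-client updates. It is emphatically not the certified unlearning of Wang & Wang (2025) or BlockFUL; those methods must also account for later rounds having been trained on a model already influenced by the departing client.

```python
# Heavily simplified sketch of the idea behind federated unlearning:
# re-aggregate logged history without the departing client's updates.
# Naive replay only -- NOT a certified unlearning method.
import numpy as np

round_log = [   # per round: {client_id: update}
    {"a": np.array([0.1, 0.2]), "b": np.array([0.3, 0.1]), "c": np.array([0.2, 0.2])},
    {"a": np.array([0.0, 0.1]), "b": np.array([0.2, 0.0]), "c": np.array([0.1, 0.3])},
]

def replay(log, exclude=None):
    model = np.zeros(2)
    for rnd in log:
        updates = [u for cid, u in rnd.items() if cid != exclude]
        model += np.mean(updates, axis=0)    # simple FedAvg step per round
    return model

print(replay(round_log))                 # model trained with all clients
print(replay(round_log, exclude="b"))    # model with client "b" removed
```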
Note: FL often blurs legal roles under regulations like the EU GDPR. Central servers may act as data controllers, clients may act as data processors, or both may act as joint controllers. Additionally, infrastructure nodes can remain undefined. This ambiguity hinders breach reporting, data-subject access, and efforts related to legal and regulatory compliance (Ning et al., 2024).
Unclear Liability for Data or Model Outputs: When an FL model yields harmful or biased outcomes, such as incorrect medical diagnostics or public-health risk predictions, assigning responsibility becomes murky. Is the fault due to a client node submitting biased or malicious data, the central aggregator misprocessing those updates, or the deploying organization applying the model’s output? Woisetschläger et al. (2024a; 2024b) examined FL’s compliance and technical challenges under the EU AI Act’s requirements; Jorstad (2020) examined the difficulty of assigning liability for machine learning errors under current legal and regulatory frameworks; and Okunu and Okuno (2025) examined the role of global liability protection as it applies to AI service businesses. With no established legal or regulatory precedents, liability remains an open risk that demands preemptive governance mechanisms.
Note: FL systems require clear enforcement structures, auditable logs, well-defined legal roles, and built-in accountability mechanisms to ensure trust, transparency, and regulatory alignment. Recent advances, such as FederatedTrust, blockchain-based logging, and certified unlearning frameworks, offer promising approaches. However, sustained legal and policy development remains crucial for the effective deployment of these technologies in the real world.
Recommendations and Roadmap for FL in Global Health Surveillance
This section outlines a structured roadmap for deploying federated learning systems that are compliant, resilient, and privacy-preserving. It organizes guidance into four domains: governance, legal enablement, policy integration, and technical safeguards. Together, they provide actionable steps for AI developers, FL implementers, policymakers, public health agencies, and regulators.
Figure 3 presents the four foundational pillars that structure the roadmap proposed in this section, offering a visual overview of the governance, legal, policy, and technical elements critical to operationalizing FL in public health surveillance.

The roadmap below addresses each of these domains in turn, with a focus on legal alignment, resilience, and transparency.
Governance Enhancements: Strong governance is foundational for building trust and clarity in FL systems. Addressing structural transparency, accountability, and role assignment helps mitigate risks before deployment.
Cross-Border Audit Protocols: To ensure transparency and accountability across nodes in different jurisdictions, organizations should establish mutual audit agreements specifying synchronized logging, agreed-upon audit triggers, and harmonized review cycles. FL’s deployment across multi-cloud and global environments creates unique cross-border compliance challenges. Studies confirm that shared logging mechanisms and coordinated audit frameworks are essential for legal and regulatory alignment (Mia, 2024).
Defined Roles & Liability: Classify each FL actor (coordinating server, client node, or verifier) as a controller, processor, or participant under data protection laws and regulations like the EU GDPR. Ambiguity around these roles has been shown to weaken accountability, underscoring the necessity of pre-defining liability in legal agreements to ensure clear responsibility for data breaches or harmful output (Brauneck et al., 2024).
The New Federated Learning–specific Privacy Impact Assessment (F-PIA): Traditional PIAs are built for centralized systems and do not address FL’s decentralized structure, iterative model updates, or collaborative execution flows. FL introduces risks like inference leakage, participant collusion, and ambiguous data control that require specialized evaluation. One proposed pattern, the F-PIA, recommends that all participants jointly assess risks, set common privacy rules, and validate FL-specific safeguards against federation-wide threats (Information and Privacy Commissioner of Ontario, 2009).
Legal & Regulatory Innovations: The legal environment must adapt to FL’s distributed model, ensuring compliance without stifling innovation. The following regulatory mechanisms can clarify permissions, obligations, and risk boundaries.
Federated Data–Access Rules: Draft legal templates that clearly define permissible model update contents, data retention periods, consent scopes, and withdrawal mechanisms, drawing on established principles such as the OECD (2016) Health Data Governance Recommendation and Canada’s Pan-Canadian Health Data Strategy (Expert Advisory Group, 2021).
FL Sandboxes: Establish regulatory sandboxes, which are temporary, supervised environments that enable FL pilots to operate under controlled legal and technical conditions. These frameworks have been successfully tested in AI and health data initiatives across the EU and the UK, and they align with the OECD’s recommendation to support innovation through structured experimentation under ethical and legal safeguards (OECD, 2022).
Privacy-by-Design (PbD)/Data Protection-by-Design (DPbD):
PbD: First articulated by Dr. Ann Cavoukian in the 1990s and later formalized via the “7 Foundational Principles,” PbD mandates embedding privacy into every layer of system architecture, from default settings to lifecycle controls, while anticipating and preventing privacy intrusions at the design stage (Bu et al., 2020).
DPbD: In alignment with PbD philosophy, the EU GDPR’s Article 25 mandates the implementation of DPbD and data protection-by-default, which legally requires data controllers to implement appropriate technical and organizational measures. These measures include data minimization, pseudonymization, and access controls at both the planning and operational stages of data processing (Intersoft Consulting, 2025d).
Operationalizing in Federated Learning: For FL, this means data privacy and data protection cannot be afterthoughts. Strategies such as localized data processing, encrypted communications, strict role-based access, and synchronized auditability must be integrated from the outset and not retrofitted. Embedding these safeguards ensures that federated systems comply with both the spirit of Cavoukian’s PbD and the legal requirements of the EU GDPR’s Article 25, thereby enhancing trust and reducing regulatory risk.
Table 5 shows how key legal and regulatory milestones have unfolded chronologically and how these developments affect FL adoption.
Table 5: Key Regulatory Milestones Relevant to Federated Learning in Health Surveillance
Date | Milestone / Regulation | Region / Body | Relevance to FL |
June 2022 | DARWIN EU Launch | European Medicines Agency (EMA) | Introduced FL-based real-world data network for drug safety with local data retention. |
Feb 2023 | EHDS Political Agreement Finalized | European Union | Codifies use of secure, federated infrastructures for secondary health data use. |
June 1, 2024 | IHR Amendments Adopted at 77th WHA | World Health Organization | Introduced “pandemic emergency” designation and emphasized real-time digital surveillance and interoperability—aligns with FL design. |
March 26, 2025 | EHDS Enforcement Date | European Union | FL infrastructure becomes legally enforceable for EU cross-border data processing. |
Sept 19, 2025 | IHR Amendments Enter into Force (unless reserved) | World Health Organization | Federated systems can help fulfill obligations under amended IHR core capacity and data exchange mandates. |
Policy Integration and Surveillance Readiness: To maximize FL’s public health potential, national and regional authorities must formally embed it into surveillance strategies. This includes readiness planning and piloting federated architecture.
Global Policy Alignment with International Health Regulations (IHR): The World Health Organization’s IHR, initially adopted in 2005 and most recently amended in June 2024, requires countries to maintain core surveillance and reporting capacities. These 2024 amendments reinforce data interoperability, equity, and real-time digital exchange, creating strategic alignment with FL architectures (WHO, 2024).
Federated Surveillance Pilots: Agencies such as the UK’s NHS have already tested FL in real-world healthcare settings, including a COVID-19 screening pilot conducted across multiple hospitals in the UK, which demonstrated FL’s technical robustness and privacy preservation in live operational environments (Soltan et al., 2024).
Traceable FL in Surveillance Plans: As FL moves beyond pilots, integrating auditability and observability into national surveillance systems becomes critical. The 2024 NHS pilot demonstrated traceable FL design using edge computing, site-level logging, and privacy-preserving aggregation, features that align with surveillance goals such as accountability and infrastructure resilience, even if they are not yet reflected in formal preparedness frameworks (Soltan et al., 2024).
Technical Safeguards: Data privacy, reliability, and data security must be designed into FL systems from the outset. Recent studies confirm that multi-layered techniques outperform single safeguards in protecting sensitive health data.
Blockchain Logging & Watermarking: In 2025, researchers introduced a blockchain-enhanced reversible watermarking framework to support end-to-end traceability in FL systems. This method embeds watermarks in medical image updates and logs them into a private blockchain ledger, enabling audit-ready verification of data origin and integrity (Bellafqira et al., 2025).
Homomorphic Encryption + Compression: Combining homomorphic encryption (HE) with model compression techniques has been validated in several federated healthcare deployments, significantly reducing communication overhead while protecting data from leakage (Korkmaz & Rao, 2025). For example, FedML‑HE applies selective homomorphic encryption over critical model parameters plus compression, demonstrating efficient and privacy-preserving training in real-world settings (Jin et al., 2024).
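As a small illustration of additively homomorphic aggregation, the sketch below uses the open-source python-paillier library (`pip install phe`) to sum encrypted scalar updates without decrypting them. This is a conceptual sketch, not FedML-HE itself, which adds selective parameter encryption and compression on top of this principle.

```python
# Minimal sketch of homomorphically encrypted aggregation with python-paillier.
# The aggregator sums ciphertexts directly; only the key holder sees the result.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# Each client encrypts its (scalar, for brevity) model update.
client_updates = [0.21, -0.47, 0.33]
encrypted = [public_key.encrypt(u) for u in client_updates]

# Additive homomorphism: ciphertexts add without decryption, so the
# aggregator never learns any individual update.
encrypted_sum = sum(encrypted[1:], encrypted[0])
average = private_key.decrypt(encrypted_sum) / len(client_updates)
print(round(average, 4))   # ~0.0233, matching the plaintext mean
```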
Secure Multi-Party Computation (SMPC) + Differential Privacy (DP): Combining SMPC with DP enables FL in highly sensitive health settings. A 2025 Scientific Reports study showed that FL with DP achieved 96.1% accuracy on the Breast Cancer Wisconsin Diagnostic dataset at a privacy budget of ε ≈ 1.9, demonstrating strong protection without sacrificing performance (Shubhi et al., 2025).
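The building block behind most SMPC aggregation schemes is additive secret sharing, sketched below: an update is split into random shares that individually look like noise and only reconstruct in combination. This is a simplified illustration; real protocols work over finite fields and add authentication, and the DP noise from the earlier sketch can be layered on top.

```python
# Minimal sketch of additive secret sharing, the core primitive of SMPC
# aggregation: no single compute party learns anything from its share alone.
import numpy as np

rng = np.random.default_rng(42)

def share(secret, n_parties=3):
    shares = [rng.normal(size=secret.shape) for _ in range(n_parties - 1)]
    shares.append(secret - sum(shares))   # final share makes the sum exact
    return shares

update = np.array([0.7, -1.2, 0.4])       # a client's sensitive model update
shares = share(update)

# Each compute party holds one share per client; any single share is
# indistinguishable from random noise.
print(shares[0])
# Only the combination of all shares reconstructs (or aggregates) the secret.
print(sum(shares))                        # == update
```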
Conclusion: Charting the Future of Federated Learning in Public Health Surveillance
FL sits at the intersection of AI innovation and ethical responsibility. It offers a promising path to enhance public health surveillance while preserving data privacy and data protection. Its success, however, relies not only on technological capabilities but on how it is embedded within legal, governance, and policy frameworks.
Real-world deployments, such as the NHS COVID-19 federated screening pilot, demonstrate that FL can deliver high-performance predictive models without centralizing raw data. This approach aligns with core data protection principles, such as data minimization and purpose limitation, as outlined in the EU GDPR and similar laws. However, FL systems must be designed with DPbD and PbD, embedded accountability, and legal interoperability to avoid legal misalignment or operational vulnerabilities.
The EHDS, effective March 26, 2025, establishes interoperable and secure frameworks for the cross-border use of health data within federated infrastructures. In parallel, DARWIN EU, led by the EMA since 2022, utilizes federated models to monitor real-world drug safety, employing local data retention, standardized data models, and aggregated outputs. Global replication of these models will require nations to modernize data-sharing laws, audit requirements, and consent protocols to support traceable and decentralized analytics.
Multidisciplinary coordination is essential. The decentralized nature of FL necessitates collaboration among technologists, legal experts, and public health authorities to align system architecture with legal regimes such as China’s PIPL, the EU GDPR, the U.S. HIPAA, and national surveillance statutes. Auditability and security measures must satisfy compliance standards while also supporting public health objectives, ensuring data usability and regulatory integrity across jurisdictions.
Policy integration must anchor FL in preparedness strategy. Embedding FL in national surveillance plans, such as Canada’s PHAC Blueprint or the WHO’s IHR capacities, ensures alignment with long-term preparedness and digital transformation efforts. Federated models should be linked to performance audits, equity assessments, and public trust evaluations as part of a broader system readiness assessment.
Looking Ahead: Future Directions for FL in Global Surveillance: As FL continues to mature, its role in public health surveillance is expected to expand significantly beyond early pilot programs and regulatory experimentation. The years beyond 2025 will usher in a new era where FL is not only privacy-preserving but also technically resilient, ethically accountable, and legally interoperable by design. Anticipating these developments is crucial for AI developers, data privacy and data protection professionals, public health authorities, and policymakers who seek to stay ahead of both innovation and regulation. Key trajectories include:
Adaptive and Uncertainty-Aware FL Models: Next-gen FL systems will self-adjust to population heterogeneity, incorporate uncertainty quantification, and improve predictive resilience in evolving epidemiological landscapes.
Broader Deployment with Secure Auditability: FL systems will increasingly be integrated into global healthcare infrastructure, supported by secure, tamper-evident audit trails to enhance legal and regulatory trust (a minimal audit-trail sketch follows this list).
Fusion with Confidential Computing and Data Clean Rooms: Hybrid FL architectures will combine federated techniques with confidential computing and policy-aware data clean rooms, thereby enhancing collaborative analytics while maintaining data integrity, data privacy, and data protection.
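To ground the tamper-evident audit trails mentioned above, the sketch below hash-chains FL round events so that any retroactive edit breaks verification. This is a minimal, single-node Python illustration; the AuditLog class and its fields are hypothetical, and a production system would layer replication and consensus (for example, a permissioned blockchain, as in the watermarking framework cited earlier) on top of the same chaining idea.

```python
import hashlib
import json
import time

class AuditLog:
    """Append-only, hash-chained log of FL round events (illustrative)."""

    def __init__(self):
        self.entries = []

    def append(self, event: dict) -> str:
        # Each entry commits to its predecessor's hash, forming a chain.
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"ts": time.time(), "event": event, "prev": prev_hash}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append({**body, "hash": digest})
        return digest

    def verify(self) -> bool:
        # Recompute every hash; any edited entry invalidates the chain.
        prev = "0" * 64
        for e in self.entries:
            body = {"ts": e["ts"], "event": e["event"], "prev": e["prev"]}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != digest:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"round": 1, "node": "hospital-a", "update_digest": "placeholder"})
log.append({"round": 1, "node": "hospital-b", "update_digest": "placeholder"})
assert log.verify()  # tampering with any stored entry would fail this check
```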
🚀 Call to Action: FL has the potential to transform public health by enabling secure, real-time, cross-border surveillance. Realizing that potential requires stakeholders to coalesce around a shared, structured roadmap. Key steps include:
Governance: Embed FL into F-PIAs, surveillance regulations, and institutional policies, addressing role definition, liability allocation, and verifiable audit trails.
Legal Enablement: Support experimentation through federated sandboxes; align data-use terms, retention periods, and revocation procedures using frameworks such as the OECD’s Health Data Governance Recommendation and PHAC’s Blueprint.
Policy Integration: Institutionalize FL in national surveillance strategies and pilot deployments that address high-priority threats, such as antimicrobial resistance and respiratory viruses.
Technical Integrity: Deploy layered safeguards such as SMPC, DP, HE, and blockchain logging to protect confidentiality and ensure traceability.
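As a deliberately simplified view of the SMPC layer named in the last step, the sketch below uses additive secret sharing: each node splits its (fixed-point-encoded) update into random shares that individually reveal nothing, and only the cross-node sum is ever reconstructed. This is a toy Python illustration under honest-but-curious assumptions, not a hardened protocol such as full secure aggregation with dropout handling.

```python
import random

PRIME = 2 ** 61 - 1  # modulus; all arithmetic happens in this field

def share(value: int, n_parties: int) -> list[int]:
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME

# Toy scalar updates, already fixed-point encoded (hypothetical values).
updates = [1200, 950, 1430]
all_shares = [share(u, n_parties=3) for u in updates]

# Party i locally sums the i-th share from every client...
partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]

# ...and only the combined total is reconstructed, never a single update.
total = reconstruct(partial_sums)
assert total == sum(updates) % PRIME  # 3580
```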
Final Thought: FL is more than a novel technology; it is a model for a decentralized, transparent, and legally coherent future of public health surveillance. The next chapter in global health surveillance can be federated if we design it that way.
Key Questions for Stakeholders
As federated learning gains traction in public health surveillance, stakeholders must navigate new legal, technical, and operational responsibilities. Table 6 outlines key governance questions that different stakeholder groups, ranging from regulators to node-level institutions, should address when considering FL as part of a national or cross-border health surveillance strategy. These questions help ensure alignment with accountability, data privacy, data protection, and system integrity requirements.
Table 6: Key Governance Questions for Stakeholders Considering FL as Part of a Global Health Surveillance Strategy
Stakeholder Group | Key Questions | Governance Domain |
FL Platform Providers | How are updates secured and versioned? Is update provenance traceable across nodes? | Security, Auditability |
Node-Level Institutions | Who controls model parameters locally? Are participants legally considered controllers or processors? | Role Clarity, Legal Classification |
Policy-Makers | Are FL pilots governed under sandbox protections? Can results be validated externally? | Transparency, Oversight |
Privacy Regulators | How is consent managed across jurisdictions? Is metadata leakage audited? | Consent Management, Data Governance |
Public Health Authorities | Who owns the surveillance model? Who is liable for harm resulting from a faulty prediction? | Accountability, Legal Liability |
References
Aldrees, A., Dutta, A.K., Emara, A., & Anjum, M. (2025). Enhancing dynamic resource management in decentralized federated learning for collaborative edge internet of things. EURASIP Journal on Wireless Communications and Networking. http://dx.doi.org/10.1186/s13638-025-02459-8
Almutairi, S., & Barnawi, A. (2024, December). A comprehensive analysis of model poisoning attacks in federated learning for autonomous vehicles: A benchmark study. Results in Engineering, 24, 103295. https://doi.org/10.1016/j.rineng.2024.103295
Bellafqira, R., Coatrieux, G., & Berton, C. (2025, April). A blockchain-enhanced reversible watermarking framework for end-to-end data traceability in federated learning systems. 9th International Conference on Cryptography, Security, and Privacy. https://www.researchgate.net/publication/392326845_A_Blockchain-Enhanced_Reversible_Watermarking_Framework_for_End-to-End_Data_Traceability_in_Federated_Learning_Systems/citations
Bonta, R. (2024, March 13). California Consumer Privacy Act (CCPA). Office of the Attorney General. https://oag.ca.gov/privacy/ccpa
Bu, F., Wang, N., Jiang, B., & Liang, H. (2020, August). “Privacy-by-Design” implementation: Information system engineers’ perspective. International Journal of Information Management, 53(102124). https://doi.org/10.1016/j.ijinfomgt.2020.102124
Clark, V. (2023, May 24). Successfully navigating the innovation pathway for medical devices provided by China’s NMPA. LinkedIn. https://www.linkedin.com/pulse/successfully-navigating-innovation-pathway-medical-devices-clark/
Cooper, D., Ryoko, M., & Oberschelp de Meneses, A.S. (2025, March 24). Japan plans to adopt AI-friendly legislation. Covington. https://www.insideprivacy.com/international/japans-plans-to-adopt-ai-friendly-legislation/
Data Analysis and Real-World Interrogation Network EU (DARWIN EU). (2025). European Medicines Agency / European Medicines Regulatory Network.
Dayan, I., Roth, H.R., Zhong, A., Harouni, A., Gentili, A., Abidin, A.Z., Liu, A., Costa, A.B., Wood, B.J., Tsai, C.S., Wang, C.H., Hsu, C.N., Lee, C.K., Ruan, P., Xu, D., Wu, D., Huang, E., Kitamura, F.C., Lacey, G., De Antonio Corradi, G.C. …Li, Q. (2021, September 15). Federated learning for predicting clinical outcomes in patients with COVID-19. Nature Medicine, 27, 1735-1743. https://doi.org/10.1038/s41591-021-01506-3
DLA Piper. (2025, January 20). Data protection in China. https://www.dlapiperdataprotection.com/
European Commission. (2025). European Health Data Space Regulation (EHDS). https://health.ec.europa.eu/ehealth-digital-health-and-care/european-health-data-space_en
European Data Protection Supervisor. (2025, June 10). TechDispatch #1/2025 – federated learning. https://www.edps.europa.eu/data-protection/our-work/publications/techdispatch/2025-06-10-techdispatch-12025-federated-learning_en
European Medicines Agency. (2025, March 28). DARWIN EU. https://www.darwin-eu.org/
European Medicines Agency. (2021, May 26). Medical device regulation comes into application. https://www.ema.europa.eu/en/news/medical-device-regulation-comes-application
Expert Advisory Group. (2021, November). Pan-Canadian health data strategy: Building Canada’s health data foundation. https://www.canada.ca/content/dam/phac-aspc/documents/corporate/mandate/about-agency/external-advisory-bodies/list/pan-canadian-health-data-strategy-reports-summaries/expert-advisory-group-report-02-building-canada-health-data-foundation/expert-advisory-group-report-02-building-canada-health-data-foundation.pdf
Future of Privacy Forum. (2021, April 13). A new era for Japanese data protection: 2020 amendments to the APPI. https://fpf.org/blog/a-new-era-for-japanese-data-protection-2020-amendments-to-the-appi/
Gardhouse, K. (2025, February 12). Japan’s health data anonymization act: Enabling large-scale health research. PrivateAI. https://www.private-ai.com/en/2025/02/12/japan-health-data-anonymization/
Geiping, J., Bauermeister, H., Dröge, H., & Moeller, M. (2020, September 11). Inverting gradients: How easy is it to break privacy in federated learning? arXiv. https://arxiv.org/abs/2003.14053
Government of Canada. (2025). Global public health intelligence network. Public Health Agency of Canada. https://gphin.canada.ca/cepr/listarticles.jsp?language=en_CA
Government of Canada. (2025, February). Vision 2030: Moving data to public health action. Public Health Agency of Canada. https://www.canada.ca/en/public-health/services/publications/science-research-data/vision-2030-moving-data-action.html
Government of Canada. (2016). Blueprint for a federated system for public health surveillance in Canada: Vision and action plan. Pan-Canadian Public Health Network – Public Health Agency of Canada. https://publications.gc.ca/site/eng/9.814702/publication.html
Gu, M., Naraparaju, R., & Zhao, D. (2024, March 3). Enhancing data provenance and model transparency in federated learning systems: A database approach. arXiv. https://arxiv.org/html/2403.01451v1
Han, Y., Ceross, A., & Bergmann, J. (2024, July 29). Regulatory frameworks for AI-enabled medical device software in China: Comparative analysis and review of implications for global manufacturer. JMIR Publications, 23. https://ai.jmir.org/2024/1/e46871
HealthMap. (2025). The disease daily. https://www.healthmap.org/en/
Henning, K.J. (2004, September 24). Overview of syndromic surveillance: What is syndromic surveillance? U.S. Centers for Disease Control and Prevention. https://www.cdc.gov/mmwr/preview/mmwrhtml/su5301a3.htm
Hosseini, S.M., Sikaroudi, M., Babaei, M., & Tizhoosh, H.R. (2022, August 22). Cluster-based secure multi-party computation in federated learning for histopathology images. arXiv. https://doi.org/10.48550/arXiv.2208.10919
IAPP. (n.d.). Brazil’s General Data Protection Law (LGPD) overview. https://iapp.org/resources/article/brazils-general-data-protection-law-lgpd/
Information and Privacy Commissioner of Ontario. (2009, January). The new federated privacy impact assessment (F-PIA): Building privacy and trust-enabled federation. https://www.ipc.on.ca/en/media/1528/download?attachment
Innovative Health Initiative. (2023). Integration of heterogeneous data and evidence towards regulatory and HTA acceptance. https://www.ihi.europa.eu/projects-results/project-factsheets/iderha
Intersoft Consulting. (2025a). Art. 5 GDPR: Principles relating to processing of personal data. https://gdpr-info.eu/art-5-gdpr/
Intersoft Consulting. (2025b). Art. 30 GDPR: Records of processing activities. https://gdpr-info.eu/art-30-gdpr/
Intersoft Consulting. (2025c). Art. 17 GDPR: Right to erasure (‘right to be forgotten’). https://gdpr-info.eu/art-17-gdpr/
Intersoft Consulting. (2025d). Art. 25 GDPR: Data protection by design and by default. https://gdpr-info.eu/art-25-gdpr/
Jiang, Y., Liu, X., & Yan, Q. (2025, February 5). Vanishing privacy: Fast gradient leakage threat to federated learning. OpenReview.net. https://openreview.net/forum?id=LJULZNlW5d
Jin, W., Yao, Y., Han, S., Gu, J., Joe-Wong, C., Ravi, S., Avestimehr, A., & He, C. (2024, June 17). FedML-HE: An efficient homomorphic-encryption-based privacy-preserving federated learning system. arXiv. https://doi.org/10.48550/arXiv.2303.10837
Jorstad, K.T. (2020, December). Intersection of artificial intelligence and medicine: Tort liability in the technological age. Journal of Medical Artificial Intelligence, 3, 1-28. https://jmai.amegroups.org/article/view/5938/pdf
Kairouz, P., McMahan, B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A.N., Bonawitz, K., Charles, Z., Cormode, G., Cummings, R., D’Oliveira, G.L., Eichner, H., El Rouayheb, D.E., Evans, D., Gardner, J., Garrett, Z., Gascon, A., Ghazi, B., Gibbons, P.B., Gruteser, M….Zhao, S. (2021). Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1-2), 1-210. NOW Publishers. https://www.nowpublishers.com/article/Details/MAL-083
Kaissis, G.A., Makowski, M.R., Ruckert, D., & Braren, R.F. (2020, June 8). Secure, privacy-preserving and federated machine learning in medical imaging. Nature Machine Intelligence, 2, 305-311. https://doi.org/10.1038/s42256-020-0186-1
Kasarla, R. (2025, May 30). A federated learning approach for predicting resource allocation in multi-access edge computing (MEC). International Journal of Computer Trends and Technology, 73(5), 114-122. https://doi.org/10.14445/22312803/IJCTT-V73I5P114
Keohane, D. (2025, May 15). Japan weighs privacy rules against treatments in healthcare. Financial Times. https://www.ft.com/content/f4e81b4c-d0b6-40ef-89c5-3f69a6a26574
Korkmaz, A., & Rao, P. (2025, June 5). A selective homomorphic encryption approach for faster privacy-preserving federated learning. arXiv. https://doi.org/10.48550/arXiv.2501.12911
Li, A., Liu, R., Hu, M., Tuan, L.A., & Yu, H. (2023, February 27). Towards interpretable federated learning. arXiv. https://doi.org/10.48550/arXiv.2302.13473
Li, Q., Yu, W., Xia, Y., & Pang, J. (2025, March 10). From centralized to decentralized federated learning: Theoretical insights, privacy preservation, and robustness challenges. arXiv. https://doi.org/10.48550/arXiv.2503.07505
Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020). Federated optimization in heterogeneous networks. arXiv. https://doi.org/10.48550/arXiv.1812.06127
Lieftink, N., Dos S Ribeiro, C., Kroon, M., Haringhuizen, G.B., Wong, A., & Van de Burgwal, L.H.M. (2024, September 19). The potential of federated learning for public health purposes: A qualitative analysis of GDPR compliance, Europe, 2021. Euro Surveill, 29(38), pii=2300695. https://doi.org/10.2807/1560-7917
Liu, X., Li, M., Yu, G., Wang, X., Ni, W., Li, L., Peng, H., & Liu, R.P. (2024, August 14). BlockFUL: Enabling unlearning in blockchained federated learning. arXiv. https://doi.org/10.48550/arXiv.2402.16294
Liu, Y., Chen, C., Lyu, L., Wu, F., Wu, S., & Chen, G. (2023, June 4). Byzantine-robust learning on heterogeneous data via gradient splitting. arXiv. https://arxiv.org/abs/2302.06079
McMahan, B., Moore, E., Ramage, D., Hampson, S., & Aguera y Arcas, B. (2017). Communication-efficient learning of deep networks from decentralized data. 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 54, 1273-1282. https://proceedings.mlr.press/v54/mcmahan17a.html
Mia, D. (2024, May). Federated learning in multi-cloud environments: A privacy-preserving approach to cross-border data compliance. ResearchGate. https://www.researchgate.net/publication/392628794_Federated_Learning_in_Multi-Cloud_Environments_A_Privacy-_Preserving_Approach_to_Cross-Border_Data_Compliance/citations
Mohri, M., Sivek, G., & Suresh, A.T. (2019, February 1). Agnostic federated learning. arXiv. https://doi.org/10.48550/arXiv.1902.00146
Mothukuri, V., Parizi, R.M., Pouriyeh, S., Huang, Y., Dehghantanha, A., & Srivastava, G. (2021, February). A survey on security and privacy of federated learning. Future Generation Computer Systems, 115, 619-640. https://doi.org/10.1016/j.future.2020.10.007
Mukhtiar, M., Mahmood, A., & Sheng, Q.Z. (2025, April). Fairness in federated learning: Trends, challenges, and opportunities. Advanced Intelligent Systems. http://dx.doi.org/10.1002/aisy.202400836
National Archives. (2025, May 5). Title 45, Subtitle A, Subchapter C, Part 164, Subpart C, 164.312: Technical safeguards. Code of Federal Regulations. https://www.ecfr.gov/current/title-45/subtitle-A/subchapter-C/part-164/subpart-C/section-164.312
National Cancer Institute. (2025). The Cancer Genome Atlas Program (TCGA). U.S. Department of Health and Human Services. https://www.cancer.gov/ccg/research/genome-sequencing/tcga
NHS England. (2025a). NHS FDP privacy notice. https://www.england.nhs.uk/digitaltechnology/nhs-federated-data-platform/security-privacy/nhs-fdp-privacy-notice/
NHS England. (2025b). Federated data platform: Information governance framework. https://www.england.nhs.uk/long-read/federated-data-platform-information-governance-framework/
Ning, W., Zhu, Y., Song, C., Li, H., Zhu, L., Xie, J., Chen, T., Xu, T., Xu, X., & Gao, J. (2024, October 16). Blockchain-based federated learning: A survey and new perspectives. Applied Sciences, 14(9459), 1-35. https://doi.org/10.3390/app14209459
Nishith Desai Associates. (2025, January 7). Legal update and technology law analysis - India’s new data protection regime, one step closer: Draft compliance rules issued. https://www.nishithdesai.com/fileadmin/user_upload/Html/Hotline/Technology_Law_Analysis_Jan0625-M.html
Okuno, M.J., & Okuno, H.G. (2025, March 26). Legal frameworks for AI service business participants: A comparative analysis of liability protection across jurisdictions. AI & Society. https://doi.org/10.1007/s00146-025-02288-9
Organization for Economic Cooperation and Development. (2022). Health: Policy area. https://www.oecd.org/en/topics/health.html
Organization for Economic Cooperation and Development. (2016, December). Recommendation of the council on health data governance. OECD Legal Instruments. https://legalinstruments.oecd.org/public/doc/348/348.en.pdf
Pan-Canadian Public Health Network. (2025). Blueprint for a federated system for public health surveillance in Canada. https://www.phn-rsp.ca/en/reports-publications/blueprint-federated-system-public-health-surveillance-canada.html
Ribeiro, R.H., Jacobs, A.S., Zembruzki, L., Parizotto, R., Scheid, E.J., Schaeffer-Filho, A.E., Granville, L.Z., & Stiller, B. (2022, September). A deterministic approach for extracting network security intents. Computer Networks, 214. https://doi.org/10.1016/j.comnet.2022.109109
Rieke, N., Hancox, J., Li, W., Milletari, F., Roth, H.R., Albarqouni, S., Bakas, S., Galtier, M.N., Landman, B.A., Maier-Hein, K., Ourselin, S., Sheller, M., Summers, R.M., Trask, A., Xu, D., Baust, M., & Cardoso, M.J. (2020a). The future of digital health with federated learning. npj Digital Medicine, 3(119), 1-7. https://doi.org/10.1038/s41746-020-00323-1
Sadilek, A., Liu, L., Nguyen, D., Kamruzzaman, M., Serghiou, S., Rader, B., Ingerman, A., Mellem, S., Kairouz, P., Nsoesie, E.O., MacFarlane, J., Vullikanti, A., Marathe, M., Eastham, P., Brownstein, J.S., Aguera y Arcas, B., Howell, M.D., & Hernandez, J. (2021, September 7). Privacy-first health research with federated learning. npj Digital Medicine, 4(132). https://doi.org/10.1038/s41746-021-00489-2
Sah, D.K., & Hossein, F. (2025, June 1). Federated learning at the edge in industrial internet of things: A review. Sustainable Computing: Informatics and Systems. https://doi.org/10.1016/j.suscom.2025.101087
Sanchez-Sanchez, P.M., Huertas-Celdran, A., Xie, N., Bovet, G., Martinez-Perez, G., & Stiller, B. (2024, March). FederatedTrust: A solution for trustworthy federated learning. Future Generation Computer Systems, 152, 83-88. https://doi.org/10.1016/j.future.2023.10.013
Secure Privacy. (2025, May 25). Federated learning’s consent crisis: Building privacy-preserving AI without sacrificing individual choice. https://secureprivacy.ai/blog/consent-orchestration-federated-learning
Shi, Y., Reshniak, V., Kotevska, O., Singh, A., & Raskar, R. (2024, May 16). Dealing doubt: Unveiling threat models in gradient inversion attacks under federated learning – a survey and taxonomy. arXiv. https://arxiv.org/html/2405.10376v1
Shubhi, S., Rajkumar, S., Sinha, A., Esha, M., Elango, K., & Sampath, V. (2025, April 16). Federated learning with differential privacy for breast cancer diagnosis enabling secure data sharing and model integrity. Scientific Reports, 15(13061). https://doi.org/10.1038/s41598-025-95858-2
Soltan, A.A., Thakur, A., Yang, J., Chauhan, A., D’Cruz, L.G., Dickson, P., Soltan, M.A., Thickett, D.R., Eyre, D.W., Zhu, T., & Clifton, D.A. (2024, February). A scalable federated learning solution for secondary care using low-cost microcomputing: Privacy-preserving development and evaluation of a COVID-19 screening test in UK hospitals. Lancet Digital Health, 6(2), E93-E104. https://doi.org/10.1016/s2589-7500(23)00226-1
UK NHS AI Lab. (n.d.). Using federated learning in genomics. https://www.ndm.ox.ac.uk/ai-in-healthcare/using-federated-learning-in-genomics
Wang, Z., Chang, Z., Hu, J., Pang, X., Du, J., Chen, Y., & Ren, K. (2024, June 22). Breaking secure aggregation: Label leakage from aggregated gradients in federated learning. arXiv. https://arxiv.org/html/2406.15731v1
Wang, L., & Wang, L. (2025, February 24). Forgetting any data at any time: A theoretically certified unlearning framework for vertical federated learning. arXiv. https://doi.org/10.48550/arXiv.2502.17081
Wißmann, T., & Milius, S. (2024, May 21). Initial algebras unchained – A novel initial algebra construction formalized in Agda. arXiv. https://doi.org/10.48550/arXiv.2405.09504
Woisetschläger, H., Erben, A., Marino, B., Wang, S., Lane, N.D., Mayer, R., & Jacobsen, H.-A. (2024a, February). Federated learning priorities under the European Union Artificial Intelligence Act. arXiv. https://doi.org/10.48550/arXiv.2402.05968
Woisetschläger, H., Mertel, S., Krönke, C., Mayer, R., & Jacobsen, H.-A. (2024b, July 12). Federated learning and AI regulation in the European Union: Who is responsible? An interdisciplinary analysis. arXiv. https://doi.org/10.48550/arXiv.2407.08105
World Health Organization. (2017, June 19). WHO guidelines on ethical issues in public health surveillance. https://www.who.int/publications/i/item/9789241512657
World Health Organization. (2024, June 27). The new amendments to the International Health Regulations (77th WHA, 2024). https://cil.nus.edu.sg/wp-content/uploads/2024/06/The-New-IHR-Amendments-2024_27.06.2024-FINAL.pdf
Wu, J., Jin, J., & Wu, C. (2024, March 19). Challenges and countermeasures of federated learning data poisoning attack situation prediction. Mathematics. https://doi.org/10.3390/math12060901
Zhang, J., Zhou, W., & Ujcich, B.E. (2024, December 20). Provenance-enabled explainable AI. Proceedings of the ACM on Management of Data, 2(6), 1-27. https://doi.org/10.1145/3698826
Zhu, L., Liu, Z., & Han, S. (2019, June 21). Deep leakage from gradients. arXiv. https://doi.org/10.48550/arXiv.1906.08935
Appendix I: Glossary of Key Terms
🏛️ 1. Governance and Infrastructure
Accountability: The obligation to ensure traceability, assign legal responsibility, and document actions in FL systems.
Audit Trails / Provenance: System-generated logs that track data flows, model updates, and participation—critical for accountability and compliance.
FederatedTrust: A governance framework that scores FL systems based on six pillars: privacy, robustness, fairness, explainability, accountability, and federation.
Federated Privacy Impact Assessment (F-PIA): A privacy risk evaluation framework specific to FL architectures, where participants jointly assess risks and mitigation strategies.
Regulatory Sandboxes: Controlled environments where new technologies like FL are tested under regulatory oversight with temporary legal flexibility.
⚖️ 2. Legal and Regulatory Terms
Data Controller vs. Data Processor: Roles under the EU GDPR and similar data protection laws and regulations. Data controllers determine the purposes and means of processing, while data processors act on their behalf.
Data Minimization: An EU GDPR principle requiring that data collected be limited to what is strictly necessary for the intended purpose.
Data Protection-by-Design (DPbD): A legal obligation under EU GDPR Article 25 requiring organizations to integrate data protection into system architecture from the outset.
Deemed Consent: A legal basis for data use in emergencies or public health scenarios, such as under India’s DPDPA.
International Health Regulations (IHR): A legally binding WHO framework that requires countries to build surveillance capacities and ensure real-time data coordination.
Localization Rules: Laws requiring personal data—especially sensitive categories like health—to be stored and processed within national borders.
Privacy Impact Assessment (PIA): A structured evaluation of data protection risks and safeguards before launching a data processing system.
🌍 3. Public Health and Surveillance Terms
Federated Learning (FL): A decentralized machine learning technique where models are trained locally, and only updates are shared, preserving data sovereignty.
Federated Surveillance Pilot Programs: Experimental FL deployments in public health (e.g., NHS COVID-19 pilot) designed to assess feasibility, equity, and compliance.
Node-Level Institution: An individual hospital, clinic, or research lab participating in an FL network with localized data and computing capacity.
Pandemic Emergency: A new designation added in the 2024 IHR amendments to streamline coordinated international response to global health threats.
Public Health Surveillance: The systematic collection, analysis, and dissemination of health data to inform policy, detect threats, and guide interventions.
🧠 4. Technical and Security Concepts
Blockchain Logging: A method for recording model updates or transactions in a tamper-evident, decentralized ledger to ensure traceability in FL.
Differential Privacy (DP): A mathematical framework that introduces statistical noise to outputs (e.g., gradients) to protect individual data contributions in aggregated datasets.
Gradient Leakage (DLG, FGL): Adversarial attacks that reconstruct private input data from shared gradients or model updates.
Homomorphic Encryption (HE): A cryptographic technique allowing computations to be performed on encrypted data without decryption, securing model updates in transit (see the sketch following this section).
Label Inference (LI): An attack where adversaries infer training labels from model updates, even without accessing raw input data.
Metadata Exposure: Leakage of secondary signals (e.g., update frequency, round participation) that can reveal client identities or training characteristics.
Model Inversion: A privacy threat where attackers reconstruct training inputs using access to the model's outputs or parameters.
Secure Aggregation: A cryptographic protocol that enables federated learning clients to securely combine model updates without revealing individual contributions.
Secure Multiparty Computation (SMPC): A cryptographic method allowing multiple parties to compute a function collaboratively without sharing their raw inputs.
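To make the homomorphic encryption entry above concrete, the following minimal sketch assumes the open-source TenSEAL library and its CKKS scheme (an assumption for illustration; the deployments cited in this article may use other stacks). Two clients encrypt their model updates locally, and the aggregator averages them without ever decrypting an individual contribution. The encryption parameters shown are illustrative placeholders, not a production-vetted configuration.

```python
import tenseal as ts  # assumed library: https://github.com/OpenMined/TenSEAL

# CKKS context for approximate arithmetic on real-valued vectors.
ctx = ts.context(ts.SCHEME_TYPE.CKKS,
                 poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40
ctx.generate_galois_keys()

# Two hospitals encrypt their (toy) model updates locally.
update_a = ts.ckks_vector(ctx, [0.12, -0.05, 0.33])
update_b = ts.ckks_vector(ctx, [0.10, -0.07, 0.29])

# The aggregator averages the updates while they remain encrypted.
encrypted_avg = (update_a + update_b) * 0.5

# Only the key holder can decrypt the aggregate, never an individual update.
print(encrypted_avg.decrypt())  # approximately [0.11, -0.06, 0.31]
```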


