Federated Learning Approaches for Privacy-Preserving Big Data Analytics

Yanzhi Kou

doi:10.54097/d52m6j10

Keywords:

Federated Learning, Privacy-Preserving, Big Data Analytics, Differential Privacy, Secure Aggregation

Abstract

The rapid growth of big data analytics across industries has transformed decision-making, but it has also heightened the privacy risks of centralized machine learning, where aggregating sensitive raw data exposes it to breaches and inference attacks. Federated Learning (FL) provides a decentralized framework for collaborative model training that keeps data on client devices or institutional servers, which helps satisfy the stringent requirements of regulations such as GDPR and HIPAA. This review synthesizes 127 recent papers (2023–2025) and assesses five main privacy-preserving FL approaches: Standard Federated Averaging (FedAvg), Differential Privacy-enhanced FL (DP-FL), Secure Aggregation, Homomorphic Encryption-based FL (HE-FL), and Hybrid FL frameworks. Of these, DP-FL is the most widely used (about 40% of deployments) and offers a good privacy-utility trade-off, with typical accuracy degradation of 1-5%. Hybrid designs, particularly those combining differential privacy with secure aggregation, provide defense-in-depth protection with little performance loss (1-4%) and show the fastest-growing deployment rate (34% per year), especially in regulated markets. FL has substantial practical impact in several core areas: healthcare (35% of applications, e.g., multi-institutional medical imaging and disease prediction), finance (28%, e.g., fraud detection and risk assessment), IoT/smart cities (20%, e.g., traffic optimization and predictive maintenance), and mobile/enterprise systems. Long-standing challenges, including non-IID data heterogeneity, communication overhead, security threats (poisoning and inference attacks), system heterogeneity, and scalability, are being addressed by innovations such as adaptive aggregation, gradient compression, hierarchical architectures, and Byzantine-robust mechanisms. Developments through 2026 are shifting toward personalized FL, greater adversarial robustness, and integration with large language models, positioning hybrid and personalized FL as the strongest approaches for building secure, scalable, privacy-preserving analytics in an increasingly decentralized big data landscape.
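
As a rough illustration of the hybrid idea named in the abstract (differential privacy combined with secure aggregation on top of federated averaging), the following minimal sketch may help. It is not the method of any surveyed paper. The function names (`clip`, `pairwise_masks`, `hybrid_round`), the parameters `max_norm` and `noise_std`, and the choice to add Gaussian noise at the server are all assumptions made for illustration, and the floating-point masks stand in for the modular integer arithmetic and key agreement that a real secure aggregation protocol would use.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update so its L2 norm is at most max_norm (bounds per-client sensitivity)."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def pairwise_masks(num_clients, dim, rng):
    """Build one mask per client such that all masks sum to (approximately) zero.

    For each pair (i, j), client i adds a shared random vector and client j
    subtracts it. Illustrative only: real protocols derive the shared vector
    from a key exchange and work over integers modulo a fixed value.
    """
    masks = [[0.0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            shared = [rng.uniform(-100.0, 100.0) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]
                masks[j][k] -= shared[k]
    return masks


def hybrid_round(updates, max_norm, noise_std, rng):
    """One aggregation round: clip, mask, sum, add Gaussian noise, average."""
    n, dim = len(updates), len(updates[0])
    masks = pairwise_masks(n, dim, rng)
    # Client side: each client clips its update and hides it behind its mask.
    masked = [
        [c + m for c, m in zip(clip(u, max_norm), masks[i])]
        for i, u in enumerate(updates)
    ]
    # Server side: only the sum is computed; the pairwise masks cancel in it.
    total = [sum(column) for column in zip(*masked)]
    # Assumed placement: noise is added once to the aggregate before averaging.
    return [(t + rng.gauss(0.0, noise_std)) / n for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    client_updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
    print(hybrid_round(client_updates, max_norm=1.0, noise_std=0.1, rng=rng))
```

In this sketch the server sees only masked vectors and their sum, while clipping and noise limit how much any single client can influence the averaged result. Choosing `noise_std` to meet a specific privacy budget is outside the scope of the sketch.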
