Abstract
Federated Learning (FL) is a promising decentralized machine learning framework that enables a massive number of clients (e.g., smartphones) to collaboratively train a global model over the Internet without sacrificing their privacy. Although FL's efficacy on non-convex problems has been established, its convergence under biased client participation lacks theoretical study. In this paper, we analyze the convergence of FedAvg, the most renowned FL algorithm, on non-convex problems. Assuming that data is evenly sized but non-IID across clients, we derive the convergence rate of FedAvg under biased client participation. Our analysis reveals that biased client participation can significantly reduce the accuracy of the FL model. We validate this finding through trace-driven experiments, which demonstrate that unbiased client participation yields 11% to 50% higher test accuracy than extremely biased client participation.
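To make biased participation concrete, the following is a minimal simulation sketch (illustrative only, not the experimental code used in the paper; the function names and the toy quadratic objective are our own assumptions). It contrasts uniform client sampling with a sampling distribution skewed toward a few clients:

import numpy as np

def sample_clients(rng, num_clients, m, bias=0.0):
    # bias = 0.0 gives uniform (unbiased) participation;
    # larger bias concentrates probability mass on low-index clients.
    weights = np.exp(-bias * np.arange(num_clients))
    probs = weights / weights.sum()
    return rng.choice(num_clients, size=m, replace=False, p=probs)

def fedavg_round(global_w, client_optima, selected, lr=0.1, local_steps=5):
    # One FedAvg round on toy quadratics F_k(w) = 0.5 * ||w - c_k||^2,
    # whose local gradient is simply (w - c_k).
    local_models = []
    for k in selected:
        w = global_w.copy()
        for _ in range(local_steps):
            w -= lr * (w - client_optima[k])
        local_models.append(w)
    return np.mean(local_models, axis=0)  # equal weights since n_k = n/N

rng = np.random.default_rng(0)
client_optima = rng.normal(size=(100, 10))  # non-IID: each client has its own optimum c_k
w = np.zeros(10)
for t in range(200):
    selected = sample_clients(rng, num_clients=100, m=10, bias=0.5)
    w = fedavg_round(w, client_optima, selected)
# The global optimum of the average objective is the mean of the c_k;
# biased sampling (bias > 0) leaves a systematic gap, bias = 0.0 does not.
print(np.linalg.norm(w - client_optima.mean(axis=0)))

With bias = 0.0 the sampled update is an unbiased estimate of the full FedAvg update, whereas bias > 0 systematically over-weights some clients' optima, mirroring the accuracy gap reported above.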
References
Abay, A., Zhou, Y., Baracaldo, N., Rajamoni, S., Chuba, E., Ludwig, H.: Mitigating bias in federated learning. arXiv preprint arXiv:2012.02447 (2020). https://doi.org/10.48550/arXiv.2012.02447
Amiri, M.M., Gündüz, D., Kulkarni, S.R., Poor, H.V.: Convergence of federated learning over a noisy downlink. IEEE Trans. Wireless Commun. 21(3), 1422–1437 (2021). https://doi.org/10.1109/TWC.2021.3103874
Balakrishnan, R., Li, T., Zhou, T., Himayat, N., Smith, V., Bilmes, J.: Diverse client selection for federated learning via submodular maximization. In: International Conference on Learning Representations (ICLR) (2021)
Chen, F., Chen, N., Mao, H., Hu, H.: Assessing four neural networks on handwritten digit recognition dataset (MNIST). arXiv preprint arXiv:1811.08278 (2018). https://doi.org/10.48550/ARXIV.1811.08278
Chilimbi, T., Suzue, Y., Apacible, J., Kalyanaraman, K.: Project Adam: building an efficient and scalable deep learning training system. In: Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI), pp. 571–582 (2014)
Cho, Y.J., Wang, J., Joshi, G.: Client selection in federated learning: convergence analysis and power-of-choice selection strategies. arXiv preprint arXiv:2010.01243 (2020). https://doi.org/10.48550/arXiv.2010.01243
Duan, M., Liu, D., Chen, X., Liu, R., Tan, Y., Liang, L.: Self-balancing federated learning with global imbalanced data in mobile systems. IEEE Trans. Parallel Distrib. Syst. 32(1), 59–71 (2020). https://doi.org/10.1109/TPDS.2020.3009406
Haddadpour, F., Mahdavi, M.: On the convergence of local descent methods in federated learning. arXiv preprint arXiv:1910.14425 (2019). https://doi.org/10.48550/arXiv.1910.14425
Kairouz, P., et al.: Advances and open problems in federated learning. Found. Trends Mach. Learn. 14(1–2), 1–210 (2021)
Khaled, A., Mishchenko, K., Richtárik, P.: First analysis of local GD on heterogeneous data. arXiv preprint arXiv:1909.04715 (2019). https://doi.org/10.48550/ARXIV.1909.04715
Khan, L.U., Saad, W., Han, Z., Hossain, E., Hong, C.S.: Federated learning for internet of things: recent advances, taxonomy, and open challenges. IEEE Commun. Surv. Tutor. (2021). https://doi.org/10.1109/COMST.2021.3090430
Krizhevsky, A.: Learning Multiple Layers of Features From Tiny Images. University of Toronto, Toronto (2012)
Li, A., Zhang, L., Tan, J., Qin, Y., Wang, J., Li, X.Y.: Sample-level data selection for federated learning. In: IEEE Conference on Computer Communications (INFOCOM), pp. 1–10 (2021). https://doi.org/10.1109/INFOCOM42981.2021.9488723
Li, T., Hu, S., Beirami, A., Smith, V.: Ditto: fair and robust federated learning through personalization. In: Proceedings of the 38th International Conference on Machine Learning (ICML), pp. 6357–6368. PMLR (2021)
Li, T., Sahu, A.K., Zaheer, M., Sanjabi, M., Talwalkar, A., Smith, V.: Federated optimization in heterogeneous networks. Proc. Mach. Learn. Syst. 2, 429–450 (2020)
Li, T., Sanjabi, M., Smith, V.: Fair resource allocation in federated learning. In: International Conference on Learning Representations (ICLR) (2020)
Li, X., Huang, K., Yang, W., Wang, S., Zhang, Z.: On the convergence of FedAvg on non-IID data. In: Eighth International Conference on Learning Representations (ICLR) (2020)
Liu, R., Cao, Y., Yoshikawa, M., Chen, H.: FedSel: federated SGD under local differential privacy with top-k dimension selection. In: Database Systems for Advanced Applications (DASFAA) (2020)
Ma, J., Xie, M., Long, G.: Personalized federated learning with robust clustering against model poisoning. In: Chen, W., Yao, L., Cai, T., Pan, S., Shen, T., Li, X. (eds.) ADMA 2022. LNCS, vol. 13726, pp. 238–252. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-22137-8_18
McMahan, B., Moore, E., Ramage, D., Hampson, S., Arcas, B.A.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics (AISTATS), pp. 1273–1282 (2017)
Segarceanu, S., Gavat, I., Suciu, G.: Evaluation of deep learning techniques for acoustic environmental events detection. Romanian J. Technical Sci. Appl. Mech. 66(1), 19–37 (2021)
Tan, L., et al.: AdaFed: optimizing participation-aware federated learning with adaptive aggregation weights. IEEE Trans. Network Sci. Eng. 9(4), 2708–2720 (2022). https://doi.org/10.1109/TNSE.2022.3168969
Xu, J., Glicksberg, B.S., Su, C., Walker, P., Bian, J., Wang, F.: Federated learning for healthcare informatics. J. Healthcare Inform. Res. 5(1), 1–19 (2021)
Yang, H., Fang, M., Liu, J.: Achieving linear speedup with partial worker participation in non-IID federated learning. In: International Conference on Learning Representations (ICLR) (2021)
Yang, W., et al.: Gain without pain: offsetting DP-injected noises stealthily in cross-device federated learning. IEEE Internet Things J. 9(22), 22147–22157 (2021). https://doi.org/10.1109/JIOT.2021.3102030
Yu, H., Jin, R., Yang, S.: On the linear speedup analysis of communication efficient momentum SGD for distributed non-convex optimization. In: International Conference on Machine Learning (ICML), pp. 7184–7193 (2019)
Acknowledgements
This study received support from the National Natural Science Foundation of China through Grants U1911201 and U2001209, the Natural Science Foundation of Guangdong under Grant 2021A1515011369, and the Science and Technology Program of Guangzhou under Grant 2023A04J2029.
Appendix
In this section, we provide the proofs of Lemma 1 and Lemma 2.
1.1 Proof of Lemma 1
For any \(t \ge 0\), there exists a \(t_0 \le t\) with \(t - t_0 \le E\) such that \(\omega_k^{t_0} = \omega^{t_0}\) for all \(k = 1, 2, \dots, N\), i.e., \(t_0\) is the most recent synchronization step. Similar to previous work [17], we have
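Concretely, the standard drift argument (cf. [17]) runs as follows, under the common assumptions of bounded stochastic gradients \(\mathbb{E}\|g_k^\tau\|^2 \le G^2\) and non-increasing step sizes (a sketch of the usual technique, not necessarily the paper's exact constants): since each local model evolves by at most \(E\) SGD steps from \(\omega^{t_0}\),
\[
\omega_k^{t} - \omega^{t_0} = - \sum_{\tau = t_0}^{t-1} \eta_\tau g_k^{\tau} , \qquad \mathbb{E}\left\| \omega_k^{t} - \omega^{t_0} \right\|^2 \le (t - t_0) \sum_{\tau = t_0}^{t-1} \eta_\tau^2\, \mathbb{E}\left\| g_k^{\tau} \right\|^2 \le \eta_{t_0}^2 E^2 G^2 ,
\]
where the first inequality is Cauchy-Schwarz and the second uses \(t - t_0 \le E\) and \(\eta_\tau \le \eta_{t_0}\).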
1.2 Proof of Lemma 2
Since \(n_k = \frac{n}{N}\) for all \(k\) and \(\sum_{k=1}^{N} M_k^t = mt\), we can derive that
Utilizing the \(\rho\)-smoothness of \(F(\omega)\), we can derive the following inequality:
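For reference, the standard descent lemma that \(\rho\)-smoothness yields, applied here with \(x = \omega^t\) and \(y = \omega^{t+1}\), reads
\[
F(\omega^{t+1}) \le F(\omega^{t}) + \left\langle \nabla F(\omega^{t}),\ \omega^{t+1} - \omega^{t} \right\rangle + \frac{\rho}{2} \left\| \omega^{t+1} - \omega^{t} \right\|^2 .
\]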
By applying the fact that \(\mathbb{E}\|x\|^2 = \mathbb{E}\left[\|x - \mathbb{E}x\|^2\right] + \|\mathbb{E}x\|^2\), we can obtain
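This bias-variance decomposition can be verified in one line by expanding the square and noting that the cross term vanishes:
\[
\mathbb{E}\|x\|^2 = \mathbb{E}\left\| (x - \mathbb{E}x) + \mathbb{E}x \right\|^2 = \mathbb{E}\left\| x - \mathbb{E}x \right\|^2 + 2\left\langle \mathbb{E}[x - \mathbb{E}x],\ \mathbb{E}x \right\rangle + \|\mathbb{E}x\|^2 ,
\]
and \(\mathbb{E}[x - \mathbb{E}x] = 0\).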
Since the clients work in parallel and independently, and by Assumption 3, we have
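This step relies on a standard fact: for independent zero-mean random vectors \(x_k\) (here, the per-client stochastic gradient noise), the second moment of the sum splits across clients,
\[
\mathbb{E}\left\| \sum_{k} x_k \right\|^2 = \sum_{k} \mathbb{E}\left\| x_k \right\|^2 ,
\]
so the aggregate variance is controlled by the per-client variance bound of Assumption 3.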
We further note that
Firstly, for bound A1, we can obtain

Secondly, according to Eq. (4), \(\omega^{t+1} = \frac{N}{m} \sum_{k \in s_t} p_k \omega_{k}^{t+1}\); therefore, we have \(\nabla F(\omega^{t+1}) = \frac{N}{m} \sum_{k \in s_{t+1}} p_k \nabla F_k(\omega^{t+1})\) [24]. For bound A2, we can obtain
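The gradient identity is just linearity of the weighted global objective; assuming, as in [24], that \(F\) is the weighted sum of the local objectives, we have
\[
F(\omega) = \sum_{k=1}^{N} p_k F_k(\omega) \quad \Longrightarrow \quad \nabla F(\omega) = \sum_{k=1}^{N} p_k \nabla F_k(\omega) ,
\]
and the sampled form \(\frac{N}{m} \sum_{k \in s_{t+1}} p_k \nabla F_k(\omega)\) matches this in expectation when each client is sampled uniformly with probability \(\frac{m}{N}\).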
According to the Cauchy-Bunyakovsky-Schwarz inequality, we have
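The form of the inequality typically applied at this step bounds the squared norm of a sum of \(m\) vectors:
\[
\left\| \sum_{i=1}^{m} x_i \right\|^2 \le m \sum_{i=1}^{m} \left\| x_i \right\|^2 ,
\]
obtained by applying Cauchy-Bunyakovsky-Schwarz to the inner product of \((\|x_1\|, \dots, \|x_m\|)\) with the all-ones vector.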
By using Assumption 1, we can obtain
By using Lemma 1, we can derive the bound of A2 as
Upon substituting Eq. (31) into Eq. (27), we arrive at the upper bound for A1 as follows:
By combining the results of Eq. (25), Eq. (26) and Eq. (32), we can obtain
Since \(\eta_t = \frac{1}{\rho} \sqrt{\frac{1}{T}}\) and \(\sqrt{\frac{1}{T}} \le 1\) for \(T \ge 1\), we have \(0 \le \eta_t \le \frac{1}{\rho}\), and we can obtain

Dividing both sides by \(\frac{\eta_t}{2}\), we have

According to Eq. (13), we have \(\sum_{k\in S_t} p_k^2 = \sum_{k\in S_t} {\left( \frac{M_k^t}{mt} \right)}^2\). Since \(\eta_t = \frac{1}{\rho} \sqrt{\frac{1}{T}}\), summing Eq. (35) from \(t=0\) to \(T-1\) gives

where \(\omega ^*\) is the optimal solution.
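For intuition, the typical shape of the resulting bound, obtained by telescoping the per-round descent inequality over \(t = 0, \dots, T-1\) with \(\eta_t = \frac{1}{\rho}\sqrt{\frac{1}{T}}\) (a hedged sketch of the standard form, not the paper's exact constants), is
\[
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\left\| \nabla F(\omega^t) \right\|^2 \le \frac{2\rho \left( F(\omega^0) - F(\omega^*) \right)}{\sqrt{T}} + \frac{C}{\sqrt{T}} ,
\]
where \(C\) collects the variance and heterogeneity terms, including the participation weights \(\sum_{k \in S_t} p_k^2\); the telescoping sum \(\sum_{t} \left( F(\omega^t) - F(\omega^{t+1}) \right) = F(\omega^0) - F(\omega^T) \le F(\omega^0) - F(\omega^*)\) supplies the first term.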
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tan, L., Hu, M., Zhou, Y., Wu, D. (2023). Analyzing the Convergence of Federated Learning with Biased Client Participation. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science(), vol 14177. Springer, Cham. https://doi.org/10.1007/978-3-031-46664-9_29
DOI: https://doi.org/10.1007/978-3-031-46664-9_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46663-2
Online ISBN: 978-3-031-46664-9