Abstract
In recent years, research on lightweight networks has made rapid progress in the field of image Super-Resolution (SR). Although lightweight SR networks are computationally efficient and parameter-saving, their simplified structures inevitably limit performance. To further enhance the efficacy of lightweight networks, we propose a Knowledge-Distillation-Warm-Start (KDWS) training strategy. During warm-start training, this strategy further optimizes the lightweight network using dark knowledge from a traditional large-scale SR network, and it empirically improves the performance of lightweight models. For our experiments, we chose several traditional large-scale SR networks and lightweight networks as teacher and student networks, respectively. The student network is first trained with a conventional warm-start strategy, and then receives additional supervision from the teacher network during further warm-start training. Evaluation on common test datasets shows that the proposed training strategy yields better performance for lightweight SR networks. Moreover, because the approach is not tied to a particular network structure or task type, it can be adopted in the training of any deep learning network, not only for image SR.
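The two-stage procedure described in the abstract can be summarized as a short training loop. The sketch below is a minimal PyTorch illustration under our own assumptions: the toy `TinySRNet` student, the frozen pre-trained teacher, the pixel-level L1 distillation term, and the weight `alpha` are hypothetical placeholders for exposition, not the authors' released implementation.

```python
# Minimal sketch of the KDWS idea: a conventional warm-start stage followed
# by a warm-start stage with extra supervision from a frozen teacher.
# Network definitions, loss weighting, and the schedule are assumptions.
import torch
import torch.nn as nn


class TinySRNet(nn.Module):
    """Stand-in for a lightweight SR student network (hypothetical)."""

    def __init__(self, scale: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),  # rearranges channels into an upscaled image
        )

    def forward(self, x):
        return self.body(x)


def kdws_train(student, teacher, loader, device="cpu",
               warm_epochs=2, kd_epochs=2, alpha=0.5):
    """Stage 1: conventional warm-start with L1 loss against ground truth.
    Stage 2: continue warm-start training, adding the frozen teacher's
    outputs as an auxiliary supervision signal (the 'dark knowledge')."""
    l1 = nn.L1Loss()
    opt = torch.optim.Adam(student.parameters(), lr=1e-4)
    teacher.eval()  # the large teacher stays frozen throughout

    for epoch in range(warm_epochs + kd_epochs):
        distill = epoch >= warm_epochs  # switch to stage 2
        for lr_img, hr_img in loader:
            lr_img, hr_img = lr_img.to(device), hr_img.to(device)
            sr = student(lr_img)
            loss = l1(sr, hr_img)  # reconstruction loss
            if distill:
                with torch.no_grad():
                    t_sr = teacher(lr_img)  # teacher's super-resolved output
                loss = loss + alpha * l1(sr, t_sr)  # distillation term
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```

In practice, `teacher` would be a pre-trained large-scale SR model (e.g. an EDSR- or RCAN-style network) and `loader` would yield paired low-/high-resolution patches; since the strategy only changes the loss and training schedule, it transfers to other architectures and tasks unchanged.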
Acknowledgments: This work is supported by the Ministry of Science and Technology of China (No. G2022036009L), the Open Fund of Intelligent Terminal Key Laboratory of Sichuan Province (No. SCTLAB-2007), the Yibin Science and Technology Program (No. 2021CG003), and the Science and Technology Program of Yibin Sanjiang New Area (No. 2023SJXQYBKJJH001).