Abstract
Knowledge Distillation (KD) aims to learn a compact student network using knowledge from a large pre-trained teacher network, where both networks are trained on data from the same distribution. In practical applications, however, the student network may be required to perform in a new scenario (i.e., the target domain) that differs significantly from the known scenario of the teacher network (i.e., the source domain). Traditional domain adaptation techniques can be integrated with KD in a two-stage process to bridge the domain gap, but the reliability of such two-stage approaches tends to be limited by their high computational cost and the errors accumulated across the two stages. To solve this problem, we propose a new one-stage method dubbed “Direct Distillation between Different Domains” (4Ds). We first design a learnable adapter based on the Fourier transform to separate the domain-invariant knowledge from the domain-specific knowledge. Then, we build a fusion-activation mechanism to transfer the valuable domain-invariant knowledge to the student network, while simultaneously encouraging the adapter within the teacher network to learn the domain-specific knowledge of the target data. As a result, the teacher network can effectively transfer categorical knowledge aligned with the target domain of the student network. Extensive experiments on various benchmark datasets demonstrate that our proposed 4Ds method produces reliable student networks and outperforms state-of-the-art approaches. Code is available at https://github.com/tangjialiang97/4Ds.
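The Fourier-based separation of domain-invariant from domain-specific knowledge can be illustrated with a common frequency-domain heuristic: the phase of a feature map's spectrum is often treated as carrying domain-invariant structure, while the amplitude carries domain-specific style. The sketch below recombines the phase of one feature map with the amplitude of another; this fixed recombination is only an illustration, not the paper's learnable adapter, and the function name `amplitude_phase_swap` is our own.

```python
import numpy as np

def amplitude_phase_swap(feat_src, feat_tgt):
    """Keep the phase of feat_src, take the amplitude of feat_tgt.

    Illustrative heuristic only: phase ~ domain-invariant structure,
    amplitude ~ domain-specific style. The 4Ds adapter learns this
    separation rather than fixing it by hand.
    """
    F_src = np.fft.fft2(feat_src)
    F_tgt = np.fft.fft2(feat_tgt)
    # Rebuild a spectrum from target amplitude and source phase.
    recombined = np.abs(F_tgt) * np.exp(1j * np.angle(F_src))
    # Inverse transform; the input is real, so drop the ~0 imaginary part.
    return np.fft.ifft2(recombined).real

rng = np.random.default_rng(1)
a = rng.standard_normal((16, 16))  # "source-style" feature map
b = rng.standard_normal((16, 16))  # "target-style" feature map
out = amplitude_phase_swap(a, b)
assert out.shape == (16, 16)
```

Swapping a feature map's amplitude with itself reconstructs the original exactly, which is a convenient sanity check for this kind of decomposition.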
Notes
- 1.
Each Fourier coefficient is computed as \(\boldsymbol{\mathcal {F}}^{T}_{\text {ad}}(u, v)=\frac{1}{H W} \sum _{h=1}^{H} \sum _{w=1}^{W} \textbf{f}_{\text {ad}}^{T}(h, w) e ^{-\textrm{i} 2 \pi \left( \frac{u h}{H}+\frac{v w}{W}\right) }=\boldsymbol{\mathcal {F}}^{T}_{\text {ad}\_\text {real}}(u, v)+\textrm{i} \boldsymbol{\mathcal {F}}^{T}_{\text {ad}\_\text {img}}(u, v),\) where \(\textrm{i}\) is the imaginary unit, and \(\boldsymbol{\mathcal {F}}^{T}_{\text {ad}\_\text {real}}\) and \(\boldsymbol{\mathcal {F}}^{T}_{\text {ad}\_\text {img}}\) denote the real and imaginary parts, respectively.
- 2.
Each \(\textbf{f}_{\text {ift}}^{T}(h, w)\) is computed as: \(\textbf{f}_{\text {ift}}^{T}(h, w)=\frac{1}{U V} \sum _{u=1}^{U} \sum _{v=1}^{V} \boldsymbol{\mathcal {F}}_{\text {ref}}^{T}(u, v) e ^{\textrm{i} 2 \pi \left( \frac{u h}{U}+\frac{v w}{V}\right) }\).
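The two footnote formulas (forward 2-D DFT with real/imaginary split, and the inverse DFT back to features) can be sketched with `numpy.fft`. Note that `np.fft.fft2` follows the standard convention with no \(1/(HW)\) factor on the forward pass, whereas the footnotes place normalization on both transforms, so the scaling here is adjusted explicitly; the single-channel toy feature map is our own stand-in for \(\textbf{f}_{\text{ad}}^{T}\).

```python
import numpy as np

# Toy teacher feature map f_ad^T of spatial size H x W (one channel for clarity).
H, W = 8, 8
rng = np.random.default_rng(0)
f_ad = rng.standard_normal((H, W))

# Footnote 1: forward 2-D DFT, split into real and imaginary parts.
# fft2 has no 1/(HW) factor, so apply the paper's normalization by hand.
F_ad = np.fft.fft2(f_ad) / (H * W)
F_ad_real, F_ad_img = F_ad.real, F_ad.imag

# Footnote 2: inverse DFT of a (possibly refined) spectrum F_ref.
# With the unrefined spectrum, the round trip recovers the input feature map.
F_ref = F_ad  # placeholder for the refined spectrum produced by the adapter
f_ift = np.fft.ifft2(F_ref * (H * W)).real  # undo the forward scaling

assert np.allclose(f_ift, f_ad)
```

In the paper the inverse transform is applied to a refined spectrum \(\boldsymbol{\mathcal {F}}_{\text {ref}}^{T}\); here the identity round trip simply verifies that the transform pair is consistent.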
Acknowledgment
C. Gong was supported by NSF of China (Nos: 62336003, 12371510), NSF of Jiangsu Province (No: BZ2021013), NSF for Distinguished Young Scholar of Jiangsu Province (No: BK20220080), the Fundamental Research Funds for the Central Universities (Nos: 30920032202, 30921013114), the “111” Program (No: B13022). H. Zhu was supported by A*STAR AME Programmatic Funding (No: A18A2b0046), the RobotHTPO Seed Fund under Project (No: C211518008), and the EDB Space Technology Development Grant under Project (No: S22-19016-STDP). M. Sugiyama was supported by JST CREST Grant Number JPMJCR18A2 and a grant from Apple, Inc. J. Zhou was supported by the National Research Foundation, Prime Minister’s Office, Singapore, and the Ministry of Communications and Information, under its Online Trust and Safety (OTS) Research Programme (No: MCI-OTS-001) and SERC Central Research Fund (Use-inspired Basic Research). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of Apple, Inc., National Research Foundation, Singapore, or the Ministry of Communications and Information.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tang, J. et al. (2025). Direct Distillation Between Different Domains. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15138. Springer, Cham. https://doi.org/10.1007/978-3-031-72989-8_9
DOI: https://doi.org/10.1007/978-3-031-72989-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72988-1
Online ISBN: 978-3-031-72989-8