Abstract
Generative Adversarial Networks (GANs) have recently made notable progress in image synthesis, and many studies now use them for image editing. Editing a real image requires first embedding it in the latent space of the GAN: the latent code of the image is obtained, and the image is then modified by altering that code. Accurately reconstructing the real image from the obtained latent code, however, remains a challenge. This paper introduces a novel inversion scheme that achieves high accuracy. Whereas conventional approaches use a single encoder for image inversion, our method uses collaborative encoders. Specifically, two encoders invert distinct regions of the image, the face and the background, which reduces the burden on any single encoder. In addition, we adopt a lightweight network structure to speed up inference. Experimental results demonstrate that the proposed method significantly improves visual quality and inference speed, thus enabling more effective image editing.
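The division of labor between the two encoders can be illustrated with a minimal sketch. The abstract does not state how the two regional reconstructions are recombined, so the mask-based compositing below is an assumption, and every name in it (`apply_mask`, `composite`, `collaborative_invert`, and the `invert_face` / `invert_background` callables that stand in for "encode, then generate") is hypothetical rather than taken from the paper.

```python
# Illustrative sketch only: the region inverters and the mask are
# hypothetical stand-ins, not the networks described in the paper.
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values
Mask = List[List[int]]     # 1 = face pixel, 0 = background pixel
Inverter = Callable[[Image], Image]  # stand-in for "encode, then generate"


def apply_mask(image: Image, mask: Mask, keep: int) -> Image:
    """Zero out every pixel whose mask value differs from `keep`."""
    return [
        [px if m == keep else 0.0 for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


def composite(face: Image, background: Image, mask: Mask) -> Image:
    """Take face pixels where mask == 1 and background pixels elsewhere."""
    return [
        [f if m == 1 else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(face, background, mask)
    ]


def collaborative_invert(
    image: Image,
    mask: Mask,
    invert_face: Inverter,
    invert_background: Inverter,
) -> Image:
    """Reconstruct each region with its own inverter, then merge them."""
    face_recon = invert_face(apply_mask(image, mask, keep=1))
    bg_recon = invert_background(apply_mask(image, mask, keep=0))
    return composite(face_recon, bg_recon, mask)


# Usage with identity stand-ins: a perfect inverter for each region
# reproduces the input image exactly.
identity: Inverter = lambda region: region
image = [[0.1, 0.2], [0.3, 0.4]]
mask = [[1, 0], [0, 1]]
result = collaborative_invert(image, mask, identity, identity)
```

In this sketch each inverter sees only its own region, which mirrors the idea of reducing the burden on a single encoder. In a real system the mask would have to come from some segmentation or matting step, which is likewise not specified in the abstract.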
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Nos. 62072002, 62172004 and U19A2064), and Special Fund for Anhui Agriculture Research System (2021–2025).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liu, Y., Zheng, C., Zhang, J., Wang, B., Chen, P. (2023). Collaborative Encoder for Accurate Inversion of Real Face Image. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14087. Springer, Singapore. https://doi.org/10.1007/978-981-99-4742-3_41
DOI: https://doi.org/10.1007/978-981-99-4742-3_41
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4741-6
Online ISBN: 978-981-99-4742-3
eBook Packages: Computer Science (R0)