Collaborative Encoder for Accurate Inversion of Real Face Image

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14087)

Abstract

Generative Adversarial Networks (GANs) have made notable progress in image synthesis, and many studies now use them for image editing. Editing a real image requires first embedding it into the latent space of the GAN: the latent code of the image is obtained, and the image is then modified by altering that code. Accurately reconstructing the real image from the obtained latent code, however, remains a challenge. This paper introduces a novel inversion scheme that achieves high accuracy. In contrast to conventional approaches that employ a single encoder for image inversion, our method uses collaborative encoders. Specifically, two encoders invert distinct regions of the image, namely the face and the background. Distributing the inversion task between these encoders reduces the burden on any single encoder. For efficiency, the paper also adopts a lightweight network structure, which yields faster inference. Experimental results demonstrate that the proposed method significantly enhances visual quality and improves inference speed, thus enabling more effective image editing.
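
The region split described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say how the face mask is obtained, what the encoders look like, or how the two latent codes are combined, so the mask below is hand-written, the "encoders" are hypothetical fixed random linear maps, and the sketch stops once each region has its own code. All names (`split_regions`, `make_linear_encoder`, `LATENT_DIM`) are assumptions made for illustration.

```python
import random

LATENT_DIM = 8  # assumed toy latent size, not taken from the paper


def split_regions(image, face_mask):
    """Split a grayscale image into face and background parts with a binary mask."""
    face = [[p * m for p, m in zip(row, mrow)]
            for row, mrow in zip(image, face_mask)]
    background = [[p * (1 - m) for p, m in zip(row, mrow)]
                  for row, mrow in zip(image, face_mask)]
    return face, background


def make_linear_encoder(num_pixels, latent_dim, seed):
    """Hypothetical stand-in for a learned encoder: a fixed random linear map."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(num_pixels)]
               for _ in range(latent_dim)]

    def encode(region):
        flat = [p for row in region for p in row]
        return [sum(w * p for w, p in zip(wrow, flat)) for wrow in weights]

    return encode


# A 4x4 toy "image" whose centre 2x2 block plays the role of the face.
image = [[0.1, 0.2, 0.3, 0.4],
         [0.5, 0.9, 0.8, 0.6],
         [0.7, 0.9, 0.8, 0.5],
         [0.2, 0.3, 0.4, 0.1]]
face_mask = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]

face, background = split_regions(image, face_mask)

# Each region gets its own encoder, so neither has to model the whole image.
face_encoder = make_linear_encoder(16, LATENT_DIM, seed=0)
background_encoder = make_linear_encoder(16, LATENT_DIM, seed=1)
w_face = face_encoder(face)
w_background = background_encoder(background)
```

Because the mask is binary, the two regions are disjoint and add back up to the original image, so no pixel is handled by both encoders or by neither.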

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 62072002, 62172004 and U19A2064) and the Special Fund for Anhui Agriculture Research System (2021–2025).

Corresponding author

Correspondence to Peng Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, Y., Zheng, C., Zhang, J., Wang, B., Chen, P. (2023). Collaborative Encoder for Accurate Inversion of Real Face Image. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14087. Springer, Singapore. https://doi.org/10.1007/978-981-99-4742-3_41

  • DOI: https://doi.org/10.1007/978-981-99-4742-3_41

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4741-6

  • Online ISBN: 978-981-99-4742-3

  • eBook Packages: Computer Science, Computer Science (R0)
