Reconstruction of Multi-view Video Based on GAN

Li, Song; Lan, Chengdong; Zhao, Tiesong

doi:10.1007/978-3-030-00767-6_57


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11165)

Included in the following conference series:

Pacific Rim Conference on Multimedia


Abstract

Multi-view video contains a huge amount of data, which poses enormous challenges for the compression, storage, and transmission of video data. A common prior solution is to transmit only part of the viewpoint information and reconstruct the original multi-viewpoint information from it. Existing methods of this kind all rely on pixel matching to obtain the correlation between adjacent viewpoint images. However, pixels cannot express the invariance of image features and are susceptible to noise. To overcome these problems, we use the VGG network to extract high-dimensional features from the images, which indicate the relevance of adjacent images. A GAN is then used to generate virtual viewpoint images more accurately. We extract the lines at the same positions of the viewpoints as local areas for image merging and input the resulting local images into the network. At the reconstruction stage, we generate a local image of a dense viewpoint through the GAN. Experiments on multiple test sequences show that the proposed method achieves a 0.2–0.8 dB PSNR and 0.15–0.61 MOS improvement over the traditional method.
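
To make two ideas in the abstract concrete, the following minimal sketch shows how the same line can be taken from every viewpoint and merged into one local image, and how PSNR (the reported objective metric) is conventionally computed. It is not the authors' implementation: the VGG feature extraction and the GAN are not reproduced, the function names `stack_rows` and `psnr` are hypothetical, the images are toy grayscale data, and an 8-bit peak value of 255 is assumed.

```python
import math


def stack_rows(views, row):
    """Take the same row from every viewpoint image and stack the rows.

    views: list of grayscale images, each a list of rows of pixel values.
    Returns a local image with one line per viewpoint (assumed layout).
    """
    return [list(view[row]) for view in views]


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    total, count = 0.0, 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for a, b in zip(ref_row, rec_row):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Three hypothetical 2x4 viewpoint images (toy data, not from the paper).
views = [
    [[10, 20, 30, 40], [50, 60, 70, 80]],
    [[11, 21, 31, 41], [51, 61, 71, 81]],
    [[12, 22, 32, 42], [52, 62, 72, 82]],
]

# Local image built from row 0 of each viewpoint.
local = stack_rows(views, row=0)

# PSNR of one view against its neighbour, purely as a metric demonstration.
score = psnr(views[0], views[1])
```

In a sparse-to-dense setting, a local image of this kind would hold only the transmitted viewpoints, and a generator would be asked to fill in the lines of the missing ones; PSNR would then compare each reconstructed view with its ground truth.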



Acknowledgement

This research was funded by the National Natural Science Foundation of China (No. 61671152, No. 61471124) and the Natural Science Foundation of Fujian Province of China (No. 2014J01234, No. 2017J01757).

Author information

Authors and Affiliations

School of Physics and Information Engineering, Fuzhou University, Fuzhou, China
Song Li, Chengdong Lan & Tiesong Zhao


Corresponding author

Correspondence to Chengdong Lan.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo


About this paper

Cite this paper

Li, S., Lan, C., Zhao, T. (2018). Reconstruction of Multi-view Video Based on GAN. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_57


DOI: https://doi.org/10.1007/978-3-030-00767-6_57
Published: 19 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science, Computer Science (R0)
