Abstract
Multi-view video contains a huge amount of data, which poses enormous challenges for the compression, storage, and transmission of video. A common existing solution is to transmit only part of the viewpoint information and reconstruct the original multi-viewpoint information from it. Existing methods of this kind all rely on pixel matching to obtain the correlation between adjacent viewpoint images. However, pixel values do not capture invariant image features and are susceptible to noise. To overcome these problems, we use a VGG network to extract high-dimensional features that represent the correlation between adjacent images, and we further use a GAN to generate virtual viewpoint images more accurately. We extract the lines at the same positions across the viewpoints, merge them into local images, and feed these local images into the network. For the viewpoints to be reconstructed, the GAN generates the local images of the dense viewpoints. Experiments on multiple test sequences show that the proposed method improves PSNR by 0.2–0.8 dB and MOS by 0.15–0.61 over the traditional method.
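The line-extraction step and the PSNR metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes grayscale views stored as nested lists of pixel rows and an 8-bit peak value of 255. The function names `extract_line_image` and `psnr` are hypothetical and are not taken from the paper, and the VGG feature extraction and GAN generation steps are not shown.

```python
import math


def extract_line_image(views, row):
    """Stack the same pixel row from every viewpoint into one local image.

    views: list of grayscale images, one per viewpoint, each a list of rows.
    Returns a list with one line per viewpoint (assumed layout, for illustration).
    """
    return [list(view[row]) for view in views]


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [p for line in reference for p in line]
    rec = [p for line in reconstructed for p in line]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Three hypothetical 2x3 views; take row 1 from each to form a 3x3 local image.
views = [
    [[10, 11, 12], [13, 14, 15]],
    [[20, 21, 22], [23, 24, 25]],
    [[30, 31, 32], [33, 34, 35]],
]
local_image = extract_line_image(views, row=1)
```

Under these assumptions, a reported gain of 0.2–0.8 dB corresponds to the difference between the `psnr` values of two reconstructions measured against the same reference view.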
Acknowledgement
This research was funded by the National Natural Science Foundation of China (No. 61671152, No. 61471124) and the Natural Science Foundation of Fujian Province of China (No. 2014J01234, No. 2017J01757).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, S., Lan, C., Zhao, T. (2018). Reconstruction of Multi-view Video Based on GAN. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11165. Springer, Cham. https://doi.org/10.1007/978-3-030-00767-6_57
DOI: https://doi.org/10.1007/978-3-030-00767-6_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00766-9
Online ISBN: 978-3-030-00767-6
eBook Packages: Computer Science, Computer Science (R0)