A Learning-Based Framework for Low Bit-Rate Image and Video Coding

Xiong, Hongkai; Yuan, Zhe; Xu, Yang

doi:10.1007/978-3-642-10467-1_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5879)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

There is a major research effort under way to improve image and video coding efficiency by exploiting visual redundancy, alongside traditional predictive coding and transform coding. It is motivated by the fact that natural images can not only be decomposed into texture and piecewise-smooth parts called cartoon (e.g. edges), but can also be regarded as consisting of an overwhelming number of visual patterns generated by very diverse stochastic processes in nature. This paper explores integrating perceptual non-parametric sampling methods into a standardized video engine with structure-based prediction, and further suggests a learning-based framework for compressing image and video at low bit rates, incorporating effective state-of-the-art inference algorithms to pursue an online synthesis solution. A crucial component learns the relationship (projection) between the abstracted patches (visual patterns) and the corresponding detail (feature space) in a spatio-temporal manner. The experimental results show promising prospects for perceptual image and video coding.
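
To make the idea of learning a relationship between abstracted patches and their detail concrete, the following is a minimal toy sketch, not the authors' method. It assumes 1-D patches of even length, a hypothetical abstraction step that averages adjacent samples, and a nearest-neighbour lookup standing in for the learned projection; all function names are illustrative.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


def abstract_patch(patch: Sequence[float]) -> Vec:
    """Toy abstraction: average adjacent sample pairs (patch length must be even)."""
    return tuple((patch[i] + patch[i + 1]) / 2.0 for i in range(0, len(patch), 2))


def upsample(coarse: Sequence[float]) -> Vec:
    """Repeat each coarse sample twice to return to the original length."""
    return tuple(v for v in coarse for _ in range(2))


def detail_of(patch: Sequence[float]) -> Vec:
    """Detail = original patch minus its upsampled abstraction."""
    base = upsample(abstract_patch(patch))
    return tuple(p - b for p, b in zip(patch, base))


def learn_pairs(training_patches: Sequence[Sequence[float]]) -> List[Tuple[Vec, Vec]]:
    """Store (abstracted patch, detail) pairs observed in the training data."""
    return [(abstract_patch(p), detail_of(p)) for p in training_patches]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def synthesize(coarse: Sequence[float], pairs: List[Tuple[Vec, Vec]]) -> Vec:
    """Decoder side: add the detail of the nearest stored abstracted patch."""
    _, best_detail = min(pairs, key=lambda pair: sq_dist(pair[0], coarse))
    return tuple(b + d for b, d in zip(upsample(coarse), best_detail))


# Usage: only the coarse (abstracted) patch is transmitted; detail is synthesized.
pairs = learn_pairs([(0, 2, 4, 6), (5, 5, 1, 1)])
restored = synthesize((1.0, 5.0), pairs)
```

In this sketch only the abstracted patch would need to be coded, and the decoder restores plausible detail from previously seen examples. A practical system would operate on 2-D or spatio-temporal patches and could replace the lookup with a trained regressor.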

Author information

Authors and Affiliations

Dept. Electronic Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Hongkai Xiong, Zhe Yuan & Yang Xu

Editor information

Editors and Affiliations

Naresuan University, 65000, Phitsanulok, Thailand
Paisarn Muneesawang
Microsoft Research Asia, 100109, Beijing, China
Feng Wu
Tokyo Institute of Technology, 226-8503, Yokohama, Japan
Itsuo Kumazawa
Mahanakorn University of Technology, 10530, Bangkok, Thailand
Athikom Roeksabutr
Institute of Information Science, Academia Sinica, Taipei, Taiwan
Mark Liao
Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Xiaoou Tang

About this paper

Cite this paper

Xiong, H., Yuan, Z., Xu, Y. (2009). A Learning-Based Framework for Low Bit-Rate Image and Video Coding. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_20

DOI: https://doi.org/10.1007/978-3-642-10467-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science, Computer Science (R0)
