Abstract
A major research effort is under way to improve image and video coding efficiency by exploiting visual redundancy, alongside traditional predictive coding and transform coding. It is motivated by the fact that natural images can not only be decomposed into texture and piecewise-smooth parts called cartoon (e.g., edges), but can also be regarded as consisting of an overwhelming number of visual patterns generated by very diverse stochastic processes in nature. This paper explores integrating perceptual non-parametric sampling methods into a standardized video engine with structure-based prediction, and further suggests a learning-based framework for compressing images and video at low bit rates, incorporating effective state-of-the-art inference algorithms to pursue an online synthesis solution. A crucial component is presented that learns the relationship (projection) between the abstracted patches (visual patterns) and the corresponding details (feature space) in a spatio-temporal manner. The experimental results show promising prospects for perceptual image and video coding.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xiong, H., Yuan, Z., Xu, Y. (2009). A Learning-Based Framework for Low Bit-Rate Image and Video Coding. In: Muneesawang, P., Wu, F., Kumazawa, I., Roeksabutr, A., Liao, M., Tang, X. (eds) Advances in Multimedia Information Processing - PCM 2009. PCM 2009. Lecture Notes in Computer Science, vol 5879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10467-1_20
DOI: https://doi.org/10.1007/978-3-642-10467-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10466-4
Online ISBN: 978-3-642-10467-1
eBook Packages: Computer Science, Computer Science (R0)