Learning Segmentation of Documents with Complex Scripts

Sesh Kumar, K. S.; Namboodiri, Anoop M.; Jawahar, C. V.

doi:10.1007/11949619_67

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4338)

Abstract

Most state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds while assuming a simple script structure, as in Roman script. They perform poorly on Indian languages, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information from the script structure into the objective function of the cut. We show that this information can be learned automatically and adapted within a collection of documents (a book) and across collections to achieve accurate segmentation. We show results on Indian language documents in Telugu script. The approach is also applicable to other languages with complex scripts, such as Bangla, Kannada, Malayalam, and Urdu.
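
To make the graph cut idea concrete, here is a minimal illustrative sketch. It is not the formulation used in the chapter, whose full text is not reproduced here. It assumes that each connected component is reduced to a centroid, that the script prior is just two hypothetical scale parameters `sx` and `sy` in an edge affinity, and that one seed component is known for each of two text lines. The function names, the affinity form, and the sample coordinates are all assumptions made for illustration.

```python
import math
from collections import deque


def affinity(a, b, sx, sy):
    """Similarity of two centroids. sx and sy are hypothetical stand-ins
    for a learned script prior (tolerated horizontal/vertical spread)."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    return math.exp(-((dx / sx) ** 2 + (dy / sy) ** 2))


def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max flow on a capacity matrix. Returns the set of
    nodes still reachable from s once no augmenting path remains,
    i.e. the source side of a minimum s-t cut."""
    n = len(cap)
    res = [row[:] for row in cap]  # residual capacities
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return {v for v in range(n) if parent[v] != -1}
        # Find the bottleneck along the path, then push flow along it.
        flow, v = math.inf, t
        while v != s:
            flow = min(flow, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            res[parent[v]][v] -= flow
            res[v][parent[v]] += flow
            v = parent[v]


def segment_two_lines(points, s, t, sx, sy):
    """Split components into two lines by a minimum cut between seeds s and t."""
    n = len(points)
    cap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            cap[i][j] = cap[j][i] = affinity(points[i], points[j], sx, sy)
    return min_cut_source_side(cap, s, t)


# Made-up layout: three components on an upper line (y = 0), one
# modifier-like component hanging below it (y = 6), three on a lower line (y = 20).
points = [(0, 0), (10, 0), (20, 0), (10, 6), (0, 20), (10, 20), (20, 20)]
upper = segment_two_lines(points, s=0, t=4, sx=10.0, sy=8.0)
print(sorted(upper))
```

With these made-up values the non-collinear component (index 3) ends up on the same side of the cut as the upper line, because its affinity to that line dominates the cost of the cut. In a learned setting, parameters such as `sx` and `sy` would be estimated from the data rather than fixed by hand.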

Author information

Authors and Affiliations

Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
K. S. Sesh Kumar, Anoop M. Namboodiri & C. V. Jawahar

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, IIT Delhi, New Delhi, India
Prem K. Kalra
School of Computer Science and Engineering, The Hebrew University of Jerusalem, 91904, Jerusalem, Israel
Shmuel Peleg

About this paper

Cite this paper

Sesh Kumar, K.S., Namboodiri, A.M., Jawahar, C.V. (2006). Learning Segmentation of Documents with Complex Scripts. In: Kalra, P.K., Peleg, S. (eds) Computer Vision, Graphics and Image Processing. Lecture Notes in Computer Science, vol 4338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11949619_67

DOI: https://doi.org/10.1007/11949619_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68301-8
Online ISBN: 978-3-540-68302-5
eBook Packages: Computer Science, Computer Science (R0)
