Learning Segmentation of Documents with Complex Scripts

  • Conference paper
Computer Vision, Graphics and Image Processing

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4338)

Abstract

Most state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds, but assume a simple script structure such as that of Roman script. They perform poorly on Indian languages, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information from the script structure into the objective function of the cut. We show that this information can be learned automatically and adapted both within a collection of documents (a book) and across collections to achieve accurate segmentation. We show results on Indian language documents in Telugu script. The approach is also applicable to other languages with complex scripts, such as Bangla, Kannada, Malayalam, and Urdu.
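As a rough illustration of what posing segmentation as a graph cut can look like, the sketch below groups connected components into two text lines with a plain s-t minimum cut. It is not the formulation used in the paper: the `affinity` function, the scales `sx` and `sy` (stand-ins for a learned script prior), the seed indices, and the toy coordinates are all assumptions made for this example.

```python
import math
from collections import deque

def affinity(a, b, sx, sy):
    """Edge weight between two component centres (x, y).
    sx and sy are hypothetical scales standing in for a learned script prior:
    a larger sy tolerates components that sit off the baseline."""
    return math.exp(-abs(a[0] - b[0]) / sx - abs(a[1] - b[1]) / sy)

def source_side_of_min_cut(cap, s, t):
    """Edmonds-Karp max-flow on a dense capacity matrix.
    Returns the set of nodes still reachable from s once no augmenting
    path remains, i.e. the source side of a minimum s-t cut."""
    n = len(cap)
    res = [row[:] for row in cap]  # residual capacities
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # t is unreachable: parent now marks the source side of the cut
            return {v for v in range(n) if parent[v] != -1}
        # find the bottleneck along the augmenting path, then push flow
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u

def segment_two_lines(centres, s, t, sx, sy):
    """Split components into two groups, seeded by components s and t."""
    n = len(centres)
    cap = [[0.0 if i == j else affinity(centres[i], centres[j], sx, sy)
            for j in range(n)] for i in range(n)]
    return source_side_of_min_cut(cap, s, t)

# Toy page: three glyphs on line 1 (y=0), one modifier hanging below it (y=8),
# and three glyphs on line 2 (y=20). All coordinates are made up.
centres = [(0, 0), (10, 0), (20, 0), (10, 8), (0, 20), (10, 20), (20, 20)]
line1 = segment_two_lines(centres, s=0, t=4, sx=10.0, sy=6.0)
print(sorted(line1))  # [0, 1, 2, 3]
```

In this toy setup the off-baseline component (index 3) ends up with the first line because its summed affinity to that line is larger than to the second one. Changing `sx` and `sy` changes which cut is cheapest, which is the kind of knob a learned prior would set.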

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sesh Kumar, K.S., Namboodiri, A.M., Jawahar, C.V. (2006). Learning Segmentation of Documents with Complex Scripts. In: Kalra, P.K., Peleg, S. (eds) Computer Vision, Graphics and Image Processing. Lecture Notes in Computer Science, vol 4338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11949619_67

  • DOI: https://doi.org/10.1007/11949619_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68301-8

  • Online ISBN: 978-3-540-68302-5

  • eBook Packages: Computer Science, Computer Science (R0)
