Abstract
Most state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds while assuming a simple script structure, such as that of Roman script. They perform poorly on Indian-language documents, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information about script structure into the objective function of the cut. We show that this information can be learned automatically and adapted within a collection of documents (a book) and across collections to achieve accurate segmentation. We show results on Indian-language documents in Telugu script. The approach is also applicable to other languages with complex scripts, such as Bangla, Kannada, Malayalam, and Urdu.
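The abstract does not state the objective function, so the following is only an illustrative sketch of how a script prior can enter a graph cut, not the formulation used in the paper. It assumes a two-line toy problem in which each connected component pays a unary cost equal to its vertical distance from a candidate line, and linked components pay a pairwise penalty when they are assigned to different lines. The names `min_cut_labels`, `segment_two_lines`, the component coordinates, and the link weights are all hypothetical; the weights stand in for the learned prior. The energy is minimised exactly with a small Edmonds-Karp max-flow.

```python
from collections import deque, defaultdict

def min_cut_labels(unary, pairwise):
    """Minimise sum_i D_i(l_i) + sum_(i,j) w_ij * [l_i != l_j] over binary labels.

    unary: {node: (cost_if_label_0, cost_if_label_1)}
    pairwise: {(i, j): non-negative weight}
    """
    S, T = "_source", "_sink"
    cap = defaultdict(lambda: defaultdict(float))
    for n, (c0, c1) in unary.items():
        cap[S][n] += c1  # this edge is cut when n takes label 1 (sink side)
        cap[n][T] += c0  # this edge is cut when n takes label 0 (source side)
    for (i, j), w in pairwise.items():
        cap[i][j] += w
        cap[j][i] += w

    def bfs():
        # Breadth-first search over edges with remaining residual capacity.
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == T:
                        return parent
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if T not in parent:
            break  # no augmenting path left: the flow is maximal
        flow, v = float("inf"), T
        while parent[v] is not None:  # find the bottleneck on the path
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = T
        while parent[v] is not None:  # push flow and update residuals
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u
    # Nodes still reachable from the source form the label-0 side of the cut.
    return {n: 0 if n in parent else 1 for n in unary}

def segment_two_lines(components, y_top, y_bottom, links):
    """Assign components (name -> vertical centre) to the top (0) or bottom (1) line."""
    unary = {n: (abs(y - y_top), abs(y - y_bottom)) for n, y in components.items()}
    return min_cut_labels(unary, links)

# Hypothetical data: 'm' is a below-base modifier of 'a'; by distance alone it
# is closer to the bottom line (9 vs 11), but it is strongly linked to 'a'.
components = {"a": 10, "b": 10, "m": 21, "c": 30, "d": 30}
links = {("a", "m"): 5, ("a", "b"): 3, ("c", "d"): 3, ("m", "c"): 1}
labels = segment_two_lines(components, y_top=10, y_bottom=30, links=links)
print(labels)
```

With these assumed weights the modifier `m` stays with its base component on the top line, whereas with `links={}` it is assigned to the bottom line purely by distance. This illustrates why non-collinear components call for structural information in the cut objective.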
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sesh Kumar, K.S., Namboodiri, A.M., Jawahar, C.V. (2006). Learning Segmentation of Documents with Complex Scripts. In: Kalra, P.K., Peleg, S. (eds) Computer Vision, Graphics and Image Processing. Lecture Notes in Computer Science, vol 4338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11949619_67
DOI: https://doi.org/10.1007/11949619_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68301-8
Online ISBN: 978-3-540-68302-5
eBook Packages: Computer Science, Computer Science (R0)