Abstract
Recognition of old Greek manuscripts is essential for quick and efficient exploitation of the content of valuable old Greek historical collections. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts written in lowercase letters. Exploiting the fact that most characters and character ligatures in these scripts contain closed cavity regions, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures. First, we detect the closed cavities that exist in the character body. Then, the protrusions in the outer contour of the connected components that contain the closed cavities are used to classify the area around each closed cavity as a specific character or character ligature. The proposed method gives highly accurate results and offers great assistance to old Greek handwritten manuscript OCR. We also present additional OCR applications that not only demonstrate the robustness of the proposed method but also show its generality in cases where segmentation and text location are very difficult to perform.
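To make the first step concrete, the following is a minimal sketch of one common way to find closed cavities in a binarized glyph: a background region counts as a closed cavity when it cannot be reached from the image border. This is an illustrative assumption, not necessarily the detection procedure used in the paper; the function name `find_closed_cavities`, the list-of-lists image format (1 = ink, 0 = background) and the toy `GLYPH` are all hypothetical.

```python
from collections import deque


def find_closed_cavities(image):
    """Return the closed cavities of a binary image.

    image: list of equal-length lists, 1 = ink, 0 = background.
    Each cavity is a set of (row, col) background pixels that is
    fully enclosed by ink, i.e. not connected to the image border.
    Background connectivity is 4-neighbour (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        # Breadth-first flood fill over one background region.
        region = set()
        touches_border = False
        queue = deque([(start_r, start_c)])
        seen[start_r][start_c] = True
        while queue:
            r, c = queue.popleft()
            region.add((r, c))
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and image[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return region, touches_border

    cavities = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                region, touches_border = flood(r, c)
                if not touches_border:
                    cavities.append(region)
    return cavities


# Hypothetical toy glyph: a small loop with a short protruding stroke.
GLYPH = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

cavities = find_closed_cavities(GLYPH)
print(len(cavities), cavities)
```

In a fuller pipeline, each detected cavity would then be paired with the connected component of ink that surrounds it, so that the protrusions on that component's outer contour can be measured and used for classification.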

















Acknowledgements
This research is carried out within the framework of the Greek GSRT-funded R&D project, D-SCRIBE, which aims to develop an integrated system for digitization and processing of old Greek manuscripts.
Cite this article
Gatos, B., Ntzios, K., Pratikakis, I. et al. An efficient segmentation-free approach to assist old Greek handwritten manuscript OCR. Pattern Anal Applic 8, 305–320 (2006). https://doi.org/10.1007/s10044-005-0013-7
DOI: https://doi.org/10.1007/s10044-005-0013-7