|
| 1 | + LayoutParser : A Unified Toolkit for Deep |
| 2 | + Learning Based Document Image Analysis |
| 3 | + |
| 4 | + |
| 5 | +Zejiang Shen 1, Ruochen Zhang 2, Melissa Dell 3, Benjamin Charles Germain |
| 6 | + Lee 4, Jacob Carlson 3, and Weining Li 5 |
| 7 | + |
| 8 | + 1 Allen Institute for AI |
| 9 | + shannons@allenai.org |
| 10 | + 2 Brown University |
| 11 | + ruochen_zhang@brown.edu |
| 12 | + 3 Harvard University |
| 13 | + {melissadell,jacob_carlson}@fas.harvard.edu |
| 14 | + 4 University of Washington |
| 15 | + bcgl@cs.washington.edu |
| 16 | + 5 University of Waterloo |
| 17 | + w422li@uwaterloo.ca |
| 18 | + |
| 19 | + |
| 20 | + |
| 21 | + Abstract. Recent advances in document image analysis (DIA) have been |
| 22 | + primarily driven by the application of neural networks. Ideally, research |
| 23 | + outcomes could be easily deployed in production and extended for further |
| 24 | + investigation. However, various factors like loosely organized codebases |
| 25 | + and sophisticated model configurations complicate the easy reuse of |
| 26 | + important innovations by a wide audience. Though there have been on-going |
| 27 | + efforts to improve reusability and simplify deep learning (DL) model |
| 28 | + development in disciplines like natural language processing and computer |
| 29 | + vision, none of them are optimized for challenges in the domain of DIA. |
| 30 | + This represents a major gap in the existing toolkit, as DIA is central to |
| 31 | + academic research across a wide range of disciplines in the social sciences |
| 32 | + and humanities. This paper introduces LayoutParser, an open-source |
| 33 | + library for streamlining the usage of DL in DIA research and |
| 34 | + applications. The core LayoutParser library comes with a set of simple and |
| 35 | + intuitive interfaces for applying and customizing DL models for layout |
| 36 | + detection, character recognition, and many other document processing tasks. |
| 37 | + To promote extensibility, LayoutParser also incorporates a community |
| 38 | + platform for sharing both pre-trained models and full document |
| 39 | + digitization pipelines. We demonstrate that LayoutParser is helpful for both |
| 40 | + lightweight and large-scale digitization pipelines in real-world use cases. |
| 41 | + The library is publicly available at https://layout-parser.github.io. |
| 42 | + |
| 43 | + Keywords: Document Image Analysis · Deep Learning · Layout Analysis |
| 44 | + · Character Recognition · Open Source Library · Toolkit. |
| 45 | + |
| 46 | +1 Introduction |
| 47 | + |
| 48 | +Deep Learning (DL)-based approaches are the state of the art for a wide range of |
| 49 | +document image analysis (DIA) tasks including document image classification [11, |