Layout analysis and text column segmentation for historical Vietnamese steles

Scius-Bertrand, Anna (Ecole Pratique des Hautes Etudes, PSL, Paris, France ; School of Engineering and Architecture (HEIA-FR), HES-SO // University of Applied Sciences Western Switzerland) ; Voegtlin, Lars (DIVA, University of Fribourg, Switzerland) ; Alberti, Michele (DIVA, University of Fribourg, Switzerland) ; Fischer, Andreas (School of Engineering and Architecture (HEIA-FR), HES-SO // University of Applied Sciences Western Switzerland ; DIVA, University of Fribourg, Fribourg, Switzerland) ; Bui, Marc (Ecole Pratique des Hautes Etudes, PSL, Paris, France)

Stone engravings on historical Vietnamese steles allow historians to study the life of common people in the villages. Only recently has a large number of images of such engravings become available. To support historians, automatic document analysis systems are needed for reading the ancient Chu Nôm characters, which are written in columns from top to bottom. In this paper, we study the problem of layout analysis, which is the first step of automatic reading. Semantic segmentation is applied at the pixel level to find the title, main text, label, and reference number on the page using deep convolutional neural networks. Afterwards, seam carving is used to segment the text columns within the main text. We present baseline results for one hundred exemplary pages, discuss error cases, and outline lines of future research.
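
As a rough illustration of the seam carving step mentioned in the abstract, the sketch below finds a minimum-energy top-to-bottom seam in a small ink-density map by dynamic programming. It is a generic textbook formulation, not the authors' implementation: the function name `min_vertical_seam`, the toy array `ink`, and the choice of ink density as the energy are all assumptions made for illustration.

```python
import numpy as np


def min_vertical_seam(energy):
    """Return, for each row, the column of the minimum-energy vertical seam.

    `energy` is a 2D array (rows x columns); a seam moves at most one
    column left or right from one row to the next.
    """
    h, w = energy.shape
    cost = energy.astype(float).copy()   # cumulative cost of the best path to each pixel
    back = np.zeros((h, w), dtype=int)   # column chosen in the previous row

    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 2, w)
            prev = lo + int(np.argmin(cost[y - 1, lo:hi]))
            back[y, x] = prev
            cost[y, x] += cost[y - 1, prev]

    # Trace the cheapest path back from the bottom row to the top row.
    seam = np.zeros(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 1, 0, -1):
        seam[y - 1] = back[y, seam[y]]
    return seam


# Toy example (assumed data): two inked text columns separated by a
# one-pixel blank gap at column index 4.
ink = np.ones((6, 9))
ink[:, 4] = 0.0
seam = min_vertical_seam(ink)
print(seam.tolist())
```

In this toy case the seam runs straight down the blank gap between the two inked regions, which is the behaviour one would want from a column separator; on real stele images the energy would have to be derived from the segmented main-text region.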


Conference Type:
full paper
Faculty:
Ingénierie et Architecture
School:
HEIA-FR
Institute:
iCoSys - Institut des systèmes complexes
Publisher:
Sydney, Australia, 20-21 September 2019
Date:
2019-09
Pagination:
6 p.
Published in:
HIP '19 : Proceedings of the 5th International Workshop on Historical Document Imaging and Processing, 20-21 September 2019, Sydney, Australia
Numeration (vol. no.):
pp. 84-89
DOI:
Appears in Collection:



 Record created 2020-01-17, last modified 2020-01-21
