Résumé

In this paper, we present an unsupervised feature learning method for page segmentation of historical handwritten documents available as color images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as either periphery, background, text block, or decoration. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we apply convolutional autoencoders to learn features directly from pixel intensity values. Then, using these features to train an SVM, we achieve high quality segmentation without any assumption of specific topologies and shapes. Experiments on three public datasets demonstrate the effectiveness and superiority of the proposed approach.

Détails

Actions