Selecting Autoencoder Features for Layout Analysis of Historical Documents

Wei, Hao; Seuret, Mathias; Chen, Kai; Fischer, Andreas; Liwicki, Marcus; Ingold, Rolf

doi:10.1145/2809544.2809548

2015


Abstract

Automatic layout analysis of historical documents has to cope with a large number of different scripts, writing supports, and digitalization qualities. Under these conditions, the design of robust features for machine learning is a highly challenging task. We use convolutional autoencoders to learn features from the images. In order to increase the classification accuracy and to reduce the feature dimension, in this paper we propose a novel feature selection method. The method cascades adapted versions of two conventional methods. Compared to three conventional methods and our previous work, the proposed method achieves a higher classification accuracy in most cases, while maintaining low feature dimension. In addition, we find that a significant number of autoencoder features are redundant or irrelevant for the classification, and we give our explanations. To the best of our knowledge, this paper is one of the first investigations in the field of image processing on the detection of redundancy and irrelevance of autoencoder features using feature selection.

Details

Title

Selecting Autoencoder Features for Layout Analysis of Historical Documents

Author(s)

Wei, Hao (University of Fribourg, Fribourg, Switzerland)
Seuret, Mathias (University of Fribourg, Fribourg, Switzerland)
Chen, Kai (University of Fribourg, Fribourg, Switzerland)
Fischer, Andreas (University of Fribourg, Fribourg, Switzerland ; School of Engineering and Architecture (HEIA-FR), HES-SO University of Applied Sciences Western Switzerland)
Liwicki, Marcus (University of Fribourg, Fribourg, Switzerland)
Ingold, Rolf (University of Fribourg, Fribourg, Switzerland)

Date

2015-08

Published in

Proceedings of HIP '15: Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, 22 August 2015, Nancy, France

Volume

pp. 55-62

Published by

Nancy, France, 22 August 2015

Pagination

8 p.

Presented at

Proceedings of the 3rd International Workshop on Historical Document Imaging and Processing, Nancy, France, 2015-08-22, 2015-08-22

ISBN

9781450336024

DOI

https://doi.org/10.1145/2809544.2809548

Keywords (free)

historical documents ; layout analysis ; autoencoders ; selected features ; accuracy ; feature dimension

Paper type

published full paper

Field

Engineering and Architecture

School

HEIA-FR

Institute

iCoSys - Institut d’intelligence artificielle et systèmes complexes

The document appears in

Conference papers
Global
