Résumé

One of the main challenges of automatically transcribing large collections of handwritten letters is to cope with the high variability of writing styles present in the collection. In particular, the writing styles of non-frequent writers, who have contributed only few letters, are often missing in the annotated learning samples used for training handwriting recognition systems. In this paper, we introduce the Bullinger dataset for writer adaptation, which is based on the Heinrich Bullinger letter collection from the 16th century, using a subset of 3,622 annotated letters (about 1.2 million words) from 306 writers. We provide baseline results for handwriting recognition with modern recognizers, before and after the application of standard techniques for supervised adaptation of frequent writers and self-supervised adaptation of non-frequent writers.

Détails

Actions

PDF