Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Zayene, Oussama (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia ; DIVA Group, University of Fribourg, Switzerland) ; Touj, Sameh Masmoudi (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia) ; Hennebert, Jean (School of Engineering and Architecture (HEIA-FR), HES-SO // University of Applied Sciences Western Switzerland) ; Ingold, Rolf (DIVA Group, University of Fribourg, Switzerland) ; Essoukri Ben Amara, Najoua (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia)

This study presents a novel approach for Arabic video text recognition based on recurrent neural networks. Texts embedded in videos are a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task owing to challenges such as the variability of text patterns and the complexity of backgrounds. In the case of Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra/inter-word distances introduce additional challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory network coupled with a connectionist temporal classification layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors’ system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols. The results obtained also outperform current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both character and line levels.


Article Type:
scientific
Faculty:
Engineering and Architecture
School:
HEIA-FR
Institute:
iCoSys - Institute of Complex Systems
Subject(s):
Engineering
Date:
2018-08
Pagination:
10 p.
Published in:
IET Computer Vision
Numeration (vol. no.):
2018, vol. 12, no. 5, pp. 710-719
DOI:
ISSN:
1751-9632

Note: The status of this file is: restricted


 Record created 2019-02-26, last modified 2019-03-05
