Segmentation-free speech text recognition for comic books - La Rochelle Université
Conference paper, Year: 2017

Segmentation-free speech text recognition for comic books

Reconnaissance du texte de bandes dessinées sans segmentation

Abstract

Speech text in comic books is written in a particular manner by the scriptwriter, which raises unusual challenges for text recognition. We first detail these challenges and present different approaches to solve them. We compare the performance of a pre-trained OCR and a segmentation-free approach on speech text of comic books written in Latin script. We demonstrate that a few good-quality pre-trained OCR output samples, combined with other unlabeled data in the same writing style, can feed a segmentation-free OCR and improve text recognition. This is made possible by a lexicality measure that automatically accepts or rejects the pre-trained OCR output as pseudo ground truth for subsequent segmentation-free OCR training and recognition.
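
The accept/reject step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the measure, so everything below is an assumption made for illustration: lexicality is taken to be the fraction of tokens found in a word list, and the names `LEXICON`, `lexicality`, `select_pseudo_ground_truth` and the `0.8` threshold are hypothetical, not the authors' implementation.

```python
import re

# Hypothetical tiny lexicon; a real system would load a full word list.
LEXICON = {"the", "cat", "is", "on", "roof", "help", "me"}


def lexicality(text, lexicon=LEXICON):
    """Fraction of alphabetic tokens in `text` that appear in `lexicon`.

    Assumed definition for illustration only; the paper's exact
    measure is not given in the abstract.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in lexicon for t in tokens) / len(tokens)


def select_pseudo_ground_truth(ocr_outputs, threshold=0.8, lexicon=LEXICON):
    """Keep (image_id, text) pairs whose lexicality reaches `threshold`.

    The kept pairs would serve as pseudo ground truth for training a
    segmentation-free recognizer; the rest stay unlabeled.
    """
    return [(i, t) for i, t in ocr_outputs if lexicality(t, lexicon) >= threshold]


# Hypothetical pre-trained OCR outputs for three speech balloons.
ocr_outputs = [
    ("balloon_01", "THE CAT IS ON THE ROOF"),
    ("balloon_02", "TH3 C4T 1S 0N TKE R00F"),
    ("balloon_03", "HELP ME"),
]
accepted = select_pseudo_ground_truth(ocr_outputs)
```

Under these assumptions, transcriptions made mostly of dictionary words are kept and noisy ones are rejected; the threshold trades the amount of pseudo ground truth against its quality.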
Main file
segmentation-free-speech.pdf (261.81 KB)

Dates and versions

hal-01719619, version 1 (02-03-2018)

Identifiers

Cite

Christophe Rigaud, Jean-Christophe Burie, Jean-Marc Ogier. Segmentation-free speech text recognition for comic books. 2nd International Workshop on coMics Analysis, Processing, and Understanding (MANPU), Nov 2017, Kyoto, Japan. ⟨10.1109/ICDAR.2017.288⟩. ⟨hal-01719619⟩

Collections

L3I UNIV-ROCHELLE
64 views
398 downloads
