Saidane, Zohra (2008) Reconnaissance de texte dans les images et les vidéos en utilisant les réseaux de neurones à convolutions. Doctorate, Signal and Image, TSI, ENST p.177.
Abstract
Thanks to increasingly powerful storage media, multimedia resources have become essential in fields such as information and broadcasting (news agencies, INA), culture (museums), transport (monitoring), environment (satellite images), and medical imaging (medical records in hospitals).
The challenge, then, is to find relevant information quickly, and multimedia research is increasingly focused on indexing and retrieval techniques. Text appearing within images and videos can serve as a useful key for this task.
Recognizing text in images and videos raises many challenges: poor resolution, characters of different sizes, compression artifacts, anti-aliasing effects, and complex, highly variable backgrounds.
Text recognition involves four steps: (1) detecting the presence of text, (2) localizing the text, (3) extracting and enhancing the text area, and finally (4) recognizing the content of the text.
This work focuses on the last step and assumes that the text box has been correctly detected, localized, and extracted. The recognition module can itself be divided into several sub-modules, such as a binarization module, a text segmentation module, and a character recognition module.
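As a rough illustration of how these three sub-modules can be chained, the following Python sketch binarizes a text-line image, splits it into character spans, and passes each span to a recognizer. It is only a sketch under stated assumptions: the fixed global threshold and the column-projection segmentation are generic textbook stand-ins, not the methods proposed in the thesis, and `classify_char` is a hypothetical character recognizer supplied by the caller.

```python
# Illustrative sketch only: global thresholding and column-projection
# segmentation are generic stand-ins, not the methods proposed in the thesis.

def binarize(image, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1; 1 marks dark (text) pixels."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def segment_columns(binary):
    """Split a binary text line into character spans at empty columns.

    Returns a list of (start, end) column indices, end exclusive.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Column profile: number of text pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                  # a character begins
        elif count == 0 and start is not None:
            spans.append((start, x))   # a character ends at a blank column
            start = None
    if start is not None:
        spans.append((start, width))   # character touching the right edge
    return spans


def recognize_text(image, classify_char):
    """Chain the three sub-modules; classify_char is a hypothetical recognizer."""
    binary = binarize(image)
    chars = []
    for start, end in segment_columns(binary):
        glyph = [row[start:end] for row in binary]  # crop one character
        chars.append(classify_char(glyph))
    return "".join(chars)
```

In a real system each stage would be replaced by a far more robust method; the point here is only the data flow from grayscale image to binary image to character crops to a string.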
We focused on a particular machine learning algorithm, the convolutional neural network (CNN). CNNs are networks of neurons whose topology resembles that of the mammalian visual cortex. They were initially used to recognize handwritten digits and were then applied successfully to many pattern recognition problems.
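To make the idea of a convolutional layer concrete, here is a minimal pure-Python sketch of one common CNN building block: a small kernel slid over the image (shared weights, local receptive fields), a tanh nonlinearity, and 2x2 average subsampling. The function names, the kernel, and the choice of tanh and average subsampling are assumptions made for illustration; this is not the architecture studied in the thesis.

```python
import math


def convolve2d_valid(image, kernel):
    """'Valid' 2D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # The same kernel weights are reused at every position (x, y).
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def subsample2x2(feature_map):
    """Average each non-overlapping 2x2 block (an odd trailing row/column is dropped)."""
    h, w = len(feature_map) // 2, len(feature_map[0]) // 2
    return [
        [
            (feature_map[2 * y][2 * x] + feature_map[2 * y][2 * x + 1]
             + feature_map[2 * y + 1][2 * x] + feature_map[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def feature_layer(image, kernel, bias=0.0):
    """Convolution, then tanh activation, then 2x2 average subsampling."""
    conv = convolve2d_valid(image, kernel)
    activated = [[math.tanh(v + bias) for v in row] for row in conv]
    return subsample2x2(activated)
```

For example, applying `feature_layer` with the kernel `[[-1, 1], [-1, 1]]` to an image containing a vertical dark-to-light edge yields a feature map that responds only around the edge. In a trained CNN the kernel values are learned rather than hand-picked.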
This thesis proposes a new binarization method for text images, a new segmentation method for text images, a study of a convolutional neural network for character recognition in images, a discussion of the relevance of the binarization step when text in images is recognized with machine learning methods, and a new method of text recognition in images based on graph theory.
| EPrint type: | Thesis (Doctorate) |
|---|---|
| Thesis supervisors: | Garcia, Christophe and Dugelay, Jean Luc |
| Date: | 16 December 2008 |
| Thesis jury: | Ghorbel, Faouzi and Lezoray, Olivier and Maruani, Alain |
| Doctoral school: | ED 130 INFORMATIQUE, TELECOMMUNICATIONS ET ELECTRONIQUE (EDITE) |
| Discipline: | Signal and Image |
| Collection: | TELECOM ParisTech (ENST) |
| Institution: | ENST |
| Laboratory: | TSI |
| Subjects: | 2. Information and communication sciences and technologies |
| Free keywords: | Text Recognition, Neural Networks |
| ID code: | 4685 |
| Deposited by: | Zohra Saidane |
| Deposited on: | 22 June 2009 |
Table of Contents
French Summary
1 Introduction 1
2 State of the art 16
3 Convolutional Neural Networks 33
4 Video Text Binarization 47
5 Video Text Segmentation 66
6 Character Recognition in Text Images 86
7 Discussion on the binarization step 103
8 The iTRG - image Text Recognition Graph 112
9 Conclusion and Perspectives