
Design and implementation of real-time vision algorithms for intelligent video surveillance


Ghorayeb, Hicham (2007) Conception et mise en oeuvre d'algorithmes de vision temps-réel pour la vidéo surveillance intelligente. Doctoral thesis, Real-time computing, robotics and automatic control, CAOR - Centre de robotique, ENSMP, p. 197.

Full text available as:

- GhorayebThesis.pdf (11380 Kb)
Licence: Copyright

Abstract

Our goal is to study the vision algorithms used at the different levels of an intelligent video processing chain. We prototyped a generic processing chain dedicated to analysing the content of a video stream. Building on this chain, we developed a pedestrian detection and tracking application, which is an integral part of the PUVAME project.

This generic processing chain comprises several stages: object detection, classification and tracking. Higher-level stages are also envisaged, such as action recognition, identification, semantic description and the fusion of data from several cameras. We focused on the first two stages. We explored background segmentation algorithms for video streams from a fixed camera, and we implemented and compared algorithms based on adaptive background modelling.
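As an illustration of adaptive background modelling with a fixed camera, here is a minimal sketch of the simplest variant: an exponential running average per pixel followed by a threshold on the difference. It is not the thesis's implementation; the function names, the learning rate `alpha` and the `threshold` value are assumptions chosen for the example, and frames are plain lists of grey-level rows.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Flag pixels that differ from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def segment(frames, alpha=0.05, threshold=30.0):
    """Yield one foreground mask per frame, adapting the background as it goes."""
    # Assumption: the first frame is a reasonable initial background.
    background = [[float(p) for p in row] for row in frames[0]]
    for frame in frames:
        mask = foreground_mask(background, frame, threshold)
        background = update_background(background, frame, alpha)
        yield mask


# Usage: a static 3x3 scene in which a bright object appears in the third frame.
frames = [[[10] * 3 for _ in range(3)] for _ in range(3)]
frames[2][1][1] = 200
masks = list(segment(frames))
```

A small `alpha` makes the model adapt slowly, so a briefly stationary object is not absorbed into the background; richer models such as a per-pixel Gaussian or a mixture of Gaussians replace the single average with a distribution.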

We also explored visual object detection based on machine learning, using the boosting technique. To that end, we developed a library called LibAdaBoost, which serves as a prototyping environment for machine learning algorithms, and we prototyped the boosting technique within it. LibAdaBoost is distributed under the LGPL licence. The library is distinctive in the set of features it offers.
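To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost with threshold "stumps" as weak classifiers. It does not use or reproduce the LibAdaBoost API; every name here (`train_stump`, `adaboost`, `predict`) is hypothetical, and the exhaustive stump search is only practical for toy data.

```python
import math


def stump_predict(stump, x):
    """A stump answers +polarity when feature j is at or above the threshold."""
    j, thr, polarity = stump
    return polarity if x[j] >= thr else -polarity


def train_stump(samples, labels, weights):
    """Weak learner: exhaustive search for the stump with lowest weighted error."""
    best = None
    for j in range(len(samples[0])):
        for thr in sorted({x[j] for x in samples}):
            for polarity in (1, -1):
                stump = (j, thr, polarity)
                err = sum(
                    w for x, y, w in zip(samples, labels, weights)
                    if stump_predict(stump, x) != y
                )
                if best is None or err < best[0]:
                    best = (err, stump)
    return best


def adaboost(samples, labels, rounds=10):
    """Discrete AdaBoost; labels must be +1 or -1. Returns [(alpha, stump), ...]."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(samples, labels, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified samples gain weight, correct ones lose weight.
        weights = [
            w * math.exp(-alpha * y * stump_predict(stump, x))
            for x, y, w in zip(samples, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Usage: no single threshold separates these labels, but three stumps do.
samples = [[1], [2], [3], [4], [5], [6]]
labels = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(samples, labels, rounds=3)
predictions = [predict(ensemble, x) for x in samples]
```

In a visual detector the feature vector would hold image features rather than raw numbers, and the weak learner would search a much larger feature space, but the reweighting loop stays the same.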

We explored the use of graphics cards to accelerate vision algorithms. We ported the visual object detector, which is based on a classifier generated by boosting, so that it runs on the graphics processor (GPU). We were the first to carry out this port. We found the GPU architecture to be the best suited to this kind of algorithm.
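One plausible reason a boosted detector suits a GPU, offered here as an assumption rather than as the thesis's own argument, is that each candidate window is classified independently of every other window. The sketch below shows that structure on the CPU with a summed-area table and a toy two-stage cascade; the feature definitions, stage format and function names are all hypothetical, and on a GPU each call to `classify_window` would typically be handled by its own thread.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def feature(ii, x, y, size, kind):
    """Two-rectangle Haar-like feature over a size-by-size window."""
    half = size // 2
    if kind == "left_minus_right":
        return box_sum(ii, x, y, half, size) - box_sum(ii, x + half, y, half, size)
    if kind == "top_minus_bottom":
        return box_sum(ii, x, y, size, half) - box_sum(ii, x, y + half, size, half)
    raise ValueError(kind)


def classify_window(ii, x, y, size, stages):
    """Cascade: reject at the first stage whose feature falls below its threshold."""
    for kind, threshold in stages:
        if feature(ii, x, y, size, kind) < threshold:
            return False
    return True


def detect(img, size, stages):
    """Scan every window position; no window depends on another's result."""
    ii = integral_image(img)
    h, w = len(img), len(img[0])
    return [
        (x, y)
        for y in range(h - size + 1)
        for x in range(w - size + 1)
        if classify_window(ii, x, y, size, stages)
    ]


# Usage: a 4x4 image whose left half is bright, scanned with a 4x4 window.
img = [[9, 9, 0, 0] for _ in range(4)]
stages = [("left_minus_right", 50), ("top_minus_bottom", -1)]
hits = detect(img, 4, stages)
```

The early exit in `classify_window` is what makes a cascade cheap on a CPU; on a GPU it also means neighbouring threads can finish at different stages, which is one of the trade-offs such a port has to handle.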

The processing chain was implemented and integrated into the RTMaps environment. We evaluated these algorithms on well-defined scenarios, which were specified within the PUVAME project.

EPrint type: Thesis (Doctorate)
Thesis advisor: Laurgeau, Claude
Date: 12 September 2007
Thesis committee: Meyrueis, Patrick; Siarry, Patrick; Akil, Mohamed; Steux, Bruno; Laurgeau, Claude; Meyer, Fernand; Schwerdt, Karl
Discipline: Real-time computing, robotics and automatic control
Collection: ENSMP
Institution: ENSMP
Laboratory: CAOR - Centre de robotique
Subjects: 1. Mathematics and its applications
Free keywords: Video surveillance, Boosting, Automatic pattern recognition, Intelligent transportation system, Machine learning, Moving object detection, Monte Carlo method
ID code: 3064
Deposited by: Claudine Abauzit
Deposited on: 05 November 2007


Table of Contents

I Introduction and state of the art
1 French Introduction
1.1 Algorithms
1.2 Architecture
1.3 Application
2 Introduction
2.1 Contributions
2.2 Outline
3 Intelligent video surveillance systems (IVSS)
3.1 What is IVSS?
3.1.1 Introduction
3.1.2 Historical background
3.2 Applications of intelligent surveillance
3.2.1 Real time alarms
3.2.2 User defined alarms
3.2.3 Automatic unusual activity alarms
3.2.4 Automatic Forensic Video Retrieval (AFVR)
3.2.5 Situation awareness
3.3 Scenarios and examples
3.3.1 Public and commercial security
3.3.2 Smart video data mining
3.3.3 Law enforcement
3.3.4 Military security
3.4 Challenges
3.4.1 Technical aspect
3.4.2 Performance evaluation
3.5 Choice of implementation methods
3.6 Conclusion

II Algorithms
4 Generic framework for intelligent visual surveillance
4.1 Object detection
4.2 Object classification
4.3 Object tracking
4.4 Action recognition
4.5 Semantic description
4.6 Personal identification
4.7 Fusion of data from multiple cameras
5 Moving object detection
5.1 Challenge of detection
5.2 Object detection system diagram
5.2.1 Foreground detection
5.2.2 Pixel level post-processing (Noise removal)
5.2.3 Detecting connected components
5.2.4 Region level post-processing
5.2.5 Extracting object features
5.3 Adaptive background differencing
5.3.1 Basic Background Subtraction (BBS)
5.3.2 W4 method
5.3.3 Single Gaussian Model (SGM)
5.3.4 Mixture Gaussian Model (MGM)
5.3.5 Lehigh Omni-directional Tracking System (LOTS)
5.4 Shadow and light change detection
5.4.1 Methods and implementation
5.5 High level feedback to improve detection methods
5.5.1 The modular approach
5.6 Performance evaluation
5.6.1 Ground truth generation
5.6.2 Datasets
5.6.3 Evaluation metrics
5.6.4 Experimental results
5.6.5 Comments on the results
6 Machine learning for visual object-detection
6.1 Introduction
6.2 The theory of boosting
6.2.1 Conventions and definitions
6.2.2 Boosting algorithms
6.2.3 AdaBoost
6.2.4 Weak classifier
6.2.5 Weak learner
6.3 Visual domain
6.3.1 Static detector
6.3.2 Dynamic detector
6.3.3 Weak classifiers
6.3.4 Genetic weak learner interface
6.3.5 Cascade of classifiers
6.3.6 Visual finder
6.4 LibAdaBoost: Library for Adaptive Boosting
6.4.1 Introduction
6.4.2 LibAdaBoost functional overview
6.4.3 LibAdaBoost architectural overview
6.4.4 LibAdaBoost content overview
6.4.5 Comparison to previous work
6.5 Use cases
6.5.1 Car detection
6.5.2 Face detection
6.5.3 People detection
6.6 Conclusion
7 Object tracking
7.1 Initialization
7.2 Sequential Monte Carlo tracking
7.3 State dynamics
7.4 Color distribution Model
7.5 Results
7.6 Incorporating Adaboost in the Tracker
7.6.1 Experimental results

III Architecture
8 General purpose computation on the GPU
8.1 Introduction
8.2 Why GPGPU?
8.2.1 Computational power
8.2.2 Data bandwidth
8.2.3 Cost/Performance ratio
8.3 GPGPU's first generation
8.3.1 Overview
8.3.2 Graphics pipeline
8.3.3 Programming language
8.3.4 Streaming model of computation
8.3.5 Programmable graphics processor abstractions
8.4 GPGPU's second generation
8.4.1 Programming model
8.4.2 Application programming interface (API)
8.5 Conclusion
9 Mapping algorithms to GPU
9.1 Introduction
9.2 Mapping visual object detection to GPU
9.3 Hardware constraints
9.4 Code generator
9.5 Performance analysis
9.5.1 Cascade Stages Face Detector (CSFD)
9.5.2 Single Layer Face Detector (SLFD)
9.6 Conclusion

IV Application
10 Application: PUVAME
10.1 Introduction
10.2 PUVAME overview
10.3 Accident analysis and scenarios
10.4 ParkNav platform
10.4.1 The ParkView platform
10.4.2 The CyCab vehicule
10.5 Architecture of the system
10.5.1 Interpretation of sensor data relative to the intersection
10.5.2 Interpretation of sensor data relative to the vehicule
10.5.3 Collision Risk Estimation
10.5.4 Warning interface
10.6 Experimental results

V Conclusion and future work
11 Conclusion
11.1 Overview
11.2 Future work
12 French conclusion

VI Appendices
A Hello World GPGPU
B Hello World Brook
C Hello World CUDA

Reproduction and distribution rights reserved © ParisTech 2007