Expectation Maximization Using Relative Emission

Appendix II

II.2 Expectation Maximization Using Relative Emission

where the $e$ in the superscript of the $\alpha$, $\beta$, $\gamma$, and $\xi$ variables denotes a particular sequence ($1 \le e \le E$), $O^e$ is the $e$th observation sequence, and $T^e$ is its length.


The re-estimation procedure of the HMM $\lambda$, using one observation sequence $O$, is

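Moreover, the transition probability re-estimation can be computed directly from the forward and backward variables:

$$\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{\sum_{t=1}^{T-1} \alpha_t(i)\, \beta_t(i)}. \qquad \text{(II.11)}$$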
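A minimal sketch of this direct computation for a toy discrete-emission HMM, assuming the standard single-sequence form from Rabiner's tutorial [48]: the sum over $t = 1, \dots, T-1$ of $\alpha_t(i) a_{ij} b_j(O_{t+1}) \beta_{t+1}(j)$, divided by the sum of $\alpha_t(i) \beta_t(i)$. The function names and all model values are hypothetical and serve only as an illustration.

```python
def forward(pi, A, B, obs):
    """Unscaled forward variables alpha[t][i] for a discrete-emission HMM."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha


def backward(A, B, obs):
    """Unscaled backward variables beta[t][i], with beta[T-1][i] = 1."""
    N = len(A)
    beta = [[1.0] * N]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta


def reestimate_transitions(pi, A, B, obs):
    """Transition re-estimate computed directly from alpha and beta."""
    N, T = len(pi), len(obs)
    alpha, beta = forward(pi, A, B, obs), backward(A, B, obs)
    new_A = []
    for i in range(N):
        denom = sum(alpha[t][i] * beta[t][i] for t in range(T - 1))
        new_A.append([
            sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for t in range(T - 1)) / denom
            for j in range(N)])
    return new_A


# Hypothetical two-state model with three discrete symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 2, 1]
new_A = reestimate_transitions(pi, A, B, obs)
```

Each row of `new_A` sums to one, because summing the numerator over $j$ reproduces the backward recursion and therefore the denominator.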


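In Sec. 5.4.2.3 we defined the relative emission as follows:

$$\tilde{b}_i(O_t) = \prod_{k=1}^{K_t} \frac{b_i(O_{t,k})}{\sum_{j=1}^{N} b_j(O_{t,k})},$$

and denote the scaling coefficients as $e_{t,k} = \left[ \sum_{j=1}^{N} b_j(O_{t,k}) \right]^{-1}$. Denoting $E_t = \prod_{k=1}^{K_t} e_{t,k}$ we get

$$\tilde{b}_i(O_t) = E_t\, b_i(O_t).$$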
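The relation between the relative emission and the original emission can be checked numerically. The sketch below assumes that the emission of the observation set at time $t$ factorizes as $b_i(O_t) = \prod_k b_i(O_{t,k})$; the function name and the likelihood values are hypothetical.

```python
import math


def relative_emission(b_tk):
    """Relative emission for one time step.

    b_tk[i][k] holds b_i(O_{t,k}) for state i and the k-th observation
    at time t. Returns (b_tilde, E_t), where E_t is the product of the
    per-observation scaling coefficients e_{t,k}.
    """
    N, K = len(b_tk), len(b_tk[0])
    # e_{t,k}: inverse of the emission sum over all states.
    e = [1.0 / sum(b_tk[j][k] for j in range(N)) for k in range(K)]
    b_tilde = [math.prod(e[k] * b_tk[i][k] for k in range(K))
               for i in range(N)]
    return b_tilde, math.prod(e)


# Hypothetical likelihoods: 2 states, 3 observations at this time step.
b_t = [[2e-5, 1e-6, 3e-5],
       [6e-5, 3e-6, 1e-5]]
b_tilde, E_t = relative_emission(b_t)
```

Although the raw products are of order $10^{-16}$, the relative emissions stay in a comfortable numeric range, which is the practical motivation for the scaling.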

Now we have to find the relation between the original forward and backward variables and their scaled counterparts $\tilde{\alpha}_t(i)$ and $\tilde{\beta}_t(i)$, in order to modify the Baum-Welch procedure of Eq. II.6 if necessary. First, at time $t = 1$ we can write:

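$$\tilde{\alpha}_1(i) = \pi_i \tilde{b}_i(O_1) = \pi_i E_1 b_i(O_1) = E_1 \alpha_1(i). \qquad \text{(II.14)}$$

In the next step, $t = 2$, using the recursive formula of Eq. 2.27 we get:

$$\tilde{\alpha}_2(j) = \left[ \sum_{i=1}^{N} \tilde{\alpha}_1(i)\, a_{ij} \right] \tilde{b}_j(O_2) = E_1 E_2 \left[ \sum_{i=1}^{N} \alpha_1(i)\, a_{ij} \right] b_j(O_2) = E_1 E_2\, \alpha_2(j).$$

Thus, finally, by induction we get the rule:

$$\tilde{\alpha}_t(i) = \left[ \prod_{\tau=1}^{t} E_\tau \right] \alpha_t(i).$$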
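The induction result for the forward variables, namely that the scaled variable at time $t$ equals the original one multiplied by the product of $E_1, \dots, E_t$, can be verified numerically. The sketch below uses hypothetical model values and arbitrary positive $E_t$ values, since the relation holds for any per-time-step scaling of the emissions.

```python
import math


def forward(pi, A, b):
    """Forward variables for per-step emission likelihoods b[t][i] = b_i(O_t)."""
    N = len(pi)
    alpha = [[pi[i] * b[0][i] for i in range(N)]]
    for t in range(1, len(b)):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * b[t][j]
                      for j in range(N)])
    return alpha


# Hypothetical two-state model and emission likelihoods for T = 4 steps.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.02, 0.05], [0.01, 0.03], [0.04, 0.01], [0.02, 0.02]]
E = [20.0, 30.0, 25.0, 40.0]  # illustrative E_t values

# Relative emissions: b~_i(O_t) = E_t * b_i(O_t).
b_tilde = [[E[t] * x for x in row] for t, row in enumerate(b)]

alpha = forward(pi, A, b)
alpha_tilde = forward(pi, A, b_tilde)

# Cumulative products E_1 * ... * E_t.
cum_E = [math.prod(E[:t + 1]) for t in range(len(E))]
```

Comparing `alpha_tilde[t][i]` with `cum_E[t] * alpha[t][i]` for every `t` and `i` confirms the rule up to floating-point error.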


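The direct re-estimation of the transition probabilities in Eq. II.11 is

$$\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \tilde{\alpha}_t(i)\, a_{ij}\, \tilde{b}_j(O_{t+1})\, \tilde{\beta}_{t+1}(j)}{\sum_{t=1}^{T-1} \tilde{\alpha}_t(i)\, \tilde{\beta}_t(i)}.$$

With the analogous backward relation $\tilde{\beta}_t(i) = \left[ \prod_{\tau=t+1}^{T} E_\tau \right] \beta_t(i)$, every term of the numerator and of the denominator carries the factor $\prod_{\tau=1}^{T} E_\tau$. Since this factor is independent of $t$, we can cancel it out from both the numerator and the denominator. By the above results we have shown that the Baum-Welch re-estimation algorithm of Eq. II.6 can be used with the new relative emissions.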

The procedure introduced above can be incorporated into the sequence scaling method of [48] (introduced in Sec. 2.4.5), where each $\alpha_t(i)$ was scaled by the sum of $\alpha_t(i)$ over all states. First we calculate the relative emissions, then we apply sequence scaling. The final scaled forward and backward variables will be denoted by $\hat{\alpha}_t(i)$ and $\hat{\beta}_t(i)$, and can be calculated as follows. The forward procedure starts from $t = 1$, where we can write:

$$\alpha^0_1(i) = \pi_i \tilde{b}_i(O_1) = \pi_i E_1 b_i(O_1) = E_1 \alpha_1(i), \qquad \text{(II.25)}$$


The backward variables are scaled by the same $\hat{c}_t$ coefficients. In the first step, $t = T$:

$$\beta^0_T(i) = 1, \qquad \text{(II.30)}$$


In the direct re-estimation of the transition probabilities in Eq. II.11, the accumulated scaling terms are independent of $t$, so we can cancel them out from both the numerator and the denominator.
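The cancellation can be illustrated with a short numeric sketch that compares the transition re-estimate obtained from the unscaled variables with the one obtained from the scaled variables. It rests on several assumptions that are not taken from the text: one observation per time step ($K_t = 1$), a Rabiner-style backward scaling with $\hat{\beta}_T(i) = \hat{c}_T$, and a denominator computed as the row sum of the numerators. All names and values are hypothetical.

```python
def forward(pi, A, b):
    """Unscaled forward variables for emissions b[t][i] = b_i(O_t)."""
    N = len(pi)
    alpha = [[pi[i] * b[0][i] for i in range(N)]]
    for t in range(1, len(b)):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * b[t][j]
                      for j in range(N)])
    return alpha


def backward(A, b):
    """Unscaled backward variables with beta[T-1][i] = 1."""
    N, T = len(A), len(b)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return beta


def scaled_forward_backward(pi, A, b_tilde):
    """Sequence scaling applied on top of relative emissions b_tilde[t][i]."""
    N, T = len(pi), len(b_tilde)
    alpha_hat, c_hat = [], []
    for t in range(T):
        if t == 0:
            raw = [pi[i] * b_tilde[0][i] for i in range(N)]
        else:
            raw = [sum(alpha_hat[-1][i] * A[i][j] for i in range(N)) * b_tilde[t][j]
                   for j in range(N)]
        c = 1.0 / sum(raw)  # c^_t normalises the forward variables
        c_hat.append(c)
        alpha_hat.append([c * x for x in raw])
    beta_hat = [None] * T
    beta_hat[T - 1] = [c_hat[T - 1]] * N  # assumed convention
    for t in range(T - 2, -1, -1):
        beta_hat[t] = [c_hat[t] * sum(A[i][j] * b_tilde[t + 1][j] * beta_hat[t + 1][j]
                                       for j in range(N)) for i in range(N)]
    return alpha_hat, beta_hat, c_hat


def transitions_from(alpha, beta, A, b):
    """Sum alpha_t(i) a_ij b_j(O_{t+1}) beta_{t+1}(j) over t, then normalise rows."""
    N, T = len(A), len(b)
    num = [[sum(alpha[t][i] * A[i][j] * b[t + 1][j] * beta[t + 1][j]
                for t in range(T - 1)) for j in range(N)] for i in range(N)]
    return [[x / sum(row) for x in row] for row in num]


# Hypothetical model; relative emissions use E_t = 1 / sum_i b_i(O_t).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.02, 0.05], [0.01, 0.03], [0.04, 0.01], [0.02, 0.02]]
b_tilde = [[x / sum(row) for x in row] for row in b]

A_plain = transitions_from(forward(pi, A, b), backward(A, b), A, b)
alpha_hat, beta_hat, c_hat = scaled_forward_backward(pi, A, b_tilde)
A_scaled = transitions_from(alpha_hat, beta_hat, A, b_tilde)
```

Under these assumptions every numerator term of the scaled computation differs from its unscaled counterpart by the same constant, so `A_plain` and `A_scaled` agree up to floating-point error.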

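For the computation of the log-likelihood function in Eq. II.12, used to evaluate the convergence of the re-estimation process, we can use $\hat{\alpha}_T(i) = \left[ \prod_{\tau=1}^{T} \hat{c}_\tau E_\tau \right] \alpha_T(i)$, and since the $\hat{\alpha}_T(i)$ are normalized their sum is equal to 1, hence

$$P(O|\lambda) = \frac{1}{\prod_{\tau=1}^{T} \hat{c}_\tau E_\tau},$$

or, taking the logarithm and using $E_t = \prod_{k=1}^{K_t} e_{t,k}$:

$$\log P(O|\lambda) = -\sum_{\tau=1}^{T} \log(\hat{c}_\tau) - \sum_{\tau=1}^{T} \sum_{k=1}^{K_\tau} \log(e_{\tau,k}). \qquad \text{(II.43)}$$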
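The log-likelihood formula of Eq. II.43 can be sketched as follows and compared against a direct evaluation from the unscaled forward variables. The sketch assumes discrete symbols, a factorized emission $b_i(O_t) = \prod_k b_i(O_{t,k})$, and hypothetical model values; the function names are illustrative only.

```python
import math


def log_likelihood_relative(pi, A, B, obs):
    """log P(O|lambda) from the scaling coefficients c^_t and e_{t,k}.

    obs[t] is the list of the K_t discrete symbols observed at time t.
    """
    N = len(pi)
    log_p = 0.0
    alpha_hat = None
    for t, symbols in enumerate(obs):
        # e_{t,k}: inverse emission sum over the states for each symbol.
        e = [1.0 / sum(B[j][s] for j in range(N)) for s in symbols]
        b_tilde = [math.prod(e[k] * B[i][s] for k, s in enumerate(symbols))
                   for i in range(N)]
        if t == 0:
            raw = [pi[i] * b_tilde[i] for i in range(N)]
        else:
            raw = [sum(alpha_hat[i] * A[i][j] for i in range(N)) * b_tilde[j]
                   for j in range(N)]
        c_hat = 1.0 / sum(raw)
        alpha_hat = [c_hat * x for x in raw]
        # Accumulate -log(c^_t) - sum_k log(e_{t,k}).
        log_p -= math.log(c_hat) + sum(math.log(x) for x in e)
    return log_p


def log_likelihood_direct(pi, A, B, obs):
    """Reference value: log of the summed unscaled forward variables at t = T."""
    N = len(pi)
    alpha = None
    for t, symbols in enumerate(obs):
        b = [math.prod(B[i][s] for s in symbols) for i in range(N)]
        if t == 0:
            alpha = [pi[i] * b[i] for i in range(N)]
        else:
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * b[j]
                     for j in range(N)]
    return math.log(sum(alpha))


# Hypothetical model; the number of observations K_t varies per time step.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [[0, 1], [2], [1, 1, 2], [0, 2]]

lp_rel = log_likelihood_relative(pi, A, B, obs)
lp_dir = log_likelihood_direct(pi, A, B, obs)
```

For long sequences the direct evaluation underflows, while the scaled version only accumulates logarithms of well-conditioned coefficients.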


Bibliography

[1] E. L. Andrade, S. Blunsden, and R. B. Fisher. Performance analysis of event detection models in crowded scenes. In Proceedings of The International Conference on Visual Information Engineering, pages 427-432, Bangalore, India, September 26-28 2006.

[2] L. E. Baum. An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes. In Inequalities III: Proceedings of the Third Symposium on Inequalities, pages 1-8, Los Angeles, CA, USA, 1972.

[3] L. E. Baum and J. A. Eagon. An inequality with applications to statistical estimation for probabilistic functions of a Markov process and to a model for ecology. Bulletin of the American Mathematical Society, 73(3):360-363, 1967.

[4] L. E. Baum and T. Petrie. Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics, 37(6):1554-1563, 1966.

[5] L. E. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41(1):164-171, 1970.

[6] L. E. Baum and L. R. Welch. A statistical estimation procedure for probabilistic functions of finite Markov processes. Submitted for publication to Proc. Nat. Acad. Sci. U.S.A.

[7] J. R. Bergen and R. Hingorani. Hierarchical motion-based frame rate conversion. Technical report, David Sarnoff Research Center, Princeton, NJ 08540, 1990.

[8] O. Boiman and M. Irani. Detecting irregularities in images and in video. International Journal of Computer Vision, 74(1):17-31, 2007.

[9] J.-Y. Bouguet. Pyramidal implementation of the Lucas-Kanade feature tracker. Technical report, Intel Corp., Microprocessor Research Labs, 2000.

[10] T. Bouwmans, F. El Baf, and B. Vachon. Background modeling using mixture of Gaussians for foreground detection - a survey. Recent Patents on Computer Science, 1(3):219-237, 2008.

[11] M. Brand and V. Kettnaker. Discovery and segmentation of activities in video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):844-851, 2000.

[12] Y. Chang, D. J. Lee, Y. Hong, and J. Archibald. Unsupervised video shot detection using clustering ensemble with a color global scale-invariant feature transform descriptor. Journal on Image and Video Processing, 8(1):1-10, 2008.


[13] D. Comaniciu, V. Ramesh, and P. Meer. Real-time tracking of non-rigid objects using mean shift. In Proceedings of The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 142-149, Hilton Head, SC, USA, June 13-15 2000.

[14] C. Cotsaces, N. Nikolaidis, and I. Pitas. Video shot detection and condensed representation. A review. Signal Processing Magazine, IEEE, 23(2):28-37, 2006.

[15] Cs. Beleznai, B. Frühstück, and H. Bischof. Human tracking by mode seeking. In Proceedings of The 4th International Symposium on Image and Signal Processing and Analysis, pages 1-6, Zagreb, Croatia, September 15-17 2005.

[16] Cs. Benedek and T. Szirányi. Markovian framework for foreground-background-shadow separation of real world video scenes. In Proceedings of The 7th Asian Conference on Computer Vision, pages 898-907, Hyderabad, India, January 13-16 2006.

[17] Cs. Benedek and T. Szirányi. Bayesian foreground and shadow detection in uncertain frame rate surveillance videos. IEEE Transactions on Image Processing, 17(4):608-621, 2008.

[18] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati. Detecting moving objects, ghosts and shadows in video streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10):1337-1342, 2003.

[19] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1):1-38, 1977.

[20] A. R. Dick and M. J. Brooks. Issues in automated visual surveillance. In Proceedings of The 7th International Conference on Digital Image Computing: Techniques and Applications, pages 195-204, Sydney, Australia, December 10-12 2003.

[21] A. Elgammal, D. Harwood, and L. Davis. Non-parametric model for background subtraction. In Proceedings of IEEE International Conference on Computer Vision, Frame-rate Workshop, pages 751-767, Kerkyra, Greece, September 20-25 2000.

[22] J. D. Ferguson. Variable duration models for speech. In Proceedings of The Symposium on the Application of HMMs to Text and Speech, pages 143-179, Princeton, NJ, USA, 1980.

[23] G. D. Forney. The Viterbi algorithm. Proceedings of the IEEE, 61(3):268-278, 1973.

[24] N. Friedman and S. Russell. Image segmentation in video sequences: A probabilistic approach. In Proceedings of The 13th Conference on Uncertainty in Artificial Intelligence, pages 175-181, Rhode Island, USA, August 1-3 1997.

[25] B. Georgescu, I. Shimshoni, and P. Meer. Mean Shift based clustering in high dimensions: A texture classification example. In Proceedings of The 9th IEEE International Conference on Computer Vision, pages 456-463, Nice, France, October 14-17 2003.

[26] K. J. Han and A. H. Tewfik. Eigen-image based video segmentation and indexing. In Proceedings of The International Conference on Image Processing, pages 538-541, Washington, DC, USA, October 26-29 1997.

[27] S. Hongeng, R. Nevatia, and F. Bremond. Video-based event recognition: Activity representation and probabilistic recognition methods. Computer Vision and Image Understanding, 96(2):129-162, 2004.


[28] J.-S. Hu and T.-M. Su. Robust background subtraction with shadow and highlight removal for indoor surveillance. EURASIP Journal on Applied Signal Processing, 2007(1):108-108, 2007.

[29] W. Hu, T. Tan, L. Wang, and S. Maybank. A survey on visual surveillance of object motion and behaviors. IEEE Transactions on Systems, Man and Cybernetics, 34:334-352, 2004.

[30] X. Huang, H. Ma, and H. Yuan. A hidden Markov model approach to parsing MTV video shot. In Proceedings of The Congress on Image and Signal Processing, pages 276-280, Sanya, Hainan, China, May 27-30 2008.

[31] V. Jain, B. B. Kimia, and J. L. Mundy. Background modeling based on subpixel edges. In Proceedings of The International Conference on Image Processing, pages 321-324, San Antonio, TX, USA, September 16-19 2007.

[32] F. Jiang, Y. Wu, and A. K. Katsaggelos. Abnormal event detection from surveillance video by dynamic hierarchical clustering. In Proceedings of The International Conference on Image Processing, pages 145-148, San Antonio, TX, USA, September 16-19 2007.

[33] M. T. Johnson. Capacity and complexity of HMM duration modeling techniques. Signal Processing Letters, IEEE, 12(5):407-410, 2005.

[34] R. E. Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering, 82(Series D):35-45, 1960.

[35] Y. Kameda and M. Minoh. A human motion estimation method using 3-successive video frames. In Proceedings of The 2nd International Conference on Virtual Systems and Multi-Media, pages 135-140, Gifu, Japan, September 18-20 1996.

[36] K. P. Karmann and A. von Brandt. Moving object recognition using an adaptive background memory. In Proceedings of The 3rd International Workshop on Time-Varying Image Processing and Moving Object Recognition, Amsterdam, The Netherlands, 1990.

[37] D. Koller, J. Weber, and J. Malik. Robust multiple car tracking with occlusion reasoning. Technical report, EECS Department, University of California, Berkeley, 1993.

[38] S. E. Levinson. Continuously variable duration hidden Markov models for automatic speech recognition. Computer Speech and Language, 1(1):29-45, 1986.

[39] N. J. B. McFarlane and C. P. Schofield. Segmentation and tracking of piglets in images. Machine Vision and Applications, 8(3):187-193, 1995.

[40] C. D. Mitchell and L. H. Jamieson. Modeling duration in a hidden Markov model with the exponential family. In IEEE Proceedings of International Conference on Acoustics, Speech, and Signal Processing, volume 2, pages 331-334, Minneapolis, MN, USA, April 27-30 1993.

[41] T. B. Moeslund and E. Granum. A survey of advances in vision-based human motion capture. Computer Vision and Image Understanding, 81(3):231-268, 2001.

[42] T. B. Moeslund, A. Hilton, and V. Krüger. A survey of advances in vision-based human motion capture and analysis. Computer Vision and Image Understanding, 104(2):90-126, 2006.

[43] V. Nair and J. J. Clark. Automated visual surveillance using hidden Markov models. In Proceedings of The 15th International Conference on Vision Interface, pages 88-94, Calgary, Canada, May 27-29 2002.


[44] Notesco Oy, 2010. http://www.notesco.net/download/markkinakatsaus.pdf.

[45] OpenCV. Open Source Computer Vision library. http://opencv.willowgarage.com.

[46] I. Petrás, Cs. Beleznai, Y. Dedeoğlu, M. Pardàs, L. Kovács, Z. Szlávik, L. Havasi, T. Szirányi, B. U. Töreyin, U. Güdükbay, A. E. Çetin, and C. Canton-Ferrer. Flexible test-bed for unusual behavior detection. In Proceedings of The 6th ACM International Conference on Image and Video Retrieval, pages 105-108, Amsterdam, The Netherlands, July 9-11 2007.

[47] F. Porikli. Human body tracking by adaptive background models and mean-shift analysis. In Proceedings of The IEEE International Workshop on Performance Evaluation of Tracking and Surveillance, Graz, Austria, March 31 2003.

[48] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286, 1989.

[49] P. Remagnino, A. Baumberg, T. Grove, D. Hogg, T. Tan, A. Worrall, and K. Baker. An integrated traffic and pedestrian model-based vision system. In Proceedings of The 8th British Machine Vision Conference, pages 380-389, Colchester, U.K., September 8-11 1997.

[50] M. J. Russell and R. K. Moore. Explicit modelling of state occupancy in hidden Markov models for automatic speech recognition. In IEEE Proceedings of The International Conference on Acoustics, Speech, and Signal Processing, volume 10, pages 5-8, Tampa, FL, USA, March 26-29 1985.

[51] C. Stauffer and W. E. L. Grimson. Adaptive background mixture models for real-time tracking. In Proceedings of The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pages 246-252, Fort Collins, CO, USA, June 23-25 1999.

[52] C. Stauffer and W. E. L. Grimson. Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):747-757, 2000.

[53] Z. Szlávik, L. Kovács, L. Havasi, Cs. Benedek, I. Petrás, Á. Utasi, A. Licsár, L. Czúni, and T. Szirányi. Behavior and event detection for annotation and surveillance. In Proceedings of the 6th International Workshop on Content-Based Multimedia Indexing, pages 117-124, London, UK, June 18-20 2008.

[54] E. Thul. An evaluation of Chris Stauffer and W.E.L Grimson's method for background subtraction. Technical report, School of Computer Science, McGill University, 2007.

[55] K. Toyama, J. Krumm, B. Brumitt, and B. Meyers. Wallflower: Principles and practice of background maintenance. In Proceedings of The 7th IEEE International Conference on Computer Vision, volume 1, pages 255-261, Kerkyra, Greece, September 20-25 1999.

[56] Á. Utasi and L. Czúni. Reducing the foreground aperture problem in mixture of Gaussians based motion detection. In Proceedings of The 14th International Conference on Systems, Signals and Image Processing and 6th EURASIP Conference Focused on Speech and Image Processing, Multimedia Communications and Services, pages 157-160, Maribor, Slovenia, June 27-30 2007.

[57] Á. Utasi and L. Czúni. Valós idejű mozgásdetektálás módosított mixture of Gaussians eljárással [Real-time motion detection with a modified mixture of Gaussians method]. In Proceedings of The 6th Conference of Hungarian Association for Image Processing and Pattern Recognition, Debrecen, Hungary, January 25-27 2007.


[58] Á. Utasi and L. Czúni. Anomaly detection with low-level processes in videos. In Proceedings of The 3rd International Conference on Computer Vision Theory and Applications, pages 678-681, Funchal, Madeira, Portugal, January 22-25 2008.

[59] Á. Utasi and L. Czúni. HMM-based unusual motion detection without tracking. In Proceedings of The 19th International Conference on Pattern Recognition, pages 1-4, Tampa, FL, USA, December 8-11 2008.

[60] Á. Utasi and L. Czúni. Visual analysis of urban road traffic. In Proceedings of The 15th International Conference on Systems, Signals and Image Processing, pages 445-448, Bratislava, Slovak Republic, June 25-28 2008.

[61] Á. Utasi and L. Czúni. Analysis of time-multiplexed security videos. In Proceedings of The 6th IEEE International Conference on Advanced Video and Signal Based Surveillance, pages 547-552, Genoa, Italy, September 2-4 2009.

[62] Á. Utasi and L. Czúni. Detecting irregular camera events in time-multiplexed videos. Electronics Letters, 45(18):937-939, 2009.

[63] Á. Utasi and L. Czúni. Idő-multiplexelt biztonsági felvételek elemzése [Analysis of time-multiplexed security recordings]. In Proceedings of The 7th Conference of Hungarian Association for Image Processing and Pattern Recognition, Budapest, Hungary, January 28-30 2009.

[64] Á. Utasi and L. Czúni. Detection of unusual optical flow patterns by multilevel hidden Markov models. Optical Engineering, 49(1), 2010.

[65] Á. Utasi and L. Czúni. Rendhagyó optikai áramlás detekciója rejtett Markov modellekkel [Detection of unusual optical flow with hidden Markov models]. In Proceedings of The 8th Conference of Hungarian Association for Image Processing and Pattern Recognition, Szeged, Hungary, January 25-28 2011.

[66] N. Vaswani, A. R. Chowdhury, and R. Chellappa. Activity recognition using the dynamics of the configuration of interacting objects. In Proceedings of The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 633-640, Madison, WI, USA, June 16-22 2003.

[67] A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, Volume IT-13:260-269, 1967.

[68] H. Wang and D. Suter. A re-evaluation of mixture of Gaussian background modeling. In Proceedings of The IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 2, pages 18-23, Philadelphia, PA, USA, March 18-23 2005.

[69] S. Winkler. Vision Models and Quality Metrics for Image Processing Applications. PhD thesis, École Polytechnique Fédérale de Lausanne, 2000.

[70] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19:780-785, 1997.

[71] T. Xiang and S. Gong. Beyond tracking: Modelling activity and understanding behaviour. International Journal of Computer Vision, 67(1):21-51, 2006.

[72] S.-Z. Yu and H. Kobayashi. An efficient forward-backward algorithm for an explicit-duration hidden Markov model. Signal Processing Letters, IEEE, 10(1):11-14, 2003.


[73] M. Yuen and H. R. Wu. A survey of hybrid MC/DPCM/DCT video coding distortions. Signal Processing, 70(3):247-278, 1998.

[74] Q. Zang and R. Klette. Parameter analysis for mixture of Gaussians model. Technical report, Department of Computer Science, The University of Auckland, 2006.

[75] H. C. Zeng and S. H. Lai. Adaptive foreground object extraction for real-time video surveillance with lighting variations. In Proceedings of The IEEE International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 1201-1204, Honolulu, HI, USA, April 15-20 2007.

[76] D. Zhang, D. Gatica-Perez, S. Bengio, and I. McCowan. Semi-supervised adapted HMMs for unusual event detection. In Proceedings of The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 611-618, San Diego, CA, USA, June 20-26 2005.

[77] H. Zhong, J. Shi, and M. Visontai. Detecting unusual activity in video. In Proceedings of The IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pages 819-826, Washington, DC, USA, June 27-July 2 2004.


Thesis Groups

1. Thesis group: analysis of time-multiplexed videos
Related publications: [61], [62], [63].

Analog multi-camera surveillance systems often produce unsegmented time-multiplexed videos, and the multiplexer is usually not synchronized with the video recorder; that is, no additional information about the cameras' temporal position in the video stream is available. However, most of the methods developed for security tasks work for static cameras only. Therefore, the first step in a multi-camera system is automatic scene recognition.

Existing scene recognition methods do not consider the visual similarity of the images of a camera, the periodicity of the multiplexed segments, and the regularity and uncertainty of the segments' duration at the same time. Therefore, I introduced novel methods based on hidden Markov models (HMM) and hidden semi-Markov models (HSMM), which take these considerations into account. In offline mode they provide an efficient tool for segmenting large amounts of archived data, while in online mode they can be used for the real-time detection of abnormal camera and multiplexer events, such as unusual camera order and duration, manual pan-tilt-zoom (PTZ) control, or device malfunction.

(a) Thesis: I designed new HMM and HSMM models for the automatic offline segmentation of time-multiplexed videos. Both methods assume two main attributes: the visual similarity of the segments of the same camera, and the periodicity of the segments in the stream. In addition to these, the HSMM-based method also models the uncertainty of the camera duration. In these models I used simple image features to retain high processing speed. I showed experimentally that both methods can be used efficiently for the segmentation of archived low-quality surveillance videos.

(b) Thesis: I introduced novel HMM and HSMM-based detectors for online scene recognition and anomalous camera event detection in time-multiplexed videos. The HMM-based method is capable of detecting anomalous order, manual PTZ control and device malfunction events. In addition, the HSMM-based detector can also be used for detecting unusually long or short camera durations. The proposed detectors have real-time processing performance on ordinary PCs and provide a high detection rate on both daytime and nighttime videos. I proved their practical applicability by using low-quality real-life recordings in my experiments.

2. Thesis group: foreground-background separation
Related publications: [56], [57].

The separation of moving image parts from the background is an important task in video surveillance applications. The adaptive mixture of Gaussians (MoG) foreground-background separation method is one of the most widely used techniques for motion detection, with known deficiencies induced by the so-called foreground aperture problem. Due to this problem the original MoG approach fails in the affected scenarios. Therefore, I extended this method to improve its robustness against the foreground aperture problem, while retaining its real-time processing performance.

(a) Thesis: I introduced a novel extension to the adaptive MoG-based foreground-background separation method by modeling the foreground pixels in a separate layer using a single Gaussian. I defined a recursive method between neighboring models to propagate the high covariance values from the borders to the inner parts of homogeneous areas, thereby preventing them from becoming background. Moreover, I defined deterministic steps for the state change between the foreground and background models. According to my experiments, the improved method preserves the shapes of the moving objects more precisely and significantly improves the robustness of the method against the foreground aperture problem, achieving up to a 50% decrease in the number of misclassified pixels, at the cost of an approximately 30% decrease in processing speed.

3. Thesis group: unusual event detection in surveillance videos
Related publications: [58], [59], [60], [64], [65].

Most of the known methods for unusual event detection rely on the trajectories of objects. However, approaches based on object tracking produce a high false alarm rate in cluttered urban environments. Therefore, I designed new methods for the detection of anomalous traffic events and situations in urban surveillance videos. In situations where object tracking is unreliable, my proposed methods are able to model the normal traffic using pixel-wise optical flow directions.

The proposed methods do not need any manual calibration or settings; they only require an automatic training phase using videos of usual activity. In my experiments I used low-quality real-life videos to demonstrate the robustness of my methods against practical problems.

(a) Thesis: I introduced novel pixel-level modeling of optical flow directions to learn the usual motion patterns of the video. The usual motion directions are estimated in an automatic training phase. I designed a novel method for estimating the probabilities of unusual motions, which takes into account the temporal Markovian property of the motion vectors. According to my experiments, this temporal extension significantly increases the difference between the probabilities of the anomalous and usual events, thereby improving the anomaly detection performance of the methods.

(b) Thesis: I introduced a regional HMM-based unusual event detector, which learns the typical motion patterns and the fluctuation of the traffic of a region in the scene. The method uses the temporal changes of the extracted pixel-wise optical flow information to model the rules of the traffic system.
