Thesis or Dissertation

Dynamic Scenes and Appearance Modeling for Robust Object Detection and Matching Based on Co-occurrence Probability

Liang, Dong (梁, 棟)

Detecting moving objects plays a crucial role in intelligent surveillance systems. Object detection is often integrated with various tasks, such as tracking objects, recognizing their behaviours and raising alerts when abnormal events occur. However, it suffers from non-stationary backgrounds in surveillance scenes, especially in two dynamic cases: (1) sudden illumination variation, such as outdoor sunlight changes and indoor lights turning on/off; (2) burst physical motion, such as the motion of artificial indoor background objects, including fans, escalators and automatic doors, and the motion of natural backgrounds, including fountains, ripples on water surfaces and swaying trees. When the actual background combines several of these factors, object detection becomes much more difficult.

Traditional algorithms, e.g. the Gaussian Mixture Model (GMM) and Kernel Density Estimation (KDE), handle gradual illumination changes by building statistical background models progressively from long-term learning frames. In practice, however, this kind of independent pixel-wise model often fails to avoid mistakenly integrating foreground elements into the background, and it adapts poorly to sudden illumination changes and burst motion. On the other hand, spatial-dependence models, e.g. Grayscale Arranging Pairs (GAP) and the Statistical Reach Feature (SRF), show promising performance under illumination changes and other dynamic backgrounds. This study proposes a novel framework for building a background model for object detection, which evolved from the GAP and SRF methods. It is brightness-invariant and able to tolerate burst motion. We name it Co-occurrence Probability based Pixel Pairs (CP3).
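The intuition behind such pixel-pair models can be sketched in a few lines of NumPy. The simulation below is illustrative only (the base intensities, drift amplitude and the 3σ threshold are assumptions, not values from this thesis): under a shared global illumination drift, the difference between two co-occurring background pixels stays far more stable than either pixel alone, and a single Gaussian fitted to that difference gives a simple foreground test.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = 200

# Simulated background intensities of a co-occurring pixel pair (p, q)
# under a shared global illumination drift plus small sensor noise.
base_p, base_q = 120.0, 90.0
illumination = 60.0 * np.sin(np.linspace(0.0, np.pi, frames))
I_p = base_p + illumination + rng.normal(0.0, 2.0, frames)
I_q = base_q + illumination + rng.normal(0.0, 2.0, frames)

# A single pixel swings widely with the illumination, while the
# pair difference stays nearly constant.
diff = I_p - I_q
print(f"std of I_p alone : {I_p.std():5.1f}")
print(f"std of I_p - I_q : {diff.std():5.1f}")

# Fit a single Gaussian to the background pair difference and flag
# the target pixel as foreground when a new observation deviates
# beyond k standard deviations (k = 3 is an illustrative choice).
mu, sigma = diff.mean(), diff.std()
new_diff = 75.0  # e.g. an object now occludes pixel q
print("foreground:", abs(new_diff - mu) > 3.0 * sigma)
```

Because the illumination term cancels in the subtraction, the same threshold keeps working when the scene brightens or darkens globally.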
To model the dynamic background, spatial pixel pairs with high temporal co-occurrence probability are employed to represent each other, using the stable intensity differential increment between a pixel pair, which is much more reliable than the intensity of a single pixel, especially when that intensity changes dramatically over time. The model performs robust detection in extreme outdoor and indoor environments. Compared with independent pixel-wise background modelling methods, CP3 determines stable co-occurrence pixel pairs instead of building a parametric/non-parametric model for each single pixel. These pixel pairs maintain a reliable background model, which can capture structural background motion and cope with local and global illumination changes. As a spatial-dependence method, CP3 does not predefine or assume any local operator, subspace or block for an observed pixel; instead, it selects those qualified supporting pixels that maintain a reliable linear relationship with the target pixel. Moreover, based on a single Gaussian model of the differential value of each pixel pair, it provides an accurate detection criterion even when the grayscale dynamic range is compressed under weak illumination.

The proposed method can also be used to model the appearance of an image for image matching. Theoretically, both object detection and image matching can be seen as model matching problems. The difference between the two tasks is that object detection seeks the regions of interest (ROI) that violate/mismatch the background model, while image matching seeks the ROI that optimally matches the image model. Therefore, in this study, we further extend CP3 to the robust image matching task.

This thesis is organized into the following chapters. Chapter 1 introduces related work in object detection and image matching.
Some general problems are involved and discussed. Furthermore, the motivations and contributions of this research are described.

Chapter 2 presents the details of the CP3 background model based on co-occurrence pixel pairs for object detection. We test it on several surveillance video datasets for both qualitative and quantitative analyses. Experiments on several challenging datasets (Heavy Fog, PETS-2001, AIST-INDOOR, Wallflower and a supermarket surveillance application) demonstrate robust and competitive object detection performance in various indoor and outdoor environments. For quantitative analysis, Precision (also known as positive predictive value), Recall (also known as sensitivity) and the F-measure (a weighted harmonic mean of Precision and Recall) are used. These three evaluation metrics measure the exactness, fidelity and completeness of the detected foreground. We compare our algorithm with three methods: (1) the GMM method, a standard method among independent pixel-wise models; (2) Sheikh's KDE method, a representative method among spatially dependent models; (3) our previous method, GAP. In addition, we propose an accelerated version of CP3, which effectively reduces the time cost of the background modelling stage.

Chapter 3 proposes the CP3 framework for modelling the appearance of an image to realize image matching. We detail the learning phase, present the similarity measure procedure and report the experimental results. Although an additional learning stage is required, the experimental results show that the proposed method is robust in several imaging cases and also outperforms SRF.

Chapter 4 discusses the proposed methods, concludes the main contributions of this study, and outlines future work.
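The three evaluation metrics combine in the standard way from pixel counts; a minimal sketch (the function name and the example counts are hypothetical, not figures from the thesis):

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Precision, Recall and the F-measure (a weighted harmonic mean
    of the two) from true-positive, false-positive and false-negative
    foreground pixel counts."""
    precision = tp / (tp + fp)  # exactness of the detected foreground
    recall = tp / (tp + fn)     # completeness of the detected foreground
    f = (1.0 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f

# e.g. 80 correctly detected, 20 spurious, 20 missed foreground pixels:
p, r, f = precision_recall_f(80, 20, 20)
print(f"Precision={p:.2f}  Recall={r:.2f}  F-measure={f:.2f}")
```

With beta = 1 this reduces to the familiar F1 score; beta weights Recall more (beta > 1) or Precision more (beta < 1).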
4, 6, 75p
Hokkaido University (北海道大学). Doctor of Information Science (博士(情報科学))
