Research Projects

Project List


Object Tracking




Introduction

 

Learning online structural appearance model for robust object tracking

 

The main challenge of robust object tracking comes from the difficulty of designing an adaptive appearance model that can accommodate appearance variations. Existing tracking algorithms often self-update the appearance model with examples from recent tracking results to account for appearance changes; however, even slight inaccuracies in those results can degrade the model. In this paper, we propose a robust tracking method built on an online structural appearance model that combines local sparse coding with online metric learning. Our appearance model pools structural features over the local sparse codes of an object region to obtain a mid-level object representation. Tracking is then formulated as seeking the most similar candidate within a Bayesian inference framework, where the distance metric for similarity measurement is learned online to match the varying object appearance. Both qualitative and quantitative evaluations on various challenging image sequences demonstrate that the proposed algorithm outperforms state-of-the-art methods.
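To make the representation concrete, here is a minimal NumPy sketch of the two ingredients the abstract describes: local sparse coding of patches (via plain ISTA, a generic solver standing in for the paper's) followed by max pooling, and a Mahalanobis distance under a learned metric M. The dictionary, the patch extraction, and all parameter values are illustrative assumptions.

```python
import numpy as np

def sparse_codes(patches, dictionary, lam=0.1, n_iter=50):
    """Encode local patches against a dictionary with ISTA (a generic
    l1 solver standing in for the paper's local sparse coding step).
    patches: (n, d) rows; dictionary: (d, k), columns are atoms."""
    D = dictionary
    L = np.linalg.norm(D.T @ D, 2)                  # Lipschitz constant of the gradient
    codes = np.zeros((patches.shape[0], D.shape[1]))
    for _ in range(n_iter):
        z = codes - (codes @ D.T - patches) @ D / L
        codes = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return codes

def pooled_representation(patches, dictionary):
    """Max-pool the sparse codes over all patches of an object region
    to obtain a mid-level structural representation."""
    return sparse_codes(patches, dictionary).max(axis=0)

def metric_distance(x, y, M):
    """Squared Mahalanobis distance under the learned (PSD) metric M,
    used to compare a candidate with the object template."""
    d = x - y
    return float(d @ M @ d)
```

A candidate is then scored by `metric_distance(pooled_representation(candidate_patches, D), template, M)`, with M updated online as new tracking results arrive.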

 


 

 

Papers

 

Min Yang, Mingtao Pei, Yuwei Wu, and Yunde Jia. Learning Online Structural Appearance Model for Robust Object Tracking. Science China Information Sciences, 58(3): 1-14, 2015. [PDF] [Code]

 

 

Metric learning based structural appearance model for robust visual tracking

 

Appearance modeling is a key issue for the success of a visual tracker. Sparse-representation-based appearance modeling has received increasing interest in recent years. However, most existing work uses reconstruction errors to compute the observation likelihood under a generative framework, which may perform poorly, especially under significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes, and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multi-scale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting alleviates the drift problem potentially caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structural object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate, both qualitatively and quantitatively, that the proposed algorithm outperforms state-of-the-art methods.
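The sketch below illustrates one plausible online multiple-instance metric update under simplifying assumptions: the instance of the positive bag that is most target-like under the current metric is pulled closer, every negative instance is pushed away, and the metric is projected back onto the PSD cone. The rank-one updates and the learning rate are stand-ins for the paper's actual loss and solver.

```python
import numpy as np

def psd_project(M):
    """Project a symmetric matrix onto the PSD cone by clipping eigenvalues."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.maximum(w, 0.0)) @ V.T

def mil_metric_step(M, target, pos_bag, neg_bag, lr=0.01):
    """One online multiple-instance-style metric update (a simplified
    stand-in). Only the best instance of the positive bag is used, which
    is what makes the update robust to misaligned positives."""
    dists = [(x - target) @ M @ (x - target) for x in pos_bag]
    x_best = pos_bag[int(np.argmin(dists))]
    d = (x_best - target)[:, None]
    M = M - lr * (d @ d.T)            # pull the best positive closer
    for x in neg_bag:
        d = (x - target)[:, None]
        M = M + lr * (d @ d.T)        # push all negatives away
    return psd_project(M)
```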

 


 

 

Papers

 

Yuwei Wu, Bo Ma, Min Yang, Yunde Jia, and Jian Zhang. Metric Learning based Structural Appearance Model for Robust Visual Tracking. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 24(5): 865-877, 2014. [PDF] [Code]

 

Min Yang, Caixia Zhang, Yuwei Wu, Mingtao Pei, and Yunde Jia. Robust Object Tracking via Online Multiple Instance Metric Learning. IEEE International Conference on Multimedia and Expo Workshops (ICMEW), pp. 1-4, 2013. [PDF]

 

 

Online visual tracking by integrating spatio-temporal cues

 

The performance of online visual trackers has improved significantly, but designing an effective appearance-adaptive model remains challenging because errors accumulate as the model is updated with newly obtained results, causing tracker drift. In this study, the authors propose a novel online tracking algorithm that integrates spatio-temporal cues to alleviate the drift problem. The goal is a more robust way of updating an adaptive appearance model. The model consists of multiple modules, called temporal cues, which are updated alternately so that both the historical and the current information of the tracked object is retained to handle drastic appearance changes. Each module is represented by several fragments, called spatial cues. To incorporate all the spatial and temporal cues, the authors develop an efficient cue-quality evaluation criterion that combines appearance and motion information; the tracking results are then obtained by a two-stage dynamic integration mechanism. Both qualitative and quantitative evaluations on challenging video sequences demonstrate that the proposed algorithm performs favorably against state-of-the-art methods.
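As a toy illustration of cue integration, the sketch below lets each cue score the candidates with its own template, weights its vote by a quality score (here a simple product of appearance and motion terms, which is an assumption; the paper defines its own criterion), and accumulates the votes:

```python
import numpy as np

def cue_quality(appearance_sim, motion_consistency):
    """Combine appearance and motion information into a single quality
    score. Multiplicative fusion is an illustrative choice."""
    return appearance_sim * motion_consistency

def integrate_cues(candidates, cues):
    """Weighted-voting stand-in for the two-stage integration mechanism.
    candidates: (n, d) feature rows; cues: list of (template, quality)."""
    votes = np.zeros(len(candidates))
    for template, quality in cues:
        sims = -np.linalg.norm(candidates - template, axis=1)
        votes[int(np.argmax(sims))] += quality   # each cue votes for its best match
    return int(np.argmax(votes))                 # candidate with the most weighted votes
```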

 


 

 

Papers

 

Yang He, Mingtao Pei, Min Yang, Yuwei Wu, and Yunde Jia. Online Visual Tracking by Integrating Spatio-temporal Cues. IET Computer Vision, 9(1): 124-137, 2015. [PDF]

 

 

Online Discriminative Tracking with Active Example Selection

 

Most existing discriminative tracking algorithms use a sampling-and-labeling strategy to collect examples and treat example collection as a task independent of classifier learning. However, examples collected directly by sampling are not necessarily informative or useful for classifier learning, and updating the classifier with them may introduce ambiguity to the tracker. In this paper, we present a novel online discriminative tracking framework that explicitly couples the objectives of example collection and classifier learning. Our method uses Laplacian Regularized Least Squares (LapRLS) to learn a robust classifier that can sufficiently exploit unlabeled data and preserve the local geometrical structure of the feature space. To ensure high classification confidence, we propose an active example selection approach that automatically selects the most informative examples for LapRLS. The selected examples that satisfy strict constraints are labeled to enhance the adaptivity of our tracker, providing robust supervisory information to guide semi-supervised learning. With active example selection, we avoid the ambiguity introduced by an independent example collection strategy and alleviate the drift problem caused by misaligned examples. Comparison with state-of-the-art trackers on a comprehensive benchmark demonstrates that our tracking algorithm is more effective and accurate.
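A minimal sketch of both components follows: a LapRLS solve over a precomputed kernel and graph affinity (with the standard scaling constants dropped), and an uncertainty-based stand-in for example selection. The paper's active selection criterion is its own; picking the most ambiguous scores is used here purely for illustration.

```python
import numpy as np

def laprls(K, W, y, labeled_mask, lam_a=0.1, lam_i=0.1):
    """Laplacian Regularized Least Squares. K: (n, n) kernel over all
    examples; W: (n, n) graph affinity; y: labels with 0 at unlabeled
    entries; labeled_mask: boolean array. Returns coefficients alpha
    of f(x) = sum_i alpha_i k(x_i, x)."""
    n = K.shape[0]
    J = np.diag(labeled_mask.astype(float))     # selects labeled examples
    L = np.diag(W.sum(axis=1)) - W              # unnormalized graph Laplacian
    A = J @ K + lam_a * np.eye(n) + lam_i * (L @ K)
    return np.linalg.solve(A, J @ y)

def select_ambiguous(K, alpha, k=10):
    """Stand-in for active example selection: return the k examples whose
    scores lie closest to the decision boundary."""
    scores = K @ alpha
    return np.argsort(np.abs(scores))[:k]
```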

 


 

 

Papers

 

Min Yang, Yuwei Wu, Mingtao Pei, Bo Ma, and Yunde Jia. Online Discriminative Tracking with Active Example Selection. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2015, in press. [PDF] [Code]

 

 

Robust Discriminative Tracking via Landmark-Based Label Propagation

 

The appearance of an object changes continuously during tracking, so appearance samples are not independent and identically distributed. A good discriminative tracker often needs a large number of training samples to fit the underlying data distribution, which is impractical for visual tracking. In this paper, we present a new discriminative tracker via landmark-based label propagation (LLP) that is nonparametric and makes no specific assumption about the sample distribution. With an undirected graph representation of samples, LLP locally approximates the soft label of each sample by a linear combination of the labels on its nearby landmarks, and can thus effectively propagate a limited number of initial labels to a large number of unlabeled samples. To this end, we introduce a local landmark approximation method to compute the cross-similarity matrix between the whole data set and the landmarks. Moreover, a soft label prediction function incorporating the graph Laplacian regularizer is used to diffuse the known labels to all unlabeled vertices in the graph, explicitly accounting for the local geometrical structure of all samples. Tracking is then carried out within a Bayesian inference framework, where the soft label prediction value is used to construct the observation model. Both qualitative and quantitative evaluations on a benchmark data set of 51 challenging image sequences demonstrate that the proposed algorithm outperforms state-of-the-art methods.
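The sketch below shows the landmark mechanics under simplifying assumptions: each sample is coded by its s nearest landmarks with Gaussian weights to form the cross-similarity matrix Z, and the soft labels come from a small m-by-m Laplacian-regularized solve on the landmark side, so the cost never grows with the full sample count. The particular reduced Laplacian is one convenient choice, not necessarily the paper's.

```python
import numpy as np

def landmark_weights(X, landmarks, s=5, sigma=1.0):
    """Cross-similarity matrix Z between samples and landmarks: each
    sample is approximated by its s nearest landmarks, with Gaussian
    weights normalized to sum to one."""
    Z = np.zeros((len(X), len(landmarks)))
    for i, x in enumerate(X):
        d2 = np.sum((landmarks - x) ** 2, axis=1)
        idx = np.argsort(d2)[:s]
        w = np.exp(-d2[idx] / (2 * sigma ** 2))
        Z[i, idx] = w / w.sum()
    return Z

def propagate_labels(Z, y, labeled_mask, gamma=0.1):
    """Diffuse the few known labels to all samples through the landmarks.
    L_reduced is a valid graph Laplacian on the landmarks (zero row sums,
    nonpositive off-diagonals), keeping the solve m x m."""
    m = Z.shape[1]
    L_reduced = np.diag(Z.sum(axis=0)) - Z.T @ Z
    Zl, yl = Z[labeled_mask], y[labeled_mask]
    A = Zl.T @ Zl + gamma * L_reduced + 1e-6 * np.eye(m)
    a = np.linalg.solve(A, Zl.T @ yl)   # soft labels on the landmarks
    return Z @ a                        # soft labels on every sample
```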

 


 

 

Papers

 

Yuwei Wu, Mingtao Pei, Min Yang, Junsong Yuan, and Yunde Jia. Robust Discriminative Tracking via Landmark-based Label Propagation. IEEE Transactions on Image Processing (TIP), 24(5): 1510-1523, 2015. [PDF] [Code]

 

  

Temporal Dynamic Appearance Modeling for Online Multi-Person Tracking

 

Robust online multi-person tracking requires correctly associating online detection responses with existing trajectories. We address this problem by developing a novel appearance modeling approach that provides accurate appearance affinities to guide data association. In contrast to most existing algorithms, which consider only the spatial structure of human appearances, we exploit the temporal dynamics within appearance sequences to discriminate between different persons. These temporal dynamics complement the spatial structure of varying appearances in the feature space, significantly improving the affinity measurement between trajectories and detections. We propose a feature selection algorithm that describes appearance variations with mid-level semantic features, and demonstrate its usefulness for temporal dynamic appearance modeling. Moreover, the appearance model is learned incrementally by alternately evaluating newly observed appearances and adjusting the model parameters, making it suitable for online tracking. Reliable tracking of multiple persons in complex scenes is achieved by incorporating the learned model into an online tracking-by-detection framework. Our experiments on the challenging MOTChallenge 2015 benchmark demonstrate that our method outperforms state-of-the-art multi-person tracking algorithms.
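As a toy illustration of why temporal dynamics help, the sketch below fits a first-order linear model to a trajectory's appearance sequence and scores a new detection by how well it matches the predicted next appearance. The AR(1) model and the Gaussian affinity are stand-ins; the paper's mid-level features, feature selection, and incremental learning are not reproduced here.

```python
import numpy as np

def fit_dynamics(seq):
    """Fit a first-order linear model a_{t+1} ~ a_t @ A to a trajectory's
    appearance feature sequence. seq: (t, d) with t >= 2."""
    A, *_ = np.linalg.lstsq(seq[:-1], seq[1:], rcond=None)
    return A

def appearance_affinity(seq, detection, sigma=1.0):
    """Affinity between a trajectory and a detection: agreement of the
    detection with the appearance predicted from the sequence's dynamics."""
    pred = seq[-1] @ fit_dynamics(seq)
    err = np.linalg.norm(detection - pred)
    return float(np.exp(-err ** 2 / (2 * sigma ** 2)))
```

In a tracking-by-detection loop, these affinities would enter the association cost matrix alongside motion and shape terms.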

 


 

 

Papers

 

Min Yang and Yunde Jia. Temporal Dynamic Appearance Modeling for Online Multi-Person Tracking. arXiv preprint arXiv:1510.02906. [PDF]

 

 

  

Visual tracking with sparse correlation filters

 

Correlation filters have recently brought significant improvements to visual object tracking in both efficiency and accuracy. In this paper, we propose a sparse correlation filter that combines the effectiveness of sparse representation with the computational efficiency of correlation filters. The sparse representation is obtained by solving an ℓ0-regularized least squares problem. The resulting sparse correlation filters capture the essential information of the tracked target while remaining insensitive to noise. During tracking, the appearance of the target is modeled by a sparse correlation filter, which is re-trained after each frame to adapt to appearance changes. Experimental results on the CVPR2013 Online Object Tracking Benchmark (OOTB) show the effectiveness of our sparse correlation filter based tracker.
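A minimal single-channel sketch of the idea follows: the filter is initialized as a ridge correlation filter in the Fourier domain, then refined by gradient steps on the least-squares term interleaved with hard thresholding, which enforces the ℓ0 sparsity. Circular convolution replaces correlation for notational simplicity, and iterative hard thresholding is an assumed solver, not necessarily the paper's.

```python
import numpy as np

def train_sparse_cf(x, y, lam=0.01, tau=0.05, n_iter=30):
    """Learn a sparse filter h with x (*) h ~ y, where (*) is circular
    convolution and y is the desired (e.g. Gaussian) response map.
    The l0 penalty is handled by iterative hard thresholding."""
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    h = np.real(np.fft.ifft2(np.conj(X) * Y / (np.abs(X) ** 2 + lam)))  # ridge init
    step = 1.0 / (np.abs(X) ** 2).max()
    for _ in range(n_iter):
        H = np.fft.fft2(h)
        g = np.real(np.fft.ifft2(np.conj(X) * (X * H - Y)))  # least-squares gradient
        h -= step * g
        h[np.abs(h) < tau] = 0.0       # hard threshold enforces sparsity
    return h

def locate(h, z):
    """Apply the filter to a new patch z; the response peak gives the
    target's shift within the search window."""
    resp = np.real(np.fft.ifft2(np.fft.fft2(z) * np.fft.fft2(h)))
    return np.unravel_index(int(np.argmax(resp)), resp.shape)
```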

 


 

 

Papers

Yanmei Dong, Min Yang, and Mingtao Pei. Visual Tracking with Sparse Correlation Filters. IEEE International Conference on Image Processing (ICIP), 2016. [PDF]