Jiaolong YANG (杨蛟龙), Dual PhD Candidate

Australian National University (ANU), Australia
Beijing Institute of Technology (BIT), China

yangjiaolong [at] bit.edu.cn
jiaolong.yang [at] anu.edu.au

ANU Homepage
About my name
Jiaolong (蛟龙): the given name from my parents. A "Jiaolong" is a powerful aquatic dragon in ancient Chinese legends. It can be pronounced as "chiao-lung".
Yang (杨): the family name from my forefathers, the sixth most common surname in China. It can be pronounced as the word "young" with a rising tone.
[01/03/2016] The robust optical flow paper is accepted to CVPR'16
[16/02/2016] I will join Microsoft Research Asia (MSRA) as an Associate Researcher
[10/11/2015] The Go-ICP paper is accepted by T-PAMI
Bio
I'm a final-year dual PhD candidate at ANU and BIT. My supervisors are Prof. Hongdong Li (ANU) and Prof. Yunde Jia (BIT). I have broad research interests in computer vision and pattern recognition, including camera and image motion estimation, 3D reconstruction, 2D/3D registration, face recognition, and deep learning.
Background
  • Jul   2016 - Present Visiting Student, School of Engineering and Applied Sciences, Harvard University
  • Nov 2015 - Mar 2016 Research Intern, Visual Computing Group, Microsoft Research Asia
  • Feb 2013 - Present PhD, College of Engineering and Computer Science, Australian National University
  • Sep 2010 - Present PhD, School of Computer Science, Beijing Institute of Technology
Publications (Google Scholar)
Jiaolong Yang, Peiran Ren, Dong Chen, Fang Wen, Hongdong Li and Gang Hua
Neural Aggregation Network for Video Face Recognition
Technical Report
[Abstract] [BibTex][PDF][arXiv]
In this paper, we present a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person, with a variable number of face frames, as its input, and produces a compact, fixed-dimension visual representation of that person. The whole network is composed of two modules. The feature embedding module is a CNN which maps each face frame into a feature representation. The neural aggregation module is composed of two content-based attention blocks driven by a memory storing all the features extracted from the face video through the feature embedding module. The output of the first attention block adapts the second, whose output is adopted as the aggregated representation of the video faces. Due to the attention mechanism, this representation is invariant to the order of the face frames. The experiments show that the proposed NAN consistently outperforms hand-crafted aggregations such as average pooling, and achieves state-of-the-art accuracy on three video face recognition datasets: YouTube Faces, IJB-A and Celebrity-1000.
@article{yang2016neural,
  author = {Yang, Jiaolong and Ren, Peiran and Chen, Dong and Wen, Fang and Li, Hongdong and Hua, Gang},
  title = {Neural Aggregation Network for Video Face Recognition},
  journal = {arXiv:1603.05474},
  year = {2016}
}
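For readers curious how the attention-based aggregation described above works mechanically, here is a minimal NumPy sketch of two cascaded attention blocks. The 128-D features, the random parameters, and the tanh-transformed linear map used to let the first block's output adapt the second are illustrative assumptions; the actual NAN learns everything end-to-end together with its CNN embedding.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_block(features, kernel):
    """Aggregate a (num_frames, dim) feature set with one attention block:
    significances e_k = q^T f_k, weights a = softmax(e), output r = sum_k a_k f_k."""
    weights = softmax(features @ kernel)   # (num_frames,)
    return weights @ features              # (dim,) -- order-invariant by construction

def neural_aggregation(features, q0, W, b):
    """Two cascaded attention blocks; the first block's output r0 adapts the second
    block's kernel. NOTE: the tanh(W r0 + b) form of that adaptation is an
    illustrative assumption, not taken from the abstract."""
    r0 = attention_block(features, q0)
    q1 = np.tanh(W @ r0 + b)
    return attention_block(features, q1)

# Toy usage: 20 frames of 128-D embeddings and randomly initialised parameters.
rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 128))
r = neural_aggregation(frames, q0=rng.normal(size=128),
                       W=0.01 * rng.normal(size=(128, 128)), b=np.zeros(128))
print(r.shape)   # (128,) -- fixed size regardless of the number of input frames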
Jiaolong Yang, Hongdong Li, Yuchao Dai and Robby T. Tan
Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection
The 32nd IEEE Conference on Computer Vision and Pattern Recognition (CVPR2016), Las Vegas, USA
[Abstract] [BibTex] [PDF][Supplementary material]
This paper deals with a challenging, frequently encountered, yet not properly investigated problem in two-frame optical flow estimation. That is, the input frames are compounds of two imaging layers - one desired background layer of the scene, and one distracting, possibly moving layer due to transparency or reflection. In this situation, the conventional brightness constancy constraint - the cornerstone of most existing optical flow methods - will no longer be valid. In this paper, we propose a robust solution to this problem. The proposed method performs both optical flow estimation, and image layer separation. It exploits a generalized double-layer brightness consistency constraint connecting these two tasks, and utilizes the priors for both of them. Experiments on both synthetic data and real images have confirmed the efficacy of the proposed method. To the best of our knowledge, this is the first attempt towards handling generic optical flow fields of two-frame images containing transparency or reflection.
@inproceedings{yang2016robust,
  author = {Yang, Jiaolong and Li, Hongdong and Dai, Yuchao and Tan, Robby T.},
  title = {Robust Optical Flow Estimation of Double-Layer Images under Transparency or Reflection},
  booktitle = {Proceedings of the 32nd IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2016}
}
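As an illustration of the generalized double-layer brightness consistency idea, the hedged sketch below models each frame as a background layer plus a residual reflection/transparency layer and checks brightness constancy of each layer under its own flow. The function names and the exact residual are illustrative; the paper's full energy also couples this data term with layer and flow priors.

import numpy as np
from scipy.ndimage import map_coordinates

def warp(image, flow):
    """Backward-warp a grayscale image by a flow field of shape (H, W, 2),
    where flow[..., 0] is the x (column) and flow[..., 1] the y (row) displacement."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys + flow[..., 1], xs + flow[..., 0]])   # (2, H, W), row/col order
    return map_coordinates(image, coords, order=1, mode='nearest')

def double_layer_residual(I1, I2, flow_b, flow_r, B1, B2):
    """Illustrative two-layer data term: the background layers B1, B2 should obey
    brightness constancy under flow_b, and the remaining layers R = I - B under flow_r.
    NOTE: this is a toy residual, not the paper's full energy."""
    R1, R2 = I1 - B1, I2 - B2
    res_b = B1 - warp(B2, flow_b)   # background-layer constancy
    res_r = R1 - warp(R2, flow_r)   # reflection/transparency-layer constancy
    return np.abs(res_b).mean() + np.abs(res_r).mean()

# Toy usage on random data, just to exercise the shapes.
rng = np.random.default_rng(1)
I1, I2 = rng.random((64, 64)), rng.random((64, 64))
zero_flow = np.zeros((64, 64, 2))
print(double_layer_residual(I1, I2, zero_flow, zero_flow, 0.5 * I1, 0.5 * I2))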
Jiaolong Yang, Hongdong Li, Dylan Campbell and Yunde Jia
Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2016 (In press)
[Abstract] [BibTex] [PDF] [Code] [Webpage] [Supplementary material]
The Iterative Closest Point (ICP) algorithm is one of the most widely used methods for point-set registration. However, being based on local iterative optimization, ICP is known to be susceptible to local minima. Its performance critically relies on the quality of the initialization and only local optimality is guaranteed. This paper presents the first globally optimal algorithm, named Go-ICP, for Euclidean (rigid) registration of two 3D point-sets under the L2 error metric defined in ICP. The Go-ICP method is based on a branch-and-bound (BnB) scheme that searches the entire 3D motion space SE(3). By exploiting the special structure of SE(3) geometry, we derive novel upper and lower bounds for the registration error function. Local ICP is integrated into the BnB scheme, which speeds up the new method while guaranteeing global optimality. We also discuss extensions, addressing the issue of outlier robustness. The evaluation demonstrates that the proposed method is able to produce reliable registration results regardless of the initialization. Go-ICP can be applied in scenarios where an optimal solution is desirable or where a good initialization is not always available.
@article{yang2016goicp,
  author = {Yang, Jiaolong and Li, Hongdong and Campbell, Dylan and Jia, Yunde},
  title = {Go-ICP: A Globally Optimal Solution to 3D ICP Point-Set Registration},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)},
  year = {2016},
  note = {In press}
}
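For context, the objective certified here is the classic L2 ICP error. The sketch below states that error and one local ICP iteration (closest-point matching followed by the closed-form SVD/Kabsch update), i.e. the local component that Go-ICP integrates into its BnB search; the SE(3) bounds that make the search globally optimal are the paper's contribution and are not reproduced.

import numpy as np
from scipy.spatial import cKDTree

def l2_icp_error(X, Y_tree, R, t):
    """Sum of squared distances from the transformed data points R x_i + t to their
    closest model points: the error Go-ICP minimizes globally over SE(3)."""
    d, _ = Y_tree.query(X @ R.T + t)
    return float(np.sum(d ** 2))

def icp_step(X, Y, Y_tree, R, t):
    """One standard local ICP iteration (not the BnB part): closest-point matching
    under the current (R, t), then the closed-form rigid update via SVD (Kabsch)."""
    _, idx = Y_tree.query(X @ R.T + t)
    P, Q = X, Y[idx]
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - mu_p).T @ (Q - mu_q))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # enforce a proper rotation
    R_new = Vt.T @ D @ U.T
    return R_new, mu_q - R_new @ mu_p

# Toy usage: a slightly rotated and translated copy of a random cloud, refined from identity.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
a = 0.1
R_true = np.array([[np.cos(a), -np.sin(a), 0.0], [np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])
Y = X @ R_true.T + np.array([0.05, -0.02, 0.03])
tree = cKDTree(Y)
R, t = np.eye(3), np.zeros(3)
for _ in range(20):
    R, t = icp_step(X, Y, tree, R, t)
print(l2_icp_error(X, tree, R, t))   # small for this easy, outlier-free instance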
Jiaolong Yang and Hongdong Li
Dense, Accurate Optical Flow Estimation with Piecewise Parametric Model
The 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR2015), Boston, USA
[Abstract] [BibTex] [PDF] [Code] [Extended Abstract] [Supplementary material]
This paper proposes a simple method for estimating a dense and accurate optical flow field. It revitalizes the early idea of piecewise parametric flow models. A key innovation is that we fit a flow field piecewise to a variety of parametric models, where the domain of each piece (i.e., each piece's shape, position and size) as well as the total number of pieces are determined adaptively, while at the same time maintaining a global inter-piece flow continuity constraint. We achieve this by a multi-model fitting scheme via energy minimization. Our energy takes into account both the piecewise constant model assumption, and the flow field continuity constraint. The proposed method effectively handles both homogeneous regions and complex motion. Experiments on three public optical flow benchmarks (KITTI, MPI Sintel, and Middlebury) show the superiority of our method compared with the state of the art: it achieves top-tier performance on all three benchmarks.
@inproceedings{yang2015dense,
  author = {Yang, Jiaolong and Li, Hongdong},
  title = {Dense, Accurate Optical Flow Estimation with Piecewise Parametric Model},
  booktitle = {Proceedings of the 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages = {1019-1027},
  year = {2015}
}
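As a small illustration of the per-piece parametric models mentioned above, the sketch below fits one piece's flow with an affine model by least squares (affine being one common low-order choice). The adaptive determination of the pieces and the inter-piece continuity term come from the paper's multi-model fitting energy and are not sketched here.

import numpy as np

def fit_affine_flow(xy, uv):
    """Least-squares affine flow model for one piece: [u, v] = A [x, y] + b,
    estimated from sample coordinates xy (N, 2) and flow vectors uv (N, 2).
    NOTE: affine is just one of several possible parametric models."""
    design = np.hstack([xy, np.ones((len(xy), 1))])        # (N, 3): [x, y, 1]
    params, *_ = np.linalg.lstsq(design, uv, rcond=None)   # (3, 2): rows are A^T and b
    return params

def evaluate_affine_flow(params, xy):
    """Flow predicted by a fitted piece model at coordinates xy."""
    return np.hstack([xy, np.ones((len(xy), 1))]) @ params

# Toy usage: recover a known affine motion from noisy flow samples inside one piece.
rng = np.random.default_rng(3)
xy = rng.uniform(0, 100, size=(200, 2))
A_true, b_true = np.array([[0.01, 0.002], [-0.003, 0.01]]), np.array([1.5, -0.5])
uv = xy @ A_true.T + b_true + rng.normal(scale=0.05, size=(200, 2))
params = fit_affine_flow(xy, uv)
print(np.abs(evaluate_affine_flow(params, xy) - uv).mean())   # small fitting residual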
Jiaolong Yang, Hongdong Li and Yunde Jia
Optimal Essential Matrix Estimation via Inlier-Set Maximization
The 13th European Conference on Computer Vision (ECCV2014), Zürich, Switzerland
Received a Student Conference Grant
[Abstract] [BibTex] [PDF] [Code] [Data]
In this paper, we extend the globally optimal "rotation space search" method [11] to essential matrix estimation in the presence of feature mismatches or outliers. The problem is formulated as inlier-set cardinality maximization, and solved via branch-and-bound global optimization which searches the entire essential manifold formed by all essential matrices. Our main contributions include an explicit, geometrically meaningful essential manifold parametrization using a 5D direct product space of a solid 2D disk and a solid 3D ball, as well as efficient closed-form bounding functions. Experiments on both synthetic data and real images have confirmed the efficacy of our method. The method is most suitable for applications where robustness and accuracy are paramount. It can also be used as a benchmark for method evaluation.
@inproceedings{yang2014optimal,
  author = {Yang, Jiaolong and Li, Hongdong and Jia, Yunde},
  title = {Optimal Essential Matrix Estimation via Inlier-Set Maximization},
  booktitle = {Proceedings of the 13th European Conference on Computer Vision (ECCV)},
  pages = {111-126},
  year = {2014}
}
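The quantity maximized above is the number of correspondences consistent with a candidate essential matrix. The hedged sketch below builds E = [t]_x R and counts inliers using the angle between a bearing vector and its epipolar plane (one common choice of residual); the BnB search over the 5D essential manifold and its closed-form bounds are not sketched.

import numpy as np

def skew(t):
    """Cross-product matrix [t]_x."""
    return np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])

def essential_from_pose(R, t):
    """E = [t]_x R for the convention x2 = R x1 + t (translation scale is irrelevant)."""
    return skew(t / np.linalg.norm(t)) @ R

def count_inliers(E, f1, f2, angle_thresh_deg=0.5):
    """Inlier-set cardinality for unit bearing vectors f1, f2 of shape (N, 3): a pair is
    an inlier if the angle between f2 and the epipolar plane of f1 (normal E f1) is
    below the threshold. NOTE: this residual choice is illustrative."""
    n = f1 @ E.T                                           # epipolar plane normals E f1
    sin_angle = np.abs(np.sum(n * f2, axis=1)) / (np.linalg.norm(n, axis=1) + 1e-12)
    return int(np.sum(np.degrees(np.arcsin(np.clip(sin_angle, 0.0, 1.0))) < angle_thresh_deg))

# Toy usage: synthetic points seen by two calibrated views, with 20 corrupted matches.
rng = np.random.default_rng(4)
P = rng.uniform([-1, -1, 4], [1, 1, 8], size=(100, 3))     # 3D points in the first camera frame
a = 0.05
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.0, 0.0])
f1 = P / np.linalg.norm(P, axis=1, keepdims=True)
P2 = P @ R.T + t
f2 = P2 / np.linalg.norm(P2, axis=1, keepdims=True)
f2[:20] = rng.normal(size=(20, 3))
f2[:20] /= np.linalg.norm(f2[:20], axis=1, keepdims=True)  # 20 outliers among 100 matches
print(count_inliers(essential_from_pose(R, t), f1, f2))    # roughly 80 inliers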
Jiaolong Yang, Hongdong Li and Yunde Jia
Go-ICP: Solving 3D Registration Efficiently and Globally Optimally
The 14th International Conference on Computer Vision (ICCV2013), Sydney, Australia
[Abstract] [BibTex] [PDF] [Code] [Webpage]
Registration is a fundamental task in computer vision. The Iterative Closest Point (ICP) algorithm is one of the widely-used methods for solving the registration problem. Based on local iteration, ICP is however well-known to suffer from local minima. Its performance critically relies on the quality of initialization, and only local optimality is guaranteed. This paper provides the very first globally optimal solution to Euclidean registration of two 3D pointsets or two 3D surfaces under the L2 error. Our method is built upon ICP, but combines it with a branch-and-bound (BnB) scheme which searches the 3D motion space SE(3) efficiently. By exploiting the special structure of the underlying geometry, we derive novel upper and lower bounds for the ICP error function. The integration of local ICP and global BnB enables the new method to run efficiently in practice, and its optimality is exactly guaranteed. We also discuss extensions, addressing the issue of outlier robustness.
@inproceedings{yang2013goicp,
  author = {Yang, Jiaolong and Li, Hongdong and Jia, Yunde},
  title = {Go-ICP: Solving 3D Registration Efficiently and Globally Optimally},
  booktitle = {Proceedings of the 14th International Conference on Computer Vision (ICCV)},
  pages = {1457-1464},
  year = {2013}
}
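To make the branch-and-bound structure concrete, here is a hedged, generic BnB skeleton over a 3D cube, demonstrated on a toy objective for which a valid lower bound over any sub-cube is easy to write down. Go-ICP itself branches over the rotation space (with a nested translation BnB) and uses the geometric error bounds derived in the paper; none of that is reproduced in this toy.

import heapq
import numpy as np

def branch_and_bound(f, lower_bound, center, half_width, eps=1e-6, max_pops=10000):
    """Minimize f over the axis-aligned cube {x : |x - center|_inf <= half_width}.
    lower_bound(c, h) must under-estimate f everywhere inside the sub-cube (c, h)."""
    best_val, best_x = f(center), np.asarray(center, dtype=float)
    heap = [(lower_bound(center, half_width), tuple(center), half_width)]
    for _ in range(max_pops):
        if not heap:
            break
        lb, c, h = heapq.heappop(heap)
        if lb > best_val - eps:            # smallest remaining lower bound cannot improve
            break                          # on the incumbent, so every other cube is pruned
        for corner in np.ndindex(2, 2, 2):                     # split into 8 sub-cubes
            cc = np.asarray(c) + (np.asarray(corner) - 0.5) * h
            val = f(cc)                                        # evaluate at the sub-cube center
            if val < best_val:
                best_val, best_x = val, cc
            clb = lower_bound(cc, h / 2)
            if clb < best_val - eps:
                heapq.heappush(heap, (clb, tuple(cc), h / 2))
    return best_val, best_x

# Toy objective (an assumption for illustration): squared distance to a target point.
# A valid lower bound over a cube is the squared distance from that cube, as a box, to the target.
target = np.array([0.3, -0.7, 0.2])
f = lambda x: float(np.sum((np.asarray(x) - target) ** 2))
lb = lambda c, h: float(np.sum(np.maximum(np.abs(np.asarray(c) - target) - h, 0.0) ** 2))
print(branch_and_bound(f, lb, np.zeros(3), 1.0))               # converges to (about 0, about target)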
Jiaolong Yang, Yuchao Dai, Hongdong Li, Henry Gardner and Yunde Jia
Single-shot Extrinsic Calibration of a Generically Configured RGB-D Camera Rig from Scene Constraints
The 12th International Symposium on Mixed and Augmented Reality (ISMAR2013), Adelaide, Australia
Regular Paper with Oral Presentation
[Abstract] [BibTex] [PDF] [Slides]
With the increasingly popular use of commodity RGB-D cameras in computer vision, robotics, mixed and augmented reality and other areas, it is of significant practical interest to calibrate the relative pose between a depth (D) camera and an RGB camera in these types of setups. In this paper, we propose a new single-shot, correspondence-free method to extrinsically calibrate a generically configured RGB-D camera rig. We formulate the extrinsic calibration problem as one of geometric 2D-3D registration which exploits scene constraints to achieve single-shot extrinsic calibration. Our method first reconstructs sparse point clouds from a single-view 2D image, which are then registered with the dense point clouds from the depth camera. Finally, we directly optimize the warping quality by evaluating scene constraints in the 3D point clouds. Our single-shot extrinsic calibration method does not require correspondences across multiple color images or across modalities, achieving greater flexibility over existing methods. The scene constraints required by our method can be very simple, and we demonstrate that a scene made up of three sheets of paper is sufficient to obtain reliable calibration with lower geometric error than existing methods.
@inproceedings{yang2013single,
  author = {Yang, Jiaolong and Dai, Yuchao and Li, Hongdong and Gardner, Henry and Jia, Yunde},
  title = {Single-shot Extrinsic Calibration of a Generically Configured RGB-D Camera Rig from Scene Constraints},
  booktitle = {Proceedings of the 12th International Symposium on Mixed and Augmented Reality (ISMAR)},
  pages = {181-188},
  year = {2013}
}
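For context, the 2D-3D relationship underlying this calibration is sketched below: a candidate extrinsic (R, t) maps depth-camera points into the RGB camera frame, and a pinhole intrinsic matrix projects them to pixels. The intrinsic and extrinsic values in the toy usage are assumed, and the paper's scene-constraint cost built on top of this projection is not reproduced.

import numpy as np

def project_depth_points(points_depth, R, t, K):
    """Map 3D points from the depth-camera frame into the RGB camera frame with the
    extrinsic (R, t), then project them with the RGB intrinsics K (pinhole model).
    Returns (N, 2) pixel coordinates."""
    points_rgb = points_depth @ R.T + t        # X_rgb = R X_depth + t
    proj = points_rgb @ K.T                    # homogeneous pixel coordinates
    return proj[:, :2] / proj[:, 2:3]

# Toy usage with assumed intrinsics and a small assumed extrinsic offset between the cameras.
K = np.array([[525.0, 0.0, 320.0],
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.05, 0.0, 0.0])   # nearly aligned cameras, ~5 cm baseline
rng = np.random.default_rng(5)
pts = rng.uniform([-0.5, -0.5, 1.0], [0.5, 0.5, 3.0], size=(1000, 3))
uv = project_depth_points(pts, R, t, K)
print(uv.min(axis=0), uv.max(axis=0))          # pixel extent of the projected cloud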
Jiaolong Yang, Wei Liang and Yunde Jia
Face Pose Estimation with Combined 2D and 3D HOG Features
The 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan
[BibTex] [PDF]
@inproceedings{yang2012face,
  author = {Yang, Jiaolong and Liang, Wei and Jia, Yunde},
  title = {Face Pose Estimation with Combined 2D and 3D HOG Features},
  booktitle = {Proceedings of the 21st International Conference on Pattern Recognition (ICPR)},
  pages = {2492-2495},
  year = {2012}
}
Earlier:
  • Xiameng Qin, Jiaolong Yang, Wei Liang, Mingtao Pei and Yunde Jia. Stereo Camera Calibration with an Embedded Calibration Device and Scene Features. IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 2306-2310, 2012.
  • Jiaolong Yang, Lei Chen and Yunde Jia. Human-robot Interaction Technique Based on Stereo Vision. Chinese Conference on Human Computer Interaction (CHCI), pp. 226-231, 2011. (in Chinese) Oral
  • Jiaolong Yang, Lei Chen and Wei Liang. Monocular Vision based Robot Self-localization. IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1189-1193, 2010. Oral
  • Lei Chen, Mingtao Pei and Jiaolong Yang. Multi-Scale Matching for Data Association in Vision-based SLAM. IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1183-1188, 2010. Oral
Honors & Awards
  • Excellent BIT PhD Thesis Award (with exceptional award to supervisor, 0.6%), 2016
  • Excellent Intern Award, Microsoft Research, 2016
  • ANU Postgraduate Research Scholarship, 2015
  • Pacemaker to Outstanding Graduate Students of BIT (1%), 2015
  • Huawei Scholarship, 2014
  • National Scholarship for Graduate Students, 2013
  • Outstanding Graduate Student Cadre of BIT, 2013
  • Chinese Government Scholarship (CSC Scholarship), 2013
  • National Scholarship for Graduate Students, 2012
  • Outstanding Graduate Student of BIT, 2012
  • Top-class Scholarship (5%) for Doctoral Students of BIT, 2012
  • Second Prize (Top-3) in BIT Information Security & Countermeasures Contest, 2009
  • First-class Microsoft Scholarship of Technology Innovation (Team), 2008
  • Second Prize in National Information Security Contest for College Students, 2008
  • Bronze Award in ACM/ICPC Provincial League (Nanjing), 2007
Academic Services
Conference Reviewer: CVPR2015, ICCV2015, CVPR2016, ECCV2016
Journal Reviewer: T-PAMI, T-IP, T-CSVT, T-MM, T-ITS, MVA, IET-CV, Acta Automatica Sinica
Teaching & Tutoring
  • 2015-2016, Semester 2: Robotics (ENGN6627/ENGN4627), ANU
  • 2012-2013, Semester 1: Foundations of Computer Science, BIT
  • 2011-2012, Semester 2: PHP for the Web, BIT