The work “CosyPose: Consistent multi-view multi-object 6D pose estimation” co-authored by Yann Labbé (Inria), Justin Carpentier (Inria), Mathieu Aubry (ENPC), and Josef Sivic (CIIRC CTU) won 5 awards at the 6D object pose estimation challenge at ECCV 2020.
6D object pose estimation, which recovers the 3D position and 3D orientation of an object, is one of the key problems in enabling autonomous systems equipped with visual sensors to locate objects in their environment accurately enough to grasp and manipulate them. Think, for example, of an assistant robot at home that can automatically load a dishwasher, or a manufacturing robot manipulating objects on the production line.
The CosyPose approach estimates the 6D poses of multiple known objects in a scene captured by one or more input images with unknown camera viewpoints. The main innovation is the combination of powerful deep neural networks, trained on a mix of synthetic and real images, with multi-view geometric constraints.
6D object pose estimation optimizing multi-view COnSistencY. Given a set of RGB images depicting a scene with known objects taken from unknown viewpoints (top image), our method accurately reconstructs the scene (bottom image), recovering all objects in the scene, their 6D poses, and the camera viewpoints.
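To make the notions of a 6D pose and of multi-view consistency concrete, here is a minimal sketch in Python with NumPy. It is not the CosyPose implementation: the helper `make_pose`, the frame names, and all numeric values are hypothetical, and the per-view poses are computed from an assumed ground truth rather than predicted by a network. It only illustrates one simple geometric constraint of the kind described above: an object seen from two cameras must map to the same pose in a common world frame, and an object visible in both views determines the relative pose of the two cameras.

```python
import numpy as np

def make_pose(rotation_z_deg, translation):
    """Build a 4x4 rigid transform (hypothetical helper): rotation about z, then translation."""
    a = np.deg2rad(rotation_z_deg)
    T = np.eye(4)
    T[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
    T[:3, 3] = translation
    return T

# Assumed ground truth, for illustration only: one object and two cameras in a world frame.
T_world_obj = make_pose(30.0, [0.2, 0.1, 0.0])
T_world_cam1 = make_pose(0.0, [0.0, 0.0, 1.0])
T_world_cam2 = make_pose(90.0, [1.0, 0.0, 1.0])

# What an ideal single-view estimator would output: the object pose in each camera frame.
T_cam1_obj = np.linalg.inv(T_world_cam1) @ T_world_obj
T_cam2_obj = np.linalg.inv(T_world_cam2) @ T_world_obj

# Multi-view consistency: both per-view estimates must agree once mapped to the world frame.
consistent = np.allclose(T_world_cam1 @ T_cam1_obj, T_world_cam2 @ T_cam2_obj)

# With unknown viewpoints, an object seen in both views gives the pose of camera 2
# expressed in the frame of camera 1.
T_cam1_cam2 = T_cam1_obj @ np.linalg.inv(T_cam2_obj)

print("consistent:", consistent)
print(np.round(T_cam1_cam2, 3))
```

With noisy per-view estimates, the two world-frame poses would no longer coincide exactly, and the disagreement could serve as an error to minimize over object and camera poses.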
The proposed approach achieves state-of-the-art results on multiple benchmarks, doubling the performance of existing methods on the most complex datasets, which were beyond the capabilities of previous systems. The method won 5 awards, including “The Overall Best Method”, in the 6D object pose estimation challenge at ECCV 2020, outperforming competitors by a significant margin in multiple categories. Using only inexpensive RGB cameras, it matches the performance of methods that rely on additional information from costly depth sensors, which are also sensitive to illumination. The results of the challenge are available as slides and as an overview paper.
The 6D object pose estimation challenge was held in conjunction with the European Conference on Computer Vision (ECCV) 2020. ECCV is one of the top three computer vision conferences (together with CVPR and ICCV) and is listed by Google Scholar among the 100 most cited journals and conferences across all areas of science.