A dataset of images with ground-truth shape for bodies, hands, and faces together.

Data Collection

We begin with the SMPL+H dataset [52], obtaining one full-body RGB image per frame. We then align SMPL-X to the 4D scans following [68]. An expert annotator manually curated the dataset, selecting 100 frames that can be confidently considered pseudo ground truth based on alignment quality and on interesting hand poses and facial expressions. The pseudo ground-truth meshes allow the use of a stricter vertex-to-vertex (v2v) error metric [48, 62], in contrast to the common paradigm of reporting 3D joint error, which does not capture surface errors or rotations along the bones.
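To illustrate why a v2v metric is stricter than a joint metric, the following is a minimal sketch on a toy limb, not the evaluation code used for this dataset. It assumes the v2v error is the mean Euclidean distance between corresponding vertices of two meshes with identical topology. The helper name `mean_point_error` and the toy geometry are hypothetical, and any rigid alignment step an actual evaluation protocol might apply is omitted.

```python
import numpy as np


def mean_point_error(pred, gt):
    """Mean Euclidean distance between corresponding 3D points.

    Applied to mesh vertices this is a v2v-style error; applied to
    joints it is a 3D joint error. Both inputs must have shape (N, 3).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape or pred.ndim != 2 or pred.shape[1] != 3:
        raise ValueError("pred and gt must both have shape (N, 3)")
    return float(np.linalg.norm(pred - gt, axis=1).mean())


# Toy limb (hypothetical): a bone along the z-axis with a joint at each end.
joints = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

# A ring of 8 surface vertices at radius 0.1 around the middle of the bone.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
vertices = np.stack([0.1 * np.cos(angles),
                     0.1 * np.sin(angles),
                     np.full(8, 0.5)], axis=1)

# Rotate the limb by 90 degrees about its own bone (z) axis.
theta = np.pi / 2.0
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
rotated_joints = joints @ rotation.T
rotated_vertices = vertices @ rotation.T

# The joints lie on the rotation axis, so the joint error cannot see the twist.
joint_err = mean_point_error(rotated_joints, joints)
# Every surface vertex moves, so the v2v-style error is non-zero.
v2v_err = mean_point_error(rotated_vertices, vertices)

print(f"joint error: {joint_err:.4f}")
print(f"v2v error:   {v2v_err:.4f}")
```

In this toy case the joint error is zero while the v2v error equals the chord length of a 90-degree arc at radius 0.1, which is the kind of rotation along the bone that a joint-only metric misses.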

Pavlakos, Georgios; Choutas, Vasileios; Ghorbani, Nima; Bolkart, Timo; Osman, Ahmed A. A.; Tzionas, Dimitrios; Black, Michael J. "Expressive Body Capture: 3D Hands, Face, and Body from a Single Image." In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2019.
Data Summary
Provided by
Perceiving Systems
We combine research on computer vision, computer graphics, and machine learning to teach computers to see and understand humans and their behavior.
