Video Motion Capture from the Part Confidence Maps of Multi-Camera Images by Spatiotemporal Filtering Using the Human Skeletal Model

Takuya Ohashi1 Yosuke Ikegami1 Kazuki Yamamoto1 Wataru Takano2 Yoshihiko Nakamura1
1The University of Tokyo 2Osaka University
(IROS 2018, Madrid, Spain)

Abstract

We discuss video motion capture, namely, 3D reconstruction of human motion from multi-camera images. After the Part Confidence Maps are computed from each camera image, the proposed spatiotemporal filter is applied to deliver the human motion data with accuracy and smoothness for human motion analysis. The spatiotemporal filter uses the human skeleton and mixes temporal smoothing in two-time inverse kinematics computations. The experimental results show that the mean per joint position error was 26.1mm for regular motions and 38.8mm for inverted motions.
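The mean per joint position error quoted above is usually computed as the average Euclidean distance between each estimated joint and its reference position, taken over all joints and frames. The following is a minimal sketch of that metric under assumptions of ours, not the authors' evaluation code: the function name `mpjpe` and the data layout (a list of frames, each a list of `(x, y, z)` joint positions in millimetres) are hypothetical, and the source does not say whether any alignment is applied before the error is measured.

```python
import math


def mpjpe(estimated, reference):
    """Mean per joint position error over all frames and joints.

    Both arguments are lists of frames; each frame is a list of
    (x, y, z) joint positions. This layout is an assumption made
    for illustration. The result is in the same unit as the input.
    """
    if len(estimated) != len(reference):
        raise ValueError("frame counts differ")

    total = 0.0
    count = 0
    for est_frame, ref_frame in zip(estimated, reference):
        if len(est_frame) != len(ref_frame):
            raise ValueError("joint counts differ within a frame")
        for est_joint, ref_joint in zip(est_frame, ref_frame):
            # Euclidean distance between estimated and reference joint
            total += math.dist(est_joint, ref_joint)
            count += 1

    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


if __name__ == "__main__":
    # One frame, two joints: errors of 0 mm and 5 mm
    est = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
    ref = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
    print(mpjpe(est, ref))
```

With positions in millimetres, the returned value can be compared directly with the figures reported in the abstract.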

Paper

[arXiv] [IEEE]

IROS 2018 Best Paper Award Finalist

Invited talk:
The 22nd Meeting on Image Recognition and Understanding (MIRU, Japanese). 2019 [Link]

VMocap

Building on this work, we developed a real-time system called "VMocap". The system captures human motion, analyzes skeletal movements and muscle activities, and visualizes the results at 30 fps with 400-500 ms latency. The current system uses four video cameras, each covering a different field of view of the human motion. The images from each camera, captured at 30 fps, are processed by a PC with a GPU for OpenPose computation. The results from the four PCs flow to another PC for 3D reconstruction of skeletal movements with higher degrees of freedom (DOF). Inverse dynamics and muscle activity analysis then follow before visualization.
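To put the reported latency in terms of frames, the short sketch below converts a latency in milliseconds into a number of frame intervals at a given frame rate. It is plain arithmetic on the figures stated above (30 fps, 400-500 ms); the helper name `latency_in_frames` is ours and is not part of VMocap.

```python
def latency_in_frames(latency_ms, fps):
    """Express a latency as a number of frame intervals.

    One frame interval lasts 1000 / fps milliseconds, so the
    latency in frames is latency_ms * fps / 1000.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    return latency_ms * fps / 1000.0


if __name__ == "__main__":
    fps = 30
    for latency_ms in (400, 500):
        frames = latency_in_frames(latency_ms, fps)
        print(f"{latency_ms} ms at {fps} fps = {frames:.1f} frames")
```

At 30 fps this works out to a visualization that trails the captured motion by roughly 12 to 15 frames.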

Media

Citation

@inproceedings{ohashi18iros,
  author = {Takuya Ohashi and Yosuke Ikegami and Kazuki Yamamoto and Wataru Takano and Yoshihiko Nakamura},
  title = {{Video Motion Capture from the Part Confidence Maps of Multi-Camera Images by Spatiotemporal Filtering Using the Human Skeletal Model}},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year = {2018},
}

Contact