
Contact forces of the hand are not visually observable, yet they play a crucial role in understanding hand-object interactions. In this paper, we propose an unsupervised learning approach for manipulation event segmentation and parsing. The proposed framework incorporates hand pose kinematics and contact forces, both captured with a low-cost, easy-to-replicate tactile glove. We use a temporal grammar model to capture the hierarchical structure of events, integrating force vectors extracted from the raw sensory input of poses and forces. The temporal grammar is represented as a temporal And-Or graph (T-AOG), which can be induced in an unsupervised manner. We obtain the event labeling sequences by measuring the similarity between segments using the Dynamic Time Alignment Kernel (DTAK). Experimental results show that our method achieves high accuracy in manipulation event segmentation, recognition, and parsing by utilizing both pose and force data.
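To illustrate how segment similarity can be measured with DTAK, here is a minimal sketch of one commonly used formulation of the kernel: a dynamic-programming recursion over frame-level Gaussian kernel values, normalized by the sum of the two segment lengths. It is not taken from the released code. The function names (`gaussian_kernel`, `dtak`), the bandwidth parameter `sigma`, and the choice of a Gaussian frame kernel are assumptions made for illustration, and each frame is assumed to be a plain list of floats (for example, concatenated pose and force features).

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Frame-level Gaussian kernel (assumed choice; sigma is a hypothetical bandwidth)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dtak(seq_x, seq_y, sigma=1.0):
    """Dynamic Time Alignment Kernel between two segments.

    seq_x and seq_y are lists of equal-dimension feature vectors, one per frame.
    Returns a similarity in (0, 1]; identical segments score 1.
    """
    n, m = len(seq_x), len(seq_y)
    if n == 0 or m == 0:
        raise ValueError("both segments must contain at least one frame")

    neg_inf = float("-inf")
    # G[i][j] holds the best accumulated score aligning the first i frames
    # of seq_x with the first j frames of seq_y.
    G = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    G[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = gaussian_kernel(seq_x[i - 1], seq_y[j - 1], sigma)
            G[i][j] = max(
                G[i - 1][j] + k,          # advance in seq_x only
                G[i - 1][j - 1] + 2 * k,  # advance in both (diagonal step, weight 2)
                G[i][j - 1] + k,          # advance in seq_y only
            )

    # Every alignment path has total step weight n + m, so this normalizes to (0, 1].
    return G[n][m] / (n + m)


if __name__ == "__main__":
    a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    b = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # time-warped copy of a
    c = [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]              # unrelated segment
    print(dtak(a, b), dtak(a, c))
```

Because the recursion allows frames to be repeated during alignment, a time-warped copy of a segment scores as highly as the segment itself, while an unrelated segment scores close to zero. A pairwise matrix of such scores could then feed a clustering step that assigns event labels to segments; the feature construction and clustering used in the paper are not reproduced here.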


Please cite our paper if you use our code or data.

    title={Unsupervised Learning of Hierarchical Models for Hand-Object Interactions},
    author={Xie, Xu and Liu, Hangxin and Edmonds, Mark and Gao, Feng and Qi, Siyuan and Zhu, Yixin and Rothrock, Brandon and Zhu, Song-Chun},
    booktitle={International Conference on Robotics and Automation (ICRA)},

The work reported herein was supported by DARPA XAI grant N66001-17-2-4029 and ONR MURI grant N00014-16-1-2007.