
We propose a notion of affordance that takes into account physical quantities generated when the human body interacts with real-world objects, and introduce a learning framework that incorporates the concept of human utilities, which in our opinion provides a deeper and finer-grained account not only of object affordance but also of people's interaction with objects. Rather than defining affordance in terms of the geometric compatibility between body poses and 3D objects, we devise algorithms that employ physics-based simulation to infer the relevant forces/pressures acting on body parts. By observing the choices people make in videos (particularly in selecting a chair in which to sit) our system learns the comfort intervals of the forces exerted on body parts (while sitting). We account for people's preferences in terms of human utilities, which transcend comfort intervals to account also for meaningful tasks within scenes and spatiotemporal constraints in motion planning, such as for the purposes of robot task planning.
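As a rough illustration of what "learning comfort intervals" from observed choices could look like, the following minimal sketch estimates a per-body-part interval from forces recorded on poses that people actually chose, then ranks candidate chairs by how far their forces fall outside those intervals. It is not the method from the paper: the quantile-based interval, the additive penalty, the function names, and all force values are assumptions made for illustration, and the physics-based simulation that would produce the forces is not shown.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def quantile(sorted_values: List[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def learn_comfort_intervals(
    observed: Dict[str, List[float]],
    lower_q: float = 0.25,
    upper_q: float = 0.75,
) -> Dict[str, Interval]:
    """Estimate a comfort interval per body part (hypothetical quantile rule).

    `observed` maps a body part to forces measured on poses people chose.
    """
    intervals = {}
    for part, forces in observed.items():
        ordered = sorted(forces)
        intervals[part] = (quantile(ordered, lower_q), quantile(ordered, upper_q))
    return intervals


def discomfort(forces: Dict[str, float], intervals: Dict[str, Interval]) -> float:
    """Sum of distances by which each force lies outside its comfort interval."""
    total = 0.0
    for part, force in forces.items():
        low, high = intervals[part]
        if force < low:
            total += low - force
        elif force > high:
            total += force - high
    return total


def choose_most_comfortable(
    candidates: Dict[str, Dict[str, float]],
    intervals: Dict[str, Interval],
) -> str:
    """Return the name of the candidate with the lowest discomfort score."""
    return min(candidates, key=lambda name: discomfort(candidates[name], intervals))
```

A simple penalty of this kind captures only the comfort-interval part of the description; the human utilities discussed above additionally account for tasks and spatiotemporal constraints, which this sketch leaves out.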


Please cite our paper if you use our code or data.

    title={Inferring forces and learning human utilities from videos},
    author={Zhu, Yixin and Jiang, Chenfanfu and Zhao, Yibiao and Terzopoulos, Demetri and Zhu, Song-Chun},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},


We thank Steven Holtzen, Siyuan Qi, Mark Edmonds, and Nishant Shukla for proofreading drafts and assisting with the review of related work, and Chuyuan Fu for the video voice-overs. We also thank Professor Joseph Teran of the UCLA Math Department for useful discussions. The work reported herein was supported by DARPA SIMPLEX grant N66001-15-C-4035, ONR MURI grant N00014-16-1-2007, and DoD CDMRP grant W81XWH-15-1-0147.