PIP: Physical Interaction Prediction via Mental Simulation with Span Selection

Abstract

Accurate prediction of physical interaction outcomes is a crucial component of human intelligence and is important for the safe and efficient deployment of robots in the real world. Existing vision-based intuitive physics models learn to predict physical interaction outcomes, but they mostly focus on generating short sequences of future frames based on physical properties (e.g. mass, friction and velocity) extracted from visual inputs or a latent space. Intuitive physics models that are tested on long physical interaction sequences with multiple interactions among different objects are lacking. We hypothesize that selective temporal attention during approximate mental simulations helps humans predict physical interaction outcomes. With these motivations, we propose a novel scheme: Physical Interaction Prediction via Mental Simulation with Span Selection (PIP). It uses a deep generative model to approximate mental simulation by generating future frames of physical interactions, then applies selective temporal attention in the form of span selection to predict physical interaction outcomes. To the best of our knowledge, attention has not been used with deep learning to tackle intuitive physics. For model evaluation, we further propose the large-scale SPACE+ dataset of synthetic videos with long sequences of three prime physical interactions in a 3D environment. Our experiments show that PIP outperforms human performance, baseline models, and related intuitive physics models that utilize mental simulation. Furthermore, PIP's span selection module effectively identifies the frames indicating key physical interactions among objects, which adds interpretability, and it does not require labor-intensive frame annotations. PIP is available at https://sites.google.com/view/piphysics.
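
The abstract does not spell out how span selection is computed, so the following is only a minimal sketch of one generic way to pick a contiguous span of frames and pool over it. It assumes each frame already has a start score and an end score (for example, produced by some learned scoring layer) and a feature vector. The names `select_span` and `pool_span`, the `max_len` parameter, and the mean pooling step are all hypothetical illustrations, not the authors' implementation.

```python
from typing import List, Optional, Tuple


def select_span(
    start_scores: List[float],
    end_scores: List[float],
    max_len: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the frame span (s, e), with s <= e, that maximizes
    start_scores[s] + end_scores[e]. Hypothetical helper for illustration."""
    n = len(start_scores)
    if n == 0 or n != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    best_span = (0, 0)
    best_score = float("-inf")
    for s in range(n):
        # Optionally cap the span length at max_len frames (an assumption).
        last = n if max_len is None else min(n, s + max_len)
        for e in range(s, last):
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best_score = score
                best_span = (s, e)
    return best_span


def pool_span(features: List[List[float]], span: Tuple[int, int]) -> List[float]:
    """Average the per-frame feature vectors inside the selected span
    (inclusive on both ends). Mean pooling is an assumed choice."""
    s, e = span
    selected = features[s : e + 1]
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


# Toy example: four generated frames with made-up scores and 2-D features.
start = [0.1, 2.0, 0.3, 0.0]
end = [0.0, 0.5, 1.5, 0.2]
frames = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0], [9.0, 9.0]]

span = select_span(start, end)
pooled = pool_span(frames, span)
```

In this toy setup the selected span doubles as an interpretable pointer to the frames the predictor relies on, which is the kind of interpretability the abstract attributes to span selection; a downstream classifier would consume `pooled` to predict the outcome.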

Cite

Duan, J., Yu, S., Poria, S., Wen, B., & Tan, C. (2022). PIP: Physical Interaction Prediction via Mental Simulation with Span Selection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13695 LNCS, pp. 405–421). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19833-5_24
