Parallel Queries for Human-Object Interaction Detection

Junwen Chen; Keiji Yanai

Conference Proceedings (Open Access)

Proceedings of the 4th ACM International Conference on Multimedia in Asia, MMAsia 2022 (2022)

DOI: 10.1145/3551626.3564944

Abstract

Human-Object Interaction (HOI) detection requires localizing paired humans and objects. Recent transformer-based methods use query embeddings to represent entire HOI instances, so the target embeddings after decoding must represent the human and the object characteristics at the same time. However, using such highly integrated embeddings to localize the human and the object simultaneously is ambiguous. To address this problem, we split the detection decoding process into subject decoding and object decoding, so that humans and objects are detected in parallel. Our proposed method, Parallel Query Network (PQNet), uses two transformer decoders to decode the subject embeddings and the object embeddings in parallel, and a novel verb decoder fuses the representations from the detection decoding and predicts the interaction. The attention mechanisms in the verb decoder consist of attention between the human and object embeddings and attention between the fused embeddings and global semantic features. Because the transformer architecture preserves the order of the input query embeddings, the paired boxes of humans and objects are predicted directly by feed-forward networks. By making full use of the object detection part, our proposed architecture outperforms the state-of-the-art baseline method with half of the training epochs.
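The pairing-by-order idea in the abstract can be illustrated with a small sketch. This is not the PQNet implementation: it is a toy, pure-Python single-head cross-attention with no learned projections, and every name in it (`cross_attention`, `rand_vecs`, `subject_queries`, `object_queries`, `memory`) is a hypothetical stand-in. It only shows that each output row depends on its own query, so the i-th subject output and the i-th object output can be paired by index, and that reordering the queries reorders the outputs in the same way.

```python
import math
import random


def cross_attention(queries, memory):
    """Toy single-head dot-product cross-attention (no learned weights)."""
    dim = len(memory[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score between this query and every memory vector.
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Output is the softmax-weighted average of the memory vectors.
        outputs.append([
            sum(w * m[d] for w, m in zip(weights, memory)) / total
            for d in range(dim)
        ])
    return outputs


rng = random.Random(0)


def rand_vecs(count, dim):
    """Hypothetical helper: random vectors standing in for embeddings."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


memory = rand_vecs(6, 4)           # stand-in for encoded image features
subject_queries = rand_vecs(3, 4)  # assumed: one query per candidate human
object_queries = rand_vecs(3, 4)   # assumed: one query per candidate object

# Two independent decoding passes over the same image features.
subject_out = cross_attention(subject_queries, memory)
object_out = cross_attention(object_queries, memory)

# Pair i is simply (i-th subject output, i-th object output).
pairs = list(zip(subject_out, object_out))

# Reordering the queries reorders the outputs identically.
order = [2, 0, 1]
permuted_out = cross_attention([subject_queries[i] for i in order], memory)
is_equivariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for i, row in zip(order, permuted_out)
    for a, b in zip(subject_out[i], row)
)
```

In a full decoder the queries also attend to each other through self-attention, which likewise keeps output i tied to query i; the sketch omits that step and the verb decoder for brevity.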

Citation (APA)

Chen, J., & Yanai, K. (2022). Parallel Queries for Human-Object Interaction Detection. In Proceedings of the 4th ACM International Conference on Multimedia in Asia, MMAsia 2022. Association for Computing Machinery, Inc. https://doi.org/10.1145/3551626.3564944
