PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

Abstract

We propose PiggyBack, a Visual Question Answering platform that lets users easily apply state-of-the-art visual-language pretrained models. We integrate visual-language models pretrained by HuggingFace, an open-source API platform for deep learning technologies; however, they cannot be run without programming skills or an understanding of deep learning. Hence, PiggyBack provides an easy-to-use browser-based user interface with several deep-learning visual-language pretrained models for general users and domain experts. PiggyBack offers the following benefits: portability, since it is web-based and thus runs on almost any platform; a comprehensive data creation and processing technique; and ease of use of visual-language pretrained models. The demo video can be found at https://youtu.be/iz44RZ1lF4s.

Cite

Zhang, Z., Luo, S., Chen, J., Lai, S., Long, S., Chung, H., & Han, S. C. (2023). PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals. In WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining (pp. 1152–1155). Association for Computing Machinery, Inc. https://doi.org/10.1145/3539597.3573039
