Abstract
We propose PiggyBack, a Visual Question Answering platform that allows users to easily apply state-of-the-art pretrained visual-language models. We integrate pretrained visual-language models from HuggingFace, an open-source API platform for deep learning technologies; on their own, however, these models cannot be run without programming skills or an understanding of deep learning. PiggyBack therefore provides an easy-to-use, browser-based user interface to several pretrained visual-language models for general users and domain experts. PiggyBack offers the following benefits: portability, because it is web-based and thus runs on almost any platform; a comprehensive data creation and processing technique; and ease of use of pretrained visual-language models. The demo video can be found at https://youtu.be/iz44RZ1lF4s.
Cite
Zhang, Z., Luo, S., Chen, J., Lai, S., Long, S., Chung, H., & Han, S. C. (2023). PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals. In WSDM 2023 - Proceedings of the 16th ACM International Conference on Web Search and Data Mining (pp. 1152–1155). Association for Computing Machinery, Inc. https://doi.org/10.1145/3539597.3573039