Vision, Voice, and Text: Pioneering Zero-shot Multimodal LLMs for Sentiment-driven Investment

Abstract

In the rapidly evolving financial landscape, sentiment analysis has emerged as a critical tool for decoding market dynamics, yet traditional approaches remain confined to textual data, overlooking the rich multimodal cues embedded in audio and video. This paper unveils a pioneering zero-shot framework that harnesses Multimodal Large Language Models (MLLMs) to revolutionize sentiment-driven investment by integrating text, audio, and video modalities. We introduce a comprehensive suite of metrics to extract nuanced emotional signals, a self-consistent signal verification mechanism to enhance market prediction reliability, and a JSON schema for seamless automation. To validate this innovation, we curate the White House Press Briefing (WHPB) Video Benchmark Database, a novel dataset of 30 press briefings from January to July 2025, offering a robust testbed for multimodal analysis. Our extensive experiments demonstrate that the full-multimodal approach, leveraging text, audio, and video, outperforms text-only and text-audio baselines, achieving superior returns across diverse assets, including a remarkable 2,843.9% annualized return on the VIX. This work not only redefines financial sentiment analysis but also sets a transformative foundation for AI-driven investment strategies, empowering investors with unprecedented insights into market sentiment. Our WHPB database is available at https://github.com/sutan244/White-House-Press-Briefing-Video-Benchmark-Dataset-WHPB.
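
The abstract does not say how the self-consistent signal verification mechanism works. As a minimal sketch only, assuming it resembles the common self-consistency pattern of sampling the model several times and keeping a signal only when enough samples agree, it could look like the following. The function name `verify_signal`, the signal labels, and the `min_agreement` threshold are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def verify_signal(samples: Sequence[str], min_agreement: float = 0.6) -> Optional[str]:
    """Return the majority signal if enough samples agree, else None.

    `samples` holds one signal label per repeated model run (hypothetical
    labels such as "bullish" or "bearish"). This is an assumed majority-vote
    rule, not the paper's documented procedure.
    """
    if not samples:
        return None
    # Most frequent label and how many samples produced it.
    label, count = Counter(samples).most_common(1)[0]
    # Accept the label only when its share meets the agreement threshold.
    return label if count / len(samples) >= min_agreement else None
```

Under this assumed rule, a signal with too little agreement across runs is discarded rather than traded on.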

Cite

Tan, S., So, C. C., Sun, Y., Wang, J. M., Loh, W. K. A., & Yung, S. P. (2025). Vision, Voice, and Text: Pioneering Zero-shot Multimodal LLMs for Sentiment-driven Investment. In ICAIF 2025 - 6th ACM International Conference on AI in Finance (pp. 960–968). Association for Computing Machinery, Inc. https://doi.org/10.1145/3768292.3770368
