Think Before You Classify: The Rise of Reasoning Large Language Models for Consumer Complaint Detection and Classification

Konstantinos I. Roumeliotis; Nikolaos D. Tselikas; Dimitrios K. Nasiopoulos

Journal Article (Open Access)

Electronics (Switzerland) (2025) 14(6)

DOI: 10.3390/electronics14061070

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in various natural language processing (NLP) tasks, but their effectiveness in real-world consumer complaint classification without fine-tuning remains uncertain. Zero-shot classification offers a promising solution by enabling models to categorize consumer complaints without prior exposure to labeled training data, making it valuable for handling emerging issues and dynamic complaint categories in finance. However, this task is particularly challenging, as financial complaint categories often overlap, requiring a deep understanding of nuanced language. In this study, we evaluate the zero-shot classification performance of leading LLMs and reasoning models, totaling 14 models. Specifically, we assess DeepSeek-V3, Gemini-2.0-Flash, Gemini-1.5-Pro, Anthropic’s Claude 3.5 and 3.7 Sonnet, Claude 3.5 Haiku, and OpenAI’s GPT-4o, GPT-4.5, and GPT-4o Mini, alongside reasoning models such as DeepSeek-R1, o1, and o3. Unlike traditional LLMs, reasoning models are specifically trained with reinforcement learning to exhibit advanced inferential capabilities, structured decision-making, and complex reasoning, making their application to text classification a groundbreaking advancement. The models were tasked with classifying consumer complaints submitted to the Consumer Financial Protection Bureau (CFPB) into five predefined financial classes based solely on complaint text. Performance was measured using accuracy, precision, recall, F1-score, and heatmaps to identify classification patterns. The findings highlight the strengths and limitations of both standard LLMs and reasoning models in financial text processing, providing valuable insights into their practical applications. By integrating reasoning models into classification workflows, organizations may enhance complaint resolution automation and improve customer service efficiency, marking a significant step forward in AI-driven financial text analysis.
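The evaluation described in the abstract (accuracy, precision, recall, F1-score, and a heatmap of classification patterns) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy labels, and the choice of macro averaging are assumptions made for illustration; the abstract does not state which averaging scheme was used or name the five classes. The confusion matrix returned here is the table a heatmap would typically visualize.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions.

    Assumes every prediction has already been normalized to one of `labels`.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def classification_metrics(y_true, y_pred, labels):
    """Return accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    precisions, recalls, f1_scores = [], [], []
    for i in range(n):
        true_positives = matrix[i][i]
        predicted_as_i = sum(matrix[row][i] for row in range(n))  # column sum
        actually_i = sum(matrix[i])                               # row sum
        precision = true_positives / predicted_as_i if predicted_as_i else 0.0
        recall = true_positives / actually_i if actually_i else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
        "confusion_matrix": matrix,
    }


if __name__ == "__main__":
    # Hypothetical placeholder labels; the study's actual five classes are not listed in the abstract.
    labels = ["class_a", "class_b"]
    y_true = ["class_a", "class_a", "class_b", "class_b"]
    y_pred = ["class_a", "class_b", "class_b", "class_b"]
    print(classification_metrics(y_true, y_pred, labels))
```

Off-diagonal cells in the confusion matrix show which pairs of classes a model confuses, which is the kind of pattern the abstract attributes to overlapping financial complaint categories.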

Cite

Roumeliotis, K. I., Tselikas, N. D., & Nasiopoulos, D. K. (2025). Think Before You Classify: The Rise of Reasoning Large Language Models for Consumer Complaint Detection and Classification. Electronics (Switzerland), 14(6). https://doi.org/10.3390/electronics14061070