Comparative Analysis of Chatbot Systems

Hengsheng Xu; Linkun Wan; Yunyin Li; Jiaxi Liu; Adela S.M. Lau

Conference Proceedings, Open Access

Frontiers in Artificial Intelligence and Applications (2025), vol. 412, pp. 392–398

DOI: 10.3233/FAIA250737

Abstract

Existing research on chatbot evaluation suffers from inconsistent assessment standards, fragmented criteria, and insufficient coverage of critical dimensions such as legal compliance and ethical alignment, all of which hinder reliable benchmarking of chatbot performance. Our study proposes a comprehensive evaluation framework and systematically compares five chatbot systems, Tidio (Rule-Based), GPT-4o (AI-Powered), Claude 3.5 Sonnet (LLM), Watson Assistant (Enterprise), and Qwen2.5-Max (Multilingual), on accuracy, safety, legal compliance, generalizability of performance, and ethical alignment. We conclude that while chatbots enhance efficiency in healthcare (97.34% patient education completeness) and e-commerce (30%-40% cost reduction), critical limitations persist. Recommendations include: (1) retrieval-augmented generation (RAG) to reduce hallucinations, (2) ethical governance frameworks (e.g., AILuminate), and (3) domain-specialized tuning. Cross-sector collaboration and standardized evaluations are essential for responsible deployment of AI.

Citation (APA)

Xu, H., Wan, L., Li, Y., Liu, J., & Lau, A. S. M. (2025). Comparative Analysis of Chatbot Systems. In Frontiers in Artificial Intelligence and Applications (Vol. 412, pp. 392–398). IOS Press BV. https://doi.org/10.3233/FAIA250737
