Data Extraction and Question Answering on Chart Images Towards Accessibility and Data Interpretation


Abstract

Graphical representations such as chart images are integral to web pages and documents. Automating data extraction from charts is possible by reverse-engineering the visualization pipeline. This study proposes a framework that automates data extraction from bar charts and integrates it with question answering. The framework employs an object detector to recognize visual cues in the image, followed by text recognition. Mask-RCNN for plot element detection achieves a mean average precision of 95.04% at a threshold of 0.5, which decreases as the Intersection over Union (IoU) threshold increases. A contour approximation-based approach is proposed for extracting the bar coordinates, even at a higher IoU of 0.9. The textual and visual cues are associated with the legend text and preview, and the chart data is finally extracted in tabular format. We introduce an extension to the TAPAS model, called TAPAS++, that incorporates new operations; table question answering is then performed with TAPAS++. The chart summary or description is also produced in audio format. In the future, this approach could be extended to enable interactive question answering on charts by accepting audio queries from individuals with visual impairments, and to perform more complex reasoning using Large Language Models.
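The detection metrics in the abstract hinge on Intersection over Union, which measures how well a predicted bounding box overlaps a ground-truth box. As a point of reference (not the authors' code), a minimal sketch of IoU for axis-aligned boxes in (x1, y1, x2, y2) format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2).

    Returns a value in [0, 1]: 1.0 for identical boxes, 0.0 for
    non-overlapping boxes. A detection is counted as correct when its
    IoU with a ground-truth box exceeds a chosen threshold (e.g. 0.5).
    """
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])

    return inter / (area_a + area_b - inter)
```

Raising the IoU threshold from 0.5 to 0.9 demands much tighter localization, which is why the reported mean average precision drops at higher thresholds and why the paper proposes contour approximation for precise bar coordinates.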

Citation (APA)
Shahira, K. C., Joshi, P., & Lijiya, A. (2023). Data Extraction and Question Answering on Chart Images Towards Accessibility and Data Interpretation. IEEE Open Journal of the Computer Society, 4, 314–325. https://doi.org/10.1109/OJCS.2023.3328767
