TabbyXL: Rule-based spreadsheet data extraction and transformation

Abstract

This paper presents an approach to rule-based spreadsheet data extraction and transformation. We define a table object model and a domain-specific language for table analysis and interpretation rules. In contrast to existing data transformation languages, we structure this process as three consecutive steps: role analysis, structural analysis, and interpretation. To the best of our knowledge, no existing language expresses rules for transforming tabular data into relational form in terms of table understanding. We also describe a tool for transforming spreadsheet data from arbitrary tables into relational ones. Performance was evaluated automatically for both stages of table analysis (role and structural) against prepared ground-truth data. The evaluation shows high F-scores, from 95.82% to 99.04% for the different recovered items, on an existing dataset of 200 arbitrary tables of the same genre (government statistics).
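To make the three steps concrete, the following is a minimal Python sketch of the idea, not the paper's rule language or the tool's API. The class `Cell`, the functions `role_analysis`, `structural_analysis`, `interpretation`, and `to_relational`, the category names `REGION` and `YEAR`, and the sample grid are all hypothetical, and the rules are hard-coded assumptions (first row and first column hold labels) chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    row: int
    col: int
    text: str
    role: str = ""  # "label" or "entry", assigned by role analysis
    labels: list = field(default_factory=list)  # linked by structural analysis


def role_analysis(cells):
    # Assumed rule: first row and first column hold labels, the rest are entries.
    for c in cells:
        c.role = "label" if c.row == 0 or c.col == 0 else "entry"


def structural_analysis(cells):
    # Assumed rule: link each entry to its row label and its column label.
    by_pos = {(c.row, c.col): c for c in cells}
    for c in cells:
        if c.role == "entry":
            c.labels = [by_pos[(c.row, 0)], by_pos[(0, c.col)]]


def interpretation(cells, categories=("REGION", "YEAR")):
    # Emit one relational record per entry, one field per linked label.
    records = []
    for c in cells:
        if c.role == "entry":
            record = {"DATA": c.text}
            for category, label in zip(categories, c.labels):
                record[category] = label.text
            records.append(record)
    return records


def to_relational(grid):
    cells = [Cell(r, c, text)
             for r, row in enumerate(grid)
             for c, text in enumerate(row)]
    role_analysis(cells)
    structural_analysis(cells)
    return interpretation(cells)


# Invented sample table, for illustration only.
grid = [
    ["",      "2017", "2018"],
    ["North", "10",   "12"],
    ["South", "7",    "9"],
]
records = to_relational(grid)
for record in records:
    print(record)
```

Each entry cell becomes one record whose fields come from the labels linked to it, which is the sense in which an arbitrary table is turned into a relational one. In a rule-based system the hard-coded conditions above would instead be supplied as rules written for a particular table layout.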

Cite

Shigarov, A., Khristyuk, V., Mikhailov, A., & Paramonov, V. (2019). TabbyXL: Rule-based spreadsheet data extraction and transformation. In Communications in Computer and Information Science (Vol. 1078 CCIS, pp. 59–75). Springer. https://doi.org/10.1007/978-3-030-30275-7_6