Enabling Type Checking on Columns in Data Frame Libraries by Abstract Interpretation

Yungyu Zhuang; Ming Yang Lu

Journal Article, Open Access

IEEE Access, vol. 10, pp. 14418–14428 (2022)

DOI: 10.1109/ACCESS.2022.3146287

Abstract

Data frames are a tabular data structure widely used in data analysis, especially in data wrangling, to transform data into an appropriate form. However, when data frames are implemented by libraries rather than supported at the language level, errors are hard to detect because type checking on columns is limited. Data scientists may encounter errors caused by missing column labels or inconsistent types, especially when they reuse code snippets on new data. These errors are usually left to runtime, where it is sometimes difficult to locate their cause. To address this issue, we propose using abstract interpretation to perform type checking on data frame columns. We define the type of a data frame in terms of its column labels and develop semantics to verify the typing of general operations. A static checker can be implemented on top of these semantics to help programmers fix errors quickly without executing the code. To show feasibility, we implemented a proof of concept for the pandas library, PDChecker, and use it to discuss the limitations and usage of the approach. We then compare its functionality with that of existing solutions. The results show that our approach can provide type checking for data frames. Supporting more data frame operations is left for future work.
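To make the idea concrete, the following is a minimal sketch of column-level checking by abstract interpretation. It is not PDChecker's implementation: the names `FrameType`, `ColumnTypeError`, `select`, `rename`, and `assign_sum` are hypothetical, and the typing rules (only `int64` and `float64` columns can be added) are simplifying assumptions. Each method stands in for a data frame operation but works only on a mapping from column labels to element types, so errors are reported without loading any data.

```python
from dataclasses import dataclass

NUMERIC = ("int64", "float64")  # element types this sketch treats as numeric


class ColumnTypeError(Exception):
    """Reported by the abstract operations; no data frame is ever built."""


@dataclass(frozen=True)
class FrameType:
    """Hypothetical abstract value: column label -> element type name."""
    columns: dict

    def _require(self, labels):
        # Shared check: every referenced label must be a known column.
        missing = [label for label in labels if label not in self.columns]
        if missing:
            raise ColumnTypeError(f"missing column labels: {missing}")

    def select(self, labels):
        """Abstract counterpart of df[labels]."""
        self._require(labels)
        return FrameType({label: self.columns[label] for label in labels})

    def rename(self, mapping):
        """Abstract counterpart of df.rename(columns=mapping, errors="raise")."""
        self._require(mapping)
        return FrameType({mapping.get(k, k): v for k, v in self.columns.items()})

    def assign_sum(self, target, left, right):
        """Abstract counterpart of df[target] = df[left] + df[right]."""
        self._require([left, right])
        lt, rt = self.columns[left], self.columns[right]
        # Assumed rule: both operands must be numeric columns.
        if lt not in NUMERIC or rt not in NUMERIC:
            raise ColumnTypeError(f"cannot add {left}: {lt} and {right}: {rt}")
        result = "float64" if "float64" in (lt, rt) else "int64"
        return FrameType({**self.columns, target: result})


# Abstractly "run" a small wrangling script on column types only.
sales = FrameType({"price": "float64", "qty": "int64", "region": "object"})
renamed = sales.rename({"qty": "quantity"})
total = renamed.assign_sum("total", "price", "quantity")

errors = []
for check in (
    lambda: renamed.select(["price", "qty"]),              # stale label after rename
    lambda: renamed.assign_sum("bad", "price", "region"),  # number + string column
):
    try:
        check()
    except ColumnTypeError as err:
        errors.append(str(err))

print(total.columns)
print(errors)
```

The two failing checks correspond to the error classes named in the abstract: a missing column label (the script still refers to `qty` after it was renamed) and inconsistent types (adding a numeric column to a string column). A real checker would obtain the initial column types from annotations or a sample of the input and would need rules for many more operations.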

Cite

Zhuang, Y., & Lu, M. Y. (2022). Enabling Type Checking on Columns in Data Frame Libraries by Abstract Interpretation. IEEE Access, 10, 14418–14428. https://doi.org/10.1109/ACCESS.2022.3146287
