FILA: Online Auditing of Machine Learning Model Accuracy under Finite Labelling Budget

0Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Machine learning (ML) is increasingly adopted in industrial applications. Typically, a ML pipeline is instantiated to automate the process of collecting training data, training a model, auditing the model accuracy and generating predictions. In this paper we present a sampling based approach to audit the model accuracy in ML pipelines in an online manner, with general applicability, including entity resolution. We target an online setting in which a deployed model makes predictions, which can be selectively assessed for accuracy by humans in the loop. We present a consistent adaptive stratified sampling estimator for model accuracy and propose the Finite Labels (FILA) method to allocate samples under a finite label budget. We demonstrate that under mild statistical assumptions FILA is asymptotically optimal. We analyze the variance of the estimator under a finite labelling budget and compare our approach to other applicable techniques and analytically establish the conditions under which our proposed FILA is the method of choice. We also present an algorithm based on Thompson Sampling, named FILA-Thompson, utilizing explore-exploit trade-offs to estimate model accuracy. Finally we present the results of a thorough experimental evaluation using real benchmark data sets demonstrating the practical utility (in terms of estimation accuracy and variance minimization under finite samples) of our proposals compared to other applicable approaches.

Cite

CITATION STYLE

APA

Guan, N., & Koudas, N. (2022). FILA: Online Auditing of Machine Learning Model Accuracy under Finite Labelling Budget. In Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 1784–1794). Association for Computing Machinery. https://doi.org/10.1145/3514221.3517904

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free