SEAL: Interactive Tool for Systematic Error Analysis and Labeling


Abstract

With the advent of Transformers, large language models (LLMs) have saturated well-known NLP benchmarks and leaderboards with high aggregate performance. However, these models often fail systematically on tail data or rare groups that are not apparent in aggregate evaluation. Identifying such problematic data groups is even harder when explicit labels (e.g., ethnicity, gender) are unavailable, and the difficulty is compounded for NLP datasets, which lack the visual features that characterize failure modes in vision data (e.g., Asian males, animals indoors, waterbirds on land). This paper introduces SEAL, an interactive tool for Systematic Error Analysis and Labeling that takes a two-step approach: it first identifies high-error slices of data, and then applies methods that give those under-performing slices human-understandable semantics. We explore a variety of methods for producing coherent semantics for the error groups, using language models for semantic labeling and a text-to-image model for generating visual features. The SEAL toolkit and a demo screencast are available at https://huggingface.co/spaces/nazneen/seal.
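The two-step workflow the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a sentence-transformers encoder and k-means clustering for slice discovery, and for the labeling step it only constructs the prompt one might send to a language model; the model name and the `labeling_prompt` helper are hypothetical choices for this sketch.

```python
# Minimal sketch of SEAL's two-step idea (not the authors' code):
# Step 1: cluster a model's misclassified examples into candidate error slices.
# Step 2: build a prompt asking a language model to name each slice.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def find_error_slices(texts, labels, preds, n_slices=5):
    """Group misclassified examples into candidate high-error slices."""
    errors = [t for t, y, p in zip(texts, labels, preds) if y != p]
    if len(errors) < n_slices:  # too few errors to cluster meaningfully
        return [errors] if errors else []
    embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(errors)
    cluster_ids = KMeans(n_clusters=n_slices, n_init=10,
                         random_state=0).fit_predict(embeddings)
    return [[t for t, c in zip(errors, cluster_ids) if c == k]
            for k in range(n_slices)]

def labeling_prompt(slice_texts, k=10):
    """Prompt that an instruction-following LM could complete with a label."""
    examples = "\n".join(f"- {t}" for t in slice_texts[:k])
    return ("The following examples were all misclassified by a model.\n"
            f"{examples}\n"
            "In a few words, what do these examples have in common?")
```

In use, each slice returned by `find_error_slices` would be passed through `labeling_prompt` and sent to a language model of choice; SEAL exposes this kind of slice discovery and labeling interactively in the Hugging Face Space linked above.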

Citation (APA)

Rajani, N., Liang, W., Chen, L., Mitchell, M., & Zou, J. (2022). SEAL: Interactive Tool for Systematic Error Analysis and Labeling. In EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Demonstrations Session (pp. 359–370). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-demos.36
