(2022) Label-Descriptive Patterns and their Application to Characterizing Classification Errors.
Text: premise-hedderich,fischer,klakow,vreeken.pdf (528kB)
Abstract
State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors, but also gives a way to act and improve the classifier. We propose to discover those feature-value combinations (i.e. patterns) that strongly correlate with correct resp. erroneous predictions to obtain a global and interpretable description for arbitrary classifiers. We show this is an instance of the more general label description problem, which we formulate in terms of the Minimum Description Length principle. To discover a good pattern set, we develop the efficient PREMISE algorithm. Through an extensive set of experiments we show it performs very well in practice on both synthetic and real-world data. Unlike existing solutions, it ably recovers ground truth patterns, even on highly imbalanced data over many features. Through two case studies on Visual Question Answering and Named Entity Recognition, we confirm that PREMISE gives clear and actionable insight into the systematic errors made by modern NLP classifiers.
Item Type: | Conference or Workshop Item (A Paper) (Paper) |
---|---|
Divisions: | Jilles Vreeken (Exploratory Data Analysis) |
Conference: | ICML International Conference on Machine Learning |
Depositing User: | Sebastian Dalleiger |
Date Deposited: | 15 Jul 2022 10:32 |
Last Modified: | 15 Jul 2022 10:38 |
Primary Research Area: | NRA1: Trustworthy Information Processing |
URI: | https://publications.cispa.saarland/id/eprint/3727 |