(2022) Discovering Significant Patterns under Sequential False Discovery Control.
Text: spass-dalleiger,vreeken.pdf - Accepted Version (860kB)
Abstract
We are interested in discovering those patterns from data whose empirical frequency differs significantly from what is expected. To avoid spurious results, yet achieve high statistical power, we propose to sequentially control for false discoveries during the search. To avoid redundancy, we propose to update our expectations whenever we discover a significant pattern. To efficiently consider the exponentially sized search space, we employ an easy-to-compute upper bound on significance, and propose an effective search strategy for sets of significant patterns. Through an extensive set of experiments on synthetic data, we show that our method, Spass, recovers the ground truth reliably, efficiently, and without redundancy. On real-world data we show that it works well on both single and multiple classes and on low- and high-dimensional data, and through case studies that it discovers meaningful results.
Item Type: | Conference or Workshop Item (A Paper) (Paper) |
---|---|
Divisions: | Jilles Vreeken (Exploratory Data Analysis) |
Conference: | KDD ACM International Conference on Knowledge Discovery and Data Mining |
Depositing User: | Sebastian Dalleiger |
Date Deposited: | 15 Jul 2022 10:40 |
Last Modified: | 15 Jul 2022 10:40 |
Primary Research Area: | NRA1: Trustworthy Information Processing |
URI: | https://publications.cispa.saarland/id/eprint/3726 |