(2022) Discovering Significant Patterns under Sequential False Discovery Control.
Text: spass-dalleiger,vreeken.pdf - Accepted Version (860kB)
Abstract
We are interested in discovering those patterns from data whose empirical frequency is significantly different than expected. To avoid spurious results, yet achieve high statistical power, we propose to sequentially control for false discoveries during the search. To avoid redundancy, we propose to update our expectations whenever we discover a significant pattern. To efficiently consider the exponentially sized search space, we employ an easy-to-compute upper bound on significance, and propose an effective search strategy for sets of significant patterns. Through an extensive set of experiments on synthetic data, we show that our method, Spass, recovers the ground truth reliably, efficiently, and without redundancy. On real-world data we show that it works well on both single and multiple classes and on low- and high-dimensional data, and, through case studies, that it discovers meaningful results.
| Item Type: | Conference or Workshop Item (A Paper) (Paper) | 
|---|---|
| Divisions: | Jilles Vreeken (Exploratory Data Analysis) | 
| Conference: | KDD ACM International Conference on Knowledge Discovery and Data Mining | 
| Depositing User: | Sebastian Dalleiger | 
| Date Deposited: | 15 Jul 2022 10:40 | 
| Last Modified: | 15 Jul 2022 10:40 | 
| Primary Research Area: | NRA1: Trustworthy Information Processing | 
| URI: | https://publications.cispa.saarland/id/eprint/3726 | 