(2018) Causal Inference on Event Sequences.
Text: cute-budhathoki,vreeken.pdf (707kB)
Abstract
Given two discrete-valued time series (that is, event sequences) of length n, can we tell whether they are causally related? That is, can we tell whether x^n causes y^n, or whether y^n causes x^n? Can we do so without having to make assumptions about the distribution of these time series, or about the lag of the causal effect? And, importantly for practical application, can we do so accurately and efficiently? These are exactly the questions we answer in this paper. We propose a causal inference framework for event sequences based on information theory. We build upon the well-known notion of Granger causality, and define causality in terms of compression. We infer that x^n is likely a cause of y^n if y^n can be (much) better sequentially compressed given the past of both y^n and x^n than the other way around. To compress the data we use the notion of sequential normalized maximal likelihood, which means we use minimax optimal codes with respect to a parametric family of distributions. To show this works in practice, we propose CUTE, a linear-time method for inferring the causal direction between two event sequences. Empirical evaluation shows that CUTE works well in practice, is much more robust than transfer entropy, and ably reconstructs the ground truth on river flow and spike train data.
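The compression-based criterion described in the abstract can be illustrated with a small sketch. This is not the authors' CUTE implementation. It assumes binary sequences, a Bernoulli model class, and a fixed lag of one step for the conditioning context (the paper itself avoids assuming a lag). The function names `ml_log2`, `snml_codelength`, and `infer_direction` are hypothetical and chosen only for this illustration.

```python
import math
import random
from collections import defaultdict


def ml_log2(n0, n1):
    """log2 of the maximized Bernoulli likelihood for n0 zeros and n1 ones."""
    n = n0 + n1
    total = 0.0
    for k in (n0, n1):
        if k > 0:  # convention: 0 * log(0) = 0
            total += k * math.log2(k / n)
    return total


def snml_codelength(seq, contexts=None):
    """Sequential code length (in bits) of a binary sequence.

    Each symbol is coded with the sequential normalized maximum likelihood
    predictor for the Bernoulli family. If `contexts` is given, separate
    counts are kept per context value (an assumption of this sketch).
    """
    counts = defaultdict(lambda: [0, 0])
    bits = 0.0
    for t, sym in enumerate(seq):
        ctx = contexts[t] if contexts is not None else None
        n0, n1 = counts[ctx]
        # log2 ML likelihood of the past extended by a 0 and by a 1
        ext = (ml_log2(n0 + 1, n1), ml_log2(n0, n1 + 1))
        # normalizer computed in the log domain to avoid underflow
        m = max(ext)
        log_norm = m + math.log2(2 ** (ext[0] - m) + 2 ** (ext[1] - m))
        bits += log_norm - ext[sym]
        counts[ctx][sym] += 1
    return bits


def infer_direction(x, y):
    """Compare how much the other sequence's past helps compression."""
    # Assumption: the "past" of the other sequence is its previous symbol.
    x_prev = [0] + list(x[:-1])
    y_prev = [0] + list(y[:-1])
    gain_x_to_y = snml_codelength(y) - snml_codelength(y, x_prev)
    gain_y_to_x = snml_codelength(x) - snml_codelength(x, y_prev)
    if gain_x_to_y > gain_y_to_x:
        return "x->y"
    if gain_y_to_x > gain_x_to_y:
        return "y->x"
    return "undecided"


if __name__ == "__main__":
    random.seed(0)
    x = [random.randint(0, 1) for _ in range(500)]
    y = [0] + x[:-1]  # synthetic data: y copies x with a delay of one step
    print(infer_direction(x, y))
```

In this sketch the direction with the larger saving in code length is reported as the likely causal direction. Each sequence is processed in a single pass, but the fixed one-step context is a simplification made here and should not be read as part of the published method.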
Item Type: | Conference or Workshop Item (A Paper) (Paper) |
---|---|
Divisions: | Jilles Vreeken (Exploratory Data Analysis) |
Conference: | SDM SIAM International Conference on Data Mining |
Depositing User: | Jilles Vreeken |
Date Deposited: | 07 Jun 2019 06:57 |
Last Modified: | 10 May 2021 11:38 |
Primary Research Area: | NRA5: Empirical & Behavioral Security |
URI: | https://publications.cispa.saarland/id/eprint/2904 |