Researchers are grappling with how to identify cause and effect in heterogeneous time-series data, where a single model cannot account for the different causal mechanisms at work. Most traditional methods for causal discovery from this kind of data presume a single causal structure across the entire dataset. Real-world data, however, is often complex and multimodal, so that presumption can be an oversimplification.
Criteria such as Granger causality test whether one series helps predict another, which is not the same as true causation. Structural Causal Models (SCMs) provide a more comprehensive framework, but methods built on them often presume linear relationships and a consistent causal structure. Some newer techniques are more flexible but still have limitations: they presume a single causal graph and focus largely on independent samples, leaving a gap in handling the temporal dependencies of time-series data.
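To make the Granger limitation concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not taken from the researchers' work: the helper names `lagged_rss` and `granger_f_stat`, the one-lag setup, and the simulated data are all assumptions. A hidden driver `z` feeds `x` one step before it feeds `y`, so `x` helps predict `y` even though `x` has no causal effect on `y`.

```python
import numpy as np

def lagged_rss(target, predictors):
    """Residual sum of squares from regressing target[t] on predictors[t-1] plus an intercept."""
    X = np.column_stack([np.ones(len(target) - 1)] + [p[:-1] for p in predictors])
    y = target[1:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f_stat(x, y):
    """One-lag F-statistic for the hypothesis 'x Granger-causes y'."""
    rss_restricted = lagged_rss(y, [y])   # y predicted from its own past only
    rss_full = lagged_rss(y, [y, x])      # y predicted from its own past and x's past
    n = len(y) - 1                        # number of regression rows
    return (rss_restricted - rss_full) / (rss_full / (n - 3))

rng = np.random.default_rng(0)
T = 2000
z = rng.normal(size=T)                    # hidden common driver (never observed)
x = np.zeros(T)
y = np.zeros(T)
x[1:] = z[:-1] + 0.1 * rng.normal(size=T - 1)   # x_t depends on z_{t-1}
y[2:] = z[:-2] + 0.1 * rng.normal(size=T - 2)   # y_t depends on z_{t-2}, not on x

f_xy = granger_f_stat(x, y)   # large: x "Granger-causes" y despite no causal link
f_yx = granger_f_stat(y, x)   # small: y's past does not help predict x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.2f}")
```

In this toy setup the test flags x as a cause of y purely because both inherit the same hidden signal at different delays, which is the kind of case where predictive precedence and causation come apart.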
To address this challenge, researchers from UCSD propose a method called Mixture Causal Discovery (MCD). The approach presumes that the data is generated by a mixture of unknown SCMs, and it learns both the complete SCMs and, for each time-series sample, which SCM generated it. Two variants of MCD are presented: a linear model, and a nonlinear model that uses neural networks to model functional relationships and history-dependent noise.
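The mixture assumption can be sketched with a toy example in Python. This is not the MCD algorithm itself: MCD learns the SCMs and the sample memberships together from data, whereas the sketch below assumes the two graphs are already known and only shows how a likelihood can attribute each time series to the SCM that generated it. The matrices `W1` and `W2`, the lag-1 linear form, the Gaussian noise, and the helper names are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical lag-1 linear SCMs over 3 variables: x_t = x_{t-1} @ W + noise.
# W[i, j] != 0 means variable i at time t-1 drives variable j at time t.
W1 = np.array([[0.0, 0.8, 0.0],
               [0.0, 0.0, 0.8],
               [0.0, 0.0, 0.0]])   # chain 0 -> 1 -> 2
W2 = W1.T.copy()                   # reversed chain 2 -> 1 -> 0
graphs = [W1, W2]

def simulate(W, T=50, noise_std=0.5):
    """Generate one time series of length T from the linear SCM with weights W."""
    x = np.zeros((T, W.shape[0]))
    x[0] = rng.normal(scale=noise_std, size=W.shape[0])
    for t in range(1, T):
        x[t] = x[t - 1] @ W + rng.normal(scale=noise_std, size=W.shape[0])
    return x

def log_likelihood(x, W, noise_std=0.5):
    """Gaussian log-likelihood of a series under W, up to a constant shared by both SCMs."""
    resid = x[1:] - x[:-1] @ W
    return -0.5 * np.sum(resid ** 2) / noise_std ** 2

# Each sample is drawn from one of the two SCMs; the label is hidden from the scorer.
true_labels = rng.integers(0, 2, size=40)
samples = [simulate(graphs[k]) for k in true_labels]

# Attribute each sample to the SCM under which it is most likely.
scores = np.array([[log_likelihood(x, W) for W in graphs] for x in samples])
assigned = scores.argmax(axis=1)
accuracy = float((assigned == true_labels).mean())
print(f"membership accuracy: {accuracy:.2f}")
```

Fitting a single graph to the pooled samples would blend the two opposing chains, which is the failure mode a mixture formulation is meant to avoid.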
By dropping the presumption of a single causal model for the entire dataset, MCD represents a significant step forward in causal discovery for heterogeneous time-series data. It infers the multiple SCMs and the SCM each sample originated from at the same time.
MCD performed well on both synthetic datasets and real-world scenarios. In a real-world application, the researchers recovered two distinct causal graphs that reflected significant sector interactions and identified important market events. Ultimately, MCD offers a solution to the challenge of causal discovery in complex, multimodal real-life settings.
The researchers stressed that MCD is a flexible framework capable of incorporating various likelihood-based causal structure learning algorithms, offering a more comprehensive and accurate approach to understanding causal connections in diverse datasets.