I’m trying to find out how stationarity (or the lack of it) affects Granger causality estimates for EEG data. I know it is generally not recommended to run GC analysis on non-stationary data, but I am trying to find out more about why. For example, if I compute GC estimates on non-stationary data, could this lead to Type I errors? If so, how exactly does that happen? Could running GC analysis on non-stationary data systematically affect one condition more than the other, giving the appearance of condition differences?

Hi Ira. The thing about GC is that it isn’t actually computed from the data; it’s computed from the autocovariance matrix. So the GC result depends entirely on the quality of the autocovariance matrix. And the elements in the autocovariance matrix are statistical estimates from a model fit, not actual measured data points.

The point of all this is that GC depends on the ability to estimate parameters from data. If the data are excessively noisy, too variable, or too nonstationary, then the autoregressive coefficients cannot be accurately (reliably) estimated, which means the GC results are likely to be unreliable.
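To make that dependence concrete, here is a minimal sketch of time-domain GC between two channels, written in Python with NumPy. It is an illustration under stated assumptions, not the code from the book or from the MVGC toolbox: the function names (`lagged_design`, `residual_variance`, `granger_xy`), the model order, and the simulated signals are all made up for this example, and the data are assumed to be roughly zero-mean. GC here is the log ratio of two prediction-error variances, each of which comes from a least-squares autoregressive fit.

```python
import numpy as np

def lagged_design(data, order):
    """Stack lags 1..order of every channel into a regression design matrix.

    data has shape (n_channels, n_samples); the result has shape
    (n_samples - order, n_channels * order).
    """
    n_samples = data.shape[1]
    cols = [data[:, order - k : n_samples - k].T for k in range(1, order + 1)]
    return np.hstack(cols)

def residual_variance(target, predictors, order):
    """Variance of the least-squares AR prediction error for `target`."""
    X = lagged_design(predictors, order)
    y = target[order:]
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coefs)

def granger_xy(x, y, order):
    """GC from x to y: ln(error variance without x / error variance with x)."""
    restricted = residual_variance(y, y[np.newaxis, :], order)
    full = residual_variance(y, np.vstack([x, y]), order)
    return np.log(restricted / full)

# Hypothetical simulated signals in which x drives y but not the reverse.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

gc_x_to_y = granger_xy(x, y, order=2)
gc_y_to_x = granger_xy(y, x, order=2)
print(f"GC x->y: {gc_x_to_y:.3f}   GC y->x: {gc_y_to_x:.3f}")
```

Everything GC reports comes out of the two `lstsq` fits, so anything that makes those coefficient estimates unstable (noise, too few samples, nonstationarity within the window) feeds directly into the GC value.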

I’m not sure if the nonstationarity biases the results in a systematic way, but it is certainly adding uncertainty to the results.

So in terms of obtaining reliable GC estimates that you can be confident in (about your second point): if lots of the data isn’t stationary after doing basic things like detrending and mean subtraction, is it simply not appropriate to apply GC analysis? I know in your book you suggest not interpreting the data during such segments, but this isn’t feasible if lots of the data isn’t stationary. Is there anything else that can be done to improve the reliability, assuming the data was also thoroughly cleaned?

Yeah, that’s a good discussion point. In economics, people will often run statistical tests (e.g., a unit root test) to determine stationarity. But that’s with a very small number of time series. In neuroscience, we have hundreds or thousands of data segments, so it’s really not feasible to test them all. (Actually, I tried this a long time ago out of curiosity… it took a while to run and most data segments were considered stationary.)
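For anyone curious what such a test involves, below is a rough Python/NumPy sketch of an augmented Dickey-Fuller style regression. It is a hand-rolled illustration, not the test from any particular toolbox: the function name `adf_tstat`, the choice of four lagged differences, and the simulated signals are assumptions for this example. The threshold of -2.86 is the usual asymptotic 5% critical value for the version of the test with a constant and no trend. In practice a maintained implementation (for example `adfuller` in statsmodels) would be the safer choice.

```python
import numpy as np

def adf_tstat(x, n_lags=4):
    """t-statistic on x[t-1] in the regression
    dx[t] = c + rho * x[t-1] + sum_k b_k * dx[t-k] + e[t].

    Strongly negative values argue against a unit root.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    n = dx.size
    y = dx[n_lags:]
    # Regressors: constant, previous level, then n_lags lagged differences.
    cols = [np.ones(n - n_lags), x[n_lags:-1]]
    cols += [dx[n_lags - k : n - k] for k in range(1, n_lags + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (y.size - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

CRITICAL_5PCT = -2.86  # asymptotic value, constant-only case

# Hypothetical test signals: a stationary AR(1) process and a random walk.
rng = np.random.default_rng(1)
noise = rng.standard_normal(1000)
stationary = np.zeros(1000)
for t in range(1, 1000):
    stationary[t] = 0.5 * stationary[t - 1] + noise[t]
random_walk = np.cumsum(rng.standard_normal(1000))

t_stationary = adf_tstat(stationary)
t_walk = adf_tstat(random_walk)
print("stationary AR(1):", t_stationary, "reject unit root:", t_stationary < CRITICAL_5PCT)
print("random walk:     ", t_walk, "reject unit root:", t_walk < CRITICAL_5PCT)
```

Running something like this over every channel, trial, and time window is where the cost comes from: it is one regression per segment, and with thousands of segments some will fail by chance at any fixed threshold.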

So the typical approach is to apply some appropriate precautions (z-score, detrend, avoid ERPs) and hope for the best. I think in many cases it’s probably not such a big deal. In general, most statistical analyses and estimation procedures are reasonably robust to minor violations of data assumptions.
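As a rough illustration of those precautions, here is a Python/NumPy sketch that detrends and z-scores every trial and channel, with an optional step that subtracts the trial-averaged ERP from each trial. The function name `preprocess_trials`, the `(n_trials, n_channels, n_samples)` layout, and the ERP subtraction as one way of "avoiding ERPs" are assumptions for this example; choosing time windows without strong evoked responses is another reading.

```python
import numpy as np

def preprocess_trials(data, subtract_erp=True):
    """Detrend and z-score each trial/channel; optionally remove the ERP.

    data has shape (n_trials, n_channels, n_samples).
    """
    out = np.asarray(data, dtype=float).copy()
    if subtract_erp:
        # The ERP is the across-trial average; removing it leaves the
        # non-phase-locked part of each trial.
        out -= out.mean(axis=0, keepdims=True)
    # Least-squares linear detrend along time (time axis centered at zero).
    t = np.arange(out.shape[-1], dtype=float)
    t -= t.mean()
    slope = (out * t).sum(axis=-1, keepdims=True) / (t @ t)
    out -= slope * t
    # z-score: zero mean and unit variance within each trial and channel.
    out -= out.mean(axis=-1, keepdims=True)
    out /= out.std(axis=-1, keepdims=True)
    return out

# Hypothetical data: 20 trials, 3 channels, 500 samples, with an added
# linear drift and an ERP-like bump that is identical on every trial.
rng = np.random.default_rng(2)
raw = rng.standard_normal((20, 3, 500))
raw += np.linspace(0.0, 5.0, 500)
raw += 3.0 * np.exp(-0.5 * ((np.arange(500) - 150) / 20.0) ** 2)

clean = preprocess_trials(raw)
print(clean.mean(axis=-1).round(6)[0], clean.std(axis=-1).round(6)[0])
```

None of this guarantees stationarity. It only removes the most obvious sources of it (drift, offsets, scale differences, and the phase-locked response) before the model is fit.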

Thanks! One other thing that makes me hesitant about the result is that, for my data, there is little convergence between GC estimates derived from code similar to yours and those from the MVGC toolbox functions. Using the sample data you provide with your book gives much more similar, albeit not identical, results between the two methods, which suggests that the problem is with my data rather than with the code.

Anyway, I am struggling to determine why this would be, as I’ve plotted the VAR model fitted data for each segment against the real data, and for both methods it looks like the model is fitting the data almost identically. I’m sorry if this is a tall order given you don’t have any of my data, but based on my description do you have any suggestions on why I might be failing to replicate results between methods for my data? Or perhaps more generally, how might I check the stability of my model fit? I think there are some formal tests of this within the BSMART toolbox. I think there is also a ‘unit root’ test built into the MVGC toolbox, so I can certainly look at the outputs for that too, but I’m not sure how relevant these tests are for neuroscience data.
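On the question of checking model stability, one common diagnostic is the spectral radius of the fitted VAR model: stack the coefficient matrices into a companion matrix and take the largest eigenvalue magnitude. A value below 1 means the fitted model is stable; a value at or above 1 points to a unit root or explosive dynamics. The Python/NumPy sketch below illustrates the idea. The function names (`fit_var`, `spectral_radius`) and the simulated data are assumptions for this example, and this is not a claim about how BSMART or MVGC implement their checks.

```python
import numpy as np

def fit_var(data, order):
    """Least-squares VAR fit.

    data has shape (n_channels, n_samples). Returns coefficients A with
    shape (order, n_channels, n_channels), where
    x[t] = sum_k A[k-1] @ x[t-k] + noise.
    """
    n_ch, n = data.shape
    X = np.hstack([data[:, order - k : n - k].T for k in range(1, order + 1)])
    Y = data[:, order:].T
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B.T.reshape(n_ch, order, n_ch).transpose(1, 0, 2)

def spectral_radius(A):
    """Largest eigenvalue magnitude of the VAR companion matrix."""
    order, n_ch, _ = A.shape
    companion = np.zeros((n_ch * order, n_ch * order))
    companion[:n_ch, :] = np.hstack(list(A))
    companion[n_ch:, :-n_ch] = np.eye(n_ch * (order - 1))
    return np.max(np.abs(np.linalg.eigvals(companion)))

# Hypothetical two-channel data generated by a stable VAR(1).
rng = np.random.default_rng(3)
true_A = np.array([[0.5, 0.0], [0.4, 0.5]])
data = np.zeros((2, 2000))
for t in range(1, 2000):
    data[:, t] = true_A @ data[:, t - 1] + rng.standard_normal(2)

A_hat = fit_var(data, order=3)
rho = spectral_radius(A_hat)
print("spectral radius of fitted model:", rho)
```

A good-looking overlay of fitted and real data does not rule out a spectral radius close to 1, so computing this per segment can flag fits that look fine visually but are numerically fragile.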

If you’re doing anything more than the basic GC analysis, I’d recommend going with the MVGC toolbox. They have more sophisticated methods than what I presented in the book, which was mainly to teach how GC works.

That said, it is a bit strange to get wildly different results with one dataset and very similar results with another dataset. Are you using the same parameters in both cases? And are you using multivariate GC (which I do not show in the book)?
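To show why pairwise (bivariate) and multivariate (conditional) GC can disagree on the same data, here is a small Python/NumPy simulation. It is a sketch under assumptions: the function names, the model order, and the three simulated channels are invented for this example and are not taken from the book or the MVGC toolbox. A hidden common driver z feeds x at lag 1 and y at lag 2, so x appears to drive y in a pairwise model, but that influence largely disappears once z is included.

```python
import numpy as np

def lagged_design(data, order):
    """Lags 1..order of each row of data (n_channels, n_samples)."""
    n = data.shape[1]
    return np.hstack([data[:, order - k : n - k].T for k in range(1, order + 1)])

def residual_variance(target, predictors, order):
    """Variance of the least-squares AR prediction error for `target`."""
    X = lagged_design(predictors, order)
    y = target[order:]
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coefs)

def granger(source, target, order, conditioning=None):
    """GC source -> target, optionally conditioned on other channels.

    conditioning, if given, has shape (n_other_channels, n_samples).
    """
    base = [target] if conditioning is None else [target, *conditioning]
    restricted = residual_variance(target, np.vstack(base), order)
    full = residual_variance(target, np.vstack(base + [source]), order)
    return np.log(restricted / full)

# Hypothetical data: z drives x (lag 1) and y (lag 2); x never drives y.
rng = np.random.default_rng(4)
n = 3000
z = np.zeros(n)
x = np.zeros(n)
y = np.zeros(n)
for t in range(2, n):
    z[t] = 0.7 * z[t - 1] + rng.standard_normal()
    x[t] = 0.8 * z[t - 1] + rng.standard_normal()
    y[t] = 0.8 * z[t - 2] + rng.standard_normal()

pairwise = granger(x, y, order=3)
conditional = granger(x, y, order=3, conditioning=z[np.newaxis, :])
print(f"pairwise GC x->y: {pairwise:.3f}   conditional on z: {conditional:.3f}")
```

With many EEG channels that share sources, this kind of difference between the two measures is expected, so they should not be compared value for value.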

The unit root test is for stationarity of the signal within each window.

Yes, I was using the same parameters, and yes, I will be estimating multivariate GC with the toolbox, so perhaps this explains the differences. Sorry, I hadn’t realised they were separate measures!