# (Very) large clusters

Dear Mike!

Hope you are feeling fantastic!

I am analysing a time-frequency dataset that contains 2 categorical and 3 scalar independent variables (all manipulated within subject) using cluster-based permutation tests (in FieldTrip, across all channels, time points, and frequencies from 4 to 30 Hz). For the scalar variables, I’m simply computing the linear coefficient of power across 10 bins of each scalar variable, and then running the permutation test to compare the linear coefficient maps against maps of zeros (sorry for the very brief explanation, happy to elaborate further; I’ve taken this method from https://www.jneurosci.org/content/35/4/1458).
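For readers who want to see the slope step concretely, here is a minimal sketch in Python with NumPy. It is not the FieldTrip pipeline from the post: the function names `slope_maps` and `t_against_zero`, the array layout, and the synthetic data are all assumptions made for illustration. It fits a least-squares line across the 10 bins at every channel/frequency/time point for each subject, then computes a one-sample t-map of those slopes against zero. Cluster formation and the permutation step are left out.

```python
import numpy as np


def slope_maps(power, bin_values):
    """Least-squares slope of power across bins at every data point.

    power      : array of shape (n_subjects, n_bins, ...), where the trailing
                 axes could be channels x frequencies x time points
                 (assumed layout, for illustration only)
    bin_values : 1-D array of length n_bins (the regressor)
    Returns an array of shape (n_subjects, ...), one slope map per subject.
    """
    x = np.asarray(bin_values, dtype=float)
    xc = x - x.mean()
    # slope = sum(xc * y) / sum(xc ** 2); y need not be centred
    # because the centred regressor xc sums to zero.
    return np.tensordot(xc, power, axes=([0], [1])) / np.sum(xc ** 2)


def t_against_zero(slopes):
    """One-sample t-value of the slopes against zero, across subjects."""
    n_subjects = slopes.shape[0]
    sem = slopes.std(axis=0, ddof=1) / np.sqrt(n_subjects)
    return slopes.mean(axis=0) / sem


# Synthetic data: 12 subjects, 10 bins, 4 channels x 5 frequencies x 6 times.
rng = np.random.default_rng(0)
bins = np.arange(1, 11)
power = rng.normal(size=(12, 10, 4, 5, 6)) + 0.5 * bins[None, :, None, None, None]

slopes = slope_maps(power, bins)   # shape (12, 4, 5, 6)
tmap = t_against_zero(slopes)      # shape (4, 5, 6)
```

The t-map is only the input to the cluster step. Thresholding it and building the permutation distribution would still be done by the cluster-based test itself.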

However, for some comparisons the significant clusters I find are very large and difficult to interpret…

Here are options that I’ve thought about.

• Setting the cluster alpha to a smaller value, which sometimes results in smaller clusters that are more interpretable. The problem is that if I apply the lower cluster alpha to all comparisons, some of the meaningful clusters are no longer detected. Alternatively, applying the lower cluster alpha to only some comparisons seems inconsistent (how would I justify using different cluster alphas?).

• I could change the time interval that I’m looking at. The outcome here is mixed: sometimes the clusters are smaller and more interpretable, but in other comparisons some of the ‘meaningful’ clusters then fall outside the time window.

I wonder if you could help me find better ways to approach this.

Best

Nareg

Hi Nareg. If the clusters are real (that is, not due to an artifact in the data or a bug in the code), then I don’t think you need to play around with the threshold. The data are what the data are, and if there is a large effect, then that’s what there is. I understand the point you’re making, but adjusting statistical thresholds to get the result you’re looking for gets into dangerous statistical territory.

Perhaps an alternative is to show multiple statistical thresholds. For example, you could outline the p<.05 regions using a black contour line, and the p<.01 regions using a red contour line (or solid vs dashed lines…). That would allow you to illustrate various levels of significance while avoiding the difficulties of adapting the threshold for different analyses.
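As a rough illustration of the contour idea, here is a minimal Python sketch. Everything in it is an assumption for demonstration purposes: the p-value map is synthetic, and `threshold_masks` and `plot_nested_contours` are hypothetical helper names, not FieldTrip functions. The plotting helper uses matplotlib's `pcolormesh` and `contour`, and imports matplotlib only when called.

```python
import numpy as np


def threshold_masks(pvals, levels=(0.05, 0.01)):
    """Return {threshold: boolean mask of points with p below that threshold}."""
    pvals = np.asarray(pvals)
    return {level: pvals < level for level in levels}


def plot_nested_contours(effect, pvals, times, freqs,
                         levels=(0.05, 0.01), colors=("black", "red")):
    """Draw the effect map and outline each threshold in its own colour."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    fig, ax = plt.subplots()
    ax.pcolormesh(times, freqs, effect, shading="auto")
    for level, color in zip(levels, colors):
        mask = (np.asarray(pvals) < level).astype(float)
        # Contouring the 0/1 mask at 0.5 traces the region boundary.
        ax.contour(times, freqs, mask, levels=[0.5], colors=color)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Frequency (Hz)")
    return fig


# Synthetic time-frequency map with one smooth "effect" centred near 12 Hz.
times = np.linspace(0.0, 1.0, 50)
freqs = np.linspace(4.0, 30.0, 27)
T, F = np.meshgrid(times, freqs)
effect = np.exp(-((T - 0.5) ** 2 / 0.1 + (F - 12.0) ** 2 / 50.0))
pvals = 1.0 - 0.999 * effect  # made-up p-values: small where the effect is large

masks = threshold_masks(pvals)
```

Calling `fig = plot_nested_contours(effect, pvals, times, freqs)` followed by `fig.savefig("contours.png")` would write the figure to disk. Because the same two thresholds are drawn on every comparison, no single analysis needs its own custom threshold.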


Thank you for your response Mike.
I will try overlaying clusters with the different cluster thresholds.