Group level strategy 2a questions

Hi Mike,

I hope you’ve been well during these strange times. This is a long one, so buckle up! I had some questions about applying the group-level strategy 2a to dB baseline-normalized LFP power from correct trials, during which 2 monkeys (Monkey A and B) performed a delayed oculomotor match-to-sample task testing visual working memory.

First, here is some background: Time-frequency analysis was implemented by convolving the LFP signal with a set of complex Morlet wavelets, defined as complex sine waves tapered by a Gaussian. The frequencies of the wavelets ranged from 4 Hz to 100 Hz in 35 logarithmically spaced steps. The full-width at half-maximum (FWHM) ranged from 400 ms to 100 ms with increasing wavelet peak frequency. This resulted in a spectral FWHM range of 2 Hz to 8 Hz.

The trial-averaged channel activity from each day’s recording session will represent a single subject.

Elsewhere, I have developed a qualitative model of monkey visual working memory, based on previous studies, that highlights a functional role for 3 main frequency bands: low, mid, and high. The boundaries of these bands varied by monkey, so I estimated the following for the model: 1-15 Hz for low, 12-35 Hz for mid, and 35 Hz+ for high.

In this work, I am trying to test whether each of the 3 windows corresponding to these bands shows increased or decreased dB baseline-normalized LFP power: in frontal and parietal regions for the low and mid bands, and in frontal regions for the high band. I assumed 3 one-sample directional t-tests on the correct trial-averaged channels, corrected for multiple comparisons by Bonferroni (p = 0.05 / 3 ≈ 0.017), would be acceptable. This is done separately for each monkey.
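A minimal sketch of the planned tests, with simulated values standing in for the band-averaged dB power (the per-band test directions here are placeholders, not my actual hypotheses):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05 / 3            # Bonferroni across the 3 band tests (~0.017)

# Simulated stand-in: dB baseline-normalized power averaged within each
# band's time-frequency window, one value per channel (the "subjects")
band_power = {'low':  rng.normal(1.0, 1.0, 20),
              'mid':  rng.normal(0.5, 1.0, 20),
              'high': rng.normal(-0.8, 1.0, 20)}

# Directional hypothesis per band (placeholder directions)
direction = {'low': 'greater', 'mid': 'greater', 'high': 'less'}

for band, x in band_power.items():
    # One-sample directional t-test against 0 dB (no change from baseline)
    t, p = stats.ttest_1samp(x, popmean=0, alternative=direction[band])
    print(f'{band:4s}: t = {t:6.2f}, p = {p:.4f}, significant: {p < alpha}')
```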


  1. When picking the window of interest for strategy 2a, the time scale is dictated by the model: each window spans the delay period of the task. However, I am not sure of the optimal approach to choosing the boundaries for the frequency bands. When inspecting the correct-trial data averaged over subjects (channels) across all days, it’s clear that Monkey B matches the bands I estimated for the model: low (4-12 Hz), mid (15-32 Hz), and high (38-90 Hz) (see Monkey B figure). However, Monkey A’s band boundaries are higher: low (4-18 Hz), mid (22-38 Hz), and high (47-90 Hz) (see Monkey A figure). Furthermore, I don’t think I can base the frequency band boundaries on these plots, because they illustrate what I’m testing (double dipping). Conversely, if I choose the estimated bands for the tests, it’s likely Monkey A’s windows will be incorrectly labeled as non-significant (Type II error), while Monkey B’s will be accurate. What would be a good approach here for defining the band boundaries? Should I try using error trials only, since I’m not testing those and they should be orthogonal? Or some other approach?

  2. When choosing the frequency band boundaries for low, mid, and high, I think they need to be sufficiently spaced to prevent leakage from neighboring bands due to the spectral smoothing introduced during convolution. Is this correct? If so, would it be appropriate to make the distance between two neighboring band boundaries at least as large as their respective empirical FWHMs? Or half of their FWHMs?

  3. One of the assumptions of the t-test is that the source data are independent. Averaging the data within windows that are sufficiently spaced along the time and frequency axes should ensure this is maintained within subjects. However, what about correlation between subjects? In my case, the subjects are channels. Within a day’s recording session, it’s likely that channels close to each other contain correlated neural activity. How would you suggest I address this?

Thanks for your time and assistance!


PS: The forum is only allowing me to upload one image since I’m a new user. I can provide Monkey B’s figure as well if needed.

Hi Bryan. Your main question is about individual differences in frequency ranges. It’s funny timing, because I just published a paper on exactly this topic. You can get the paper and code here:

But I wouldn’t say you need to take that approach; I’m also a big fan of visually guided analyses, as long as you make sure they are not done in a biased or circular way. A few ideas come to mind:

  1. Try the gedBounds method I linked above. It requires multichannel data, which it sounds like you have. I’d be happy to help you implement that method if you want.
  2. Pick the TF windows for each animal, first averaging over all conditions and channels (or at least, all relevant channels).
  3. Instead of picking frequency ranges, pick time ranges and then average the power data over that time range. That will give you a spectrum from 4-100 Hz (per condition/channel). You can then do a t-test or one-way ANOVA at each frequency (correcting for multiple comparisons). Then the hypothesis is that you will find different statistical results at three frequency ranges (low/mid/high) without having to specify a priori the exact boundaries.
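Here’s a rough sketch of that third option, with simulated numbers standing in for your data (the array shapes and the planted effect are just for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_freqs, n_chans = 35, 20

# Simulated stand-in for dB power averaged over the delay-period time range:
# rows = frequencies (4-100 Hz), columns = channels (the subjects)
spec = rng.normal(0, 1, (n_freqs, n_chans))
spec[5:10] += 1.5                       # planted effect in a "low" band

# t-test against 0 dB at every frequency, Bonferroni-corrected across frequencies
t, p = stats.ttest_1samp(spec, popmean=0, axis=1)
sig = p < 0.05 / n_freqs

freqs = np.logspace(np.log10(4), np.log10(100), n_freqs)
print('significant frequencies (Hz):', np.round(freqs[sig], 1))
```

Contiguous runs of significant frequencies then define the low/mid/high ranges empirically, without fixing the boundaries a priori.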

For your second question: Yeah, that depends on how you set up the analysis. If you use hand-picked neighboring frequency bands, leakage could decrease the sensitivity of the results. I think spacing the frequency bands by a few Hz would be fine. You could do it algorithmically, for example, by having the upper bound of one band be 1 FWHM away from the lower bound of the next. If you try #3 above, then it doesn’t matter. But I wouldn’t be too concerned about this: there will only be a small amount of leakage near the frequency boundaries, not in the bulk of the ranges.
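As a quick check of that spacing rule (the FWHM-vs-frequency interpolation below is a rough linear stand-in based on the 2–8 Hz spectral FWHM range you quoted, not your empirical values):

```python
import numpy as np

def fwhm_at(f):
    """Rough stand-in: spectral FWHM interpolated linearly between
    ~2.2 Hz at 4 Hz and ~8.8 Hz at 100 Hz (assumed, not measured)."""
    return np.interp(f, [4, 100], [2.2, 8.8])

def gaps_ok(edges):
    """Given band edges [(lo1, hi1), (lo2, hi2), ...], check whether each
    inter-band gap is at least one spectral FWHM (taken at the lower
    edge of the next band, per the suggestion above)."""
    return [(lo2 - hi1) >= fwhm_at(lo2)
            for (lo1, hi1), (lo2, hi2) in zip(edges, edges[1:])]

# Monkey B's observed bands from the original post
print(gaps_ok([(4, 12), (15, 32), (38, 90)]))   # → [True, True]
```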

For your third question: How many channels and how closely spaced are they? Correlation across channels is actually useful information that you can leverage to boost SNR while also reducing data dimensionality. My approach would be to start from a components analysis to create one component per recording session. You can see an example of this here:

Otherwise, the problem with correlated sampling isn’t so much about the validity of the t-test, it’s about the generalizability. Your assumption here is that you are randomly sampling from circuits in one animal, so the appropriate generalization is to neural populations in that monkey (and then for each monkey).
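The core of a GED-based components analysis is just a generalized eigendecomposition of two covariance matrices. Here’s a toy sketch with simulated data (not the code from the linked example; the mixing weights and segment setup are invented for illustration):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
n_ch, n_t = 8, 5000
t = np.arange(n_t) / 1000                      # 1 kHz, 5 s of toy data

# One shared 10 Hz source mixed into 8 channels plus independent noise;
# a noise-only segment serves as the reference data
mix = np.linspace(0.5, 2.0, n_ch)              # arbitrary mixing weights
sig_data = np.outer(mix, np.sin(2 * np.pi * 10 * t)) + rng.normal(0, 1, (n_ch, n_t))
ref_data = rng.normal(0, 1, (n_ch, n_t))

S = np.cov(sig_data)                           # "signal" covariance
R = np.cov(ref_data)                           # reference covariance
R += 1e-6 * np.trace(R) / n_ch * np.eye(n_ch)  # light regularization

evals, evecs = eigh(S, R)                      # generalized eigendecomposition
w = evecs[:, -1]                               # spatial filter, largest eigenvalue
component = w @ sig_data                       # one time series for the session

# The component should recover the shared 10 Hz source
r = np.corrcoef(component, np.sin(2 * np.pi * 10 * t))[0, 1]
print(f'|correlation| with true source: {abs(r):.2f}')
```

In your case, S would come from the data of interest (e.g., delay-period activity) and R from a reference (e.g., baseline), giving one component per session instead of many correlated channels.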

Thank you Mike. This is excellent and quite serendipitous that you just published on the topic.

First to answer your question:

Two microdrive arrays were used in frontal and parietal regions. Monkey A’s microdrives contained 8 channels (Gray, Goodell and Lear, 2007) and Monkey B’s was a later, improved model containing 32 channels. The total number of channels varied by day based on the quality of data recorded. The technicians got better as time went on, so the earlier days only had 3 or 4 channels of data, while the later days had a maximum of about 8 channels in Monkey A and 23 channels in Monkey B. Based on figures in the 2007 paper, it looks like the 8-channel model’s electrodes were spaced about 10 mm apart. Based on the website for the newer 32-channel model, the electrodes are only 0.8-1.5 mm apart.

Second, in response to your suggestions:

The recommendation to use GED is quite appropriate and interesting! However, based on my review of the two papers you provided and a quick glance at the GitHub code, I don’t think my timeframe permits me to learn the technique right now. This is for a chapter in my dissertation, which I have to finish next week. So I think I will go with your second recommendation, which addresses the double dipping. I would like to try the GED approach at a later time, when I can focus on the details, and would most likely reach back out for clarifications.

Separately, regarding the correlation across channels, what do you think about just averaging the dB baseline-normalized power over all frontal channels and all parietal channels for each session? This would leave me with two signals (subjects) per day: one frontal and one parietal. Then I can pick the windows using your second recommendation and conduct my t-tests. The df will go down significantly, but there will no longer be the question of correlated neural activity, since frontal and parietal cortices are regionally distinct and widely separated. It also fits the hypotheses better, because they are about regional (frontal or parietal) activity, not areal/channel-level activity. Again, I would still like to apply GED and go the component route at a later date when more time is available.

Thanks for your valuable input!

I see, then sticking with the electrode-level analyses is best. There are always new data analysis methods coming out, and the question is whether the methods you are using are incorrect vs. whether some fancier method could possibly give better results after a month of testing (the answer might be negative: new methods are not guaranteed to be better). So yeah, focus on your dissertation for now :wink:

Averaging over the channels is a good idea. In fact, channel averaging is already a spatial filter, where all of the weights are 1/M. Just be mindful to exclude any channels that are dead, particularly noisy, or otherwise too different from the rest to be averaged (e.g., if one channel is in a different brain region or in the white matter).
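To make the spatial-filter view concrete (toy data; the excluded channel is illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n_ch, n_t = 8, 1000
X = rng.normal(0, 1, (n_ch, n_t))   # channels x time (toy stand-in for one region's LFP power)

good = np.ones(n_ch, dtype=bool)
good[2] = False                      # pretend channel 2 is dead: exclude it

w = np.zeros(n_ch)
w[good] = 1 / good.sum()             # uniform weights 1/M over the M good channels

regional_avg = w @ X                 # identical to X[good].mean(axis=0)
```

Writing the average as a weight vector makes the connection to GED explicit: a component analysis just replaces the uniform 1/M weights with data-driven ones.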

Success with the dissertation!
