Signal to noise ratio

Does anyone know how to compute the signal-to-noise ratio for each subject after cleaning the EEG signal?

Hi Nasrin. If you have a copy of my ANTS book, I discuss it there, I think in a few different places (e.g., for ERPs and for spectral power).

In general, an empirical SNR is computed as a mean divided by a standard deviation. So you can pick a time, frequency, or time-frequency window, segment the data or use trials, and then compute that ratio.
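To make that concrete, here is a minimal sketch in Python/NumPy (the simulated data, window boundaries, and variable names are all made up, just to illustrate the mean-over-SD idea):

```python
import numpy as np

rng = np.random.default_rng(0)

# simulated data: 100 trials x 1000 time points (hypothetical numbers),
# with a common sinusoidal "signal" added to every trial
data = rng.normal(size=(100, 1000)) + np.sin(np.linspace(0, 2*np.pi, 1000))

# pick a time window (here, samples 200-400) and average within it per trial
window = data[:, 200:400].mean(axis=1)   # one value per trial

# empirical SNR: mean over trials divided by SD over trials
snr = window.mean() / window.std()
print(snr)
```

The same recipe applies to a frequency or time-frequency window; only the axis you average over changes.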

Hi Mike,
Thank you very much for your reply. I reviewed your book for SNR. I want to clean my experimental data, but I don’t have any experience in EEG signal preprocessing. Just to make sure: are these steps OK with you? Thank you so much!
I have epoched data and 8 channels. I work from the command line, not the GUI.

  1. I removed flatline channels and interpolated bad channels.
  2. I re-referenced the data to average.
  3. I ran the first ICA.
  4. I rejected trials using the functions pop_eegthresh, pop_rejtrend, pop_rejkurt, pop_rejspec…
  5. I ran a second ICA and then, using ICLabel, kept components with a “Brain” probability above 0.70.
  6. For the frequency analysis, I computed a time-frequency decomposition for all electrodes using the FFT and then Morlet wavelet convolution (the same as in your lectures).
  7. Now I want to calculate SNR.

Are these steps OK? And at which step should I perform subtractive baseline correction?

Thank you for your help!

Data cleaning is often very idiosyncratic to the type of data, purpose of the research, and preferences of the researcher. So here are my thoughts, but take them as they are – my thoughts – not factual instructions for what you need to do.

In general, there is a fine balance with cleaning data – every time you remove “noise” you’re also removing signal. The more “noise” you try to remove, the more signal you’ll also lose. My philosophy is to remove as little information as possible but as much as necessary. To my taste, you’re over-cleaning the data, which risks removing too much signal. As I wrote above, different people clean data in different ways.

  • I’m usually reluctant to do much algorithmic data rejection (I prefer visual-based rejection), but that’s a matter of personal preference.
  • With only 8 channels, I wonder whether channel interpolation is useful. Also, it’s usually recommended to run ICA on data before interpolation (but without those channels in the ICA). Interpolating channels leads to reduced-rank matrices, which isn’t such a big deal for larger matrices, but going from rank of 8 to 7 is a relatively big hit.
  • I do trial rejection first and then ICA. Why expose the ICA to noisy data? It won’t improve the quality of the decomposition.
  • I do very occasionally run ICA twice, but only when necessary.

The rest of your pipeline sounds fine. Subtractive baseline correction is done when segmenting the data into trials, and is only used when you have a trial-based design.
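To make the subtractive baseline step concrete, here is a minimal Python/NumPy sketch (the sampling rate, epoch window, and trial count are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 250                                   # sampling rate in Hz (hypothetical)
times = np.arange(-0.2, 1.0, 1/fs)         # epoch from -200 ms to ~1000 ms
epochs = rng.normal(size=(40, times.size)) + 2.0  # 40 trials with a DC offset

# subtractive baseline: mean over the prestimulus window, computed per trial
base = epochs[:, times < 0].mean(axis=1, keepdims=True)
epochs_bc = epochs - base

# after correction, the prestimulus mean is ~0 in every trial
print(np.abs(epochs_bc[:, times < 0].mean(axis=1)).max())
```

The key design choice is that the baseline mean is computed and subtracted per trial, which removes trial-specific offsets and slow drifts before averaging.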

Thank you so much Mike. So for a subject like the one I attached here, which has two or three flat channels, what would you recommend? Just removing the flat ones, then trial rejection, and finally ICA and ICLabel (without interpolating)? Am I right?

Yes, that’s exactly what I’d do. Channel interpolation is only necessary if you want to average topographical maps across different subjects, and you can do that after the data cleaning (or after the analyses).

Can we also calculate SNR as mean over SD for raw EEG data, in the time domain only?

Yes, you can. SNR is a generic formula, and is widely applicable.


Hi Mike,

May I kindly ask you to share the code for “the peak of a trial-averaged component (e.g., the P3) compared to the temporal variance during the baseline period (e.g., – 200 ms to trial onset)”. Mine doesn’t work. I have 480 trials/epochs and baseline [-150 0].

Hi Nasrin. I’m always happy to share code, but you’ll need to give me a bit more information about this. Are you quoting from my book? Which figure or chapter is this, and is it not code already shared on my website?

Also, what do you mean that your code doesn’t work?

Hi Mike,

Thank you very much for your response. I’m new to EEG signal preprocessing. This is from page 235, chapter 18; I haven’t found this code anywhere. I have 9 subjects and 8 channels with 480 trials/epochs (x.min 0.15 x.max 1.298). After removing bad channels, rejecting artifacts, performing ICA and ICLabel, and removing components with less than 70 percent brain elements, I now want to calculate the signal-to-noise ratio. I don’t know which formula I should use or what the code would be. I would greatly appreciate your help. Thank you in advance.

I see. I don’t think I have ready-made code for that, or maybe I did when I wrote the book.

I’m not sure what you mean by removing ICs with “less than 70% brain elements” but keep in mind that most ICs mix signal and noise, and so you generally want to remove as few components as possible – ideally just 1 corresponding to the blinks. Anatomical projections are also quite imperfect and depend on a number of parameters that are difficult to specify, and on assumptions that are questionable at best. Therefore, I would not use a dipole projection as a criterion to remove ICs.

About the SNR computation: You can break this down into a few steps:

  1. Identify the time windows to use for the ERP (e.g., 300-400 ms, or something reasonable based on your data and/or other similar studies on the P3) and for the baseline time window (e.g., -200 to 0 ms). The code would look something like
    basetimeidx = dsearchn(EEG.times',[-200 0]');
    ERPtimeidx = dsearchn(EEG.times',[300 400]');

  2. Compute the average of the ERP within your time window. The code would be something like
    erpmean = mean(ERP(ERPtimeidx(1):ERPtimeidx(2)));

  3. Compute the standard deviation of the pre-stim period:
    basestd = std(ERP(basetimeidx(1):basetimeidx(2)));

  4. Take their ratio:
    SNR = erpmean / basestd;

Keep in mind that this is not a black-box solution; it’s code to help you get started. You’ll need to modify it as appropriate for your data, variable names, etc.
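If it helps, the same four steps can be sketched in Python/NumPy on a simulated trial-averaged ERP (the sampling rate, window boundaries, and ERP shape are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 500
times = np.arange(-200, 1000, 1000/fs)   # time in ms; 2 ms per sample

# simulated trial-averaged ERP: a P3-like Gaussian bump plus noise
erp = 5*np.exp(-((times - 350)/50)**2) + rng.normal(scale=0.5, size=times.size)

# step 1: find the indices of the baseline and ERP windows
baseidx = np.searchsorted(times, [-200, 0])
erpidx  = np.searchsorted(times, [300, 400])

# step 2: mean of the ERP within its window
erpmean = erp[erpidx[0]:erpidx[1]].mean()

# step 3: SD of the baseline period
basestd = erp[baseidx[0]:baseidx[1]].std()

# step 4: their ratio
snr = erpmean / basestd
print(snr)
```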

Hope that helps!

Hi Mike,

In one of my experiments I have 80 trials of an event. I am calculating the mean over the first 10 trials, 20 trials, and so on up to 80 trials (the maximum number of trials available). I am calculating the SNR as mean/SD for each case.
At the same time, I am applying a transformation to the data. I don’t want to go into details here because it is irrelevant to this topic, but let’s say I am performing the same modification on all 80 trials. Next I compute the mean for 10 trials, 20 trials, up to 80 trials, as I did for the original data.
When I calculate the SNR for each case, the SNR values are larger than the respective values obtained by averaging only the original data, as in the first step. The general shape of the ERP is preserved, but in the second case the graph is smoother.
My question is: how do we interpret these results? What does this mean for the method I used to transform the data (the fact that the SNR increases while the shape is preserved)?

That sounds plausible to me. SNR is a nonlinear transform, so the average of SNRs is not the same thing as the SNR of an average. Plus, any additional transformations you applied may have boosted signal and/or suppressed noise.
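A quick simulation (with made-up data) shows one side of this: the SNR of an N-trial average grows roughly with the square root of N, so SNRs computed on averages are not simply averages of single-trial SNRs:

```python
import numpy as np

rng = np.random.default_rng(3)
signal = 1.0                                    # constant "ERP" amplitude
trials = signal + rng.normal(size=(80, 500))    # 80 trials x 500 samples

def snr(x):
    # empirical SNR of one time series: mean over SD across samples
    return x.mean() / x.std()

# SNR of the trial average for increasing numbers of trials
snrs = {n: snr(trials[:n].mean(axis=0)) for n in (10, 20, 40, 80)}
for n, s in snrs.items():
    print(n, s)
```

Averaging more trials shrinks the noise SD by about 1/sqrt(N) while leaving the signal mean intact, which is why the 80-trial SNR comes out far larger than the 10-trial SNR.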


Thanks for the reply. One more question: what about the cases when the mean is negative? This will force the SNR to be negative as well. Should we take its absolute value, or should we keep the negative value?


I discuss this in more detail in my book, but SNR is a tricky computation when you have signed data values, because the signal quality can be very high but SNR=0 if the data are mean-centered.


My data are not mean-centered, but in some cases, as mentioned, the mean has a negative sign, which leads to a negative SNR. My question is: when we are comparing two negative SNR values, which one is better than the other? Do we have to compare their absolute values, or should we keep the negative sign? I see that the absolute value is used in other fields, but I am not sure whether this is the case in neuroscience. Thanks!

The difficulty of using signed values for SNR is that a dataset of, e.g., [-10 +10] might be a very strong signal, but its average – and therefore also the numerator of the SNR ratio – is zero.

On the one hand, you can simply take the absolute value to sidestep the sign issue. But the absolute value will also change the variance, and possibly the interpretation of the mean, so the result might not be easy to interpret.
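A tiny numeric illustration of both points (the values are made up):

```python
import numpy as np

# a strong but sign-alternating "signal" (hypothetical values)
x = np.array([-10.0, 10.0, -8.0, 12.0])

# signed SNR: the mean nearly cancels, so SNR is close to zero
snr_signed = x.mean() / x.std()

# rectified SNR: the sign issue is gone, but the SD shrinks drastically,
# so the two numbers are not measuring the same thing
snr_abs = np.abs(x).mean() / np.abs(x).std()

print(snr_signed, snr_abs)
```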
