I may have stumbled on the same bug Siemen and Dominic brought up during the STARBUCKS meetings and decided to post it here.
The bug shows up when the following two setups are used together in the prewhitening function iterative_prewhitening:
- prewhiten frequencies in an order that follows the SNR, i.e., parameter
prewhiteningorder_snr=True.
- stop the prewhitening when the SNR drops below a certain value, i.e., use of the function
stopcrit_scargle_snr.
Example
The best way to visualize this issue is to print the SNR value at each step of the prewhitening by inserting the line of code
print(f'SNR = {(ampls/noises)[np.argmax(ampls/noises)]:.3f}') ### NEW LINE ###
below line 276 in find_frequency (which is called by iterative_prewhitening). The change should look like
noises_old = np.copy(noises)
else:
noises = np.interp(freqs,freqs_old,noises_old)
frequency = freqs[np.argmax(ampls/noises)] ### LINE 276 ###
print(f'SNR = {(ampls/noises)[np.argmax(ampls/noises)]:.3f}') ### NEW LINE ###
else:
frequency = freqs[np.argmax(ampls)]
if full_output and counter==0:
freqs_,ampls_ = freqs,ampls
Now, load a light curve of choice or use the one attached here:
import pandas as pd
lc = pd.read_csv('lc_tic374944608.csv')
time = lc.time
signal = lc.flux
Finally, run the code below:
import ivs.timeseries.freqanalyse as fa
SNR_limit = 4.0
SNR_window = 1.0
StopCrit = (fa.stopcrit_scargle_snr, SNR_limit, SNR_window)
params = fa.iterative_prewhitening(time, signal, prewhiteningorder_snr=True, stopcrit=StopCrit, maxiter=200)
This will print the SNR of the frequencies at every stage of the prewhitening (including the process of refining the frequency grid). The console output will look similar to
SNR = 22.626
SNR = 22.656
SNR = 22.660
SNR = 22.660
SNR = 13.446
SNR = 13.474
SNR = 13.474
SNR = 13.474
SNR = 14.645
SNR = 14.660
SNR = 14.661
SNR = 14.661
SNR = 13.013
SNR = 13.013
...
SNR = 4.046
SNR = 4.046
SNR = 3.892
SNR = 3.892
SNR = 3.893
...
SNR = 3.369
SNR = 3.370
SNR = 3.370
SNR = 3.370
SNR = 3.380
SNR = 3.380
SNR = 3.380
SNR = 3.380
SNR = 3.352
SNR = 3.366
SNR = 3.366
This shows:
- the SNR can go below the aimed SNR=4 and
- the SNR has no relation with the value stored in
params['stopcrit']
>>> params['stopcrit']
array([ 13.40474348, 13.44678306, 14.64631962, 12.61257799,
12.64298309, 13.81737578, 11.71527848, 11.73468068,
12.85877578, 14.37025633, 13.36801271, 10.43517399,
10.4358055 , 10.55250507, 10.57568668, 10.5923864 ,
8.45050698, 9.04302943, 8.0771531 , 8.12639612,
8.18179814, 8.28015034, 8.78057684, 7.34186741,
7.33615494, 7.31018856, 7.32047348, 7.32133006,
7.33448515, 7.77632555, 7.61053638, 7.59566627,
7.6029366 , 6.68654245, 6.68784436, 6.67135312,
6.68573519, 6.67526519, 6.7685099 , 6.9593664 ,
5.7687152 , 5.76961418, 5.77253784, 5.76632453,
5.76715021, 5.77237361, 5.77487442, 5.78587947,
5.79101015, 5.80199297, 5.80517593, 5.73848364,
5.94638784, 5.22299725, 5.242362 , 5.24254542,
5.23705819, 5.23723004, 5.24082174, 5.23803633,
5.24944535, 5.26240406, 5.2590147 , 5.2572157 ,
4.67118545, 4.67098396, 4.68356173, 4.67893452,
4.50099421, 4.5006101 , 4.28881645, 4.42815387,
4.43569177, 4.43484652, 4.02039813, 4.02140911,
4.02093853, 4.02246461, 4.01459534, 4.01289055,
4.00897009, 4.01626374, 4.01684946, 4.02571564,
4.02576566, 4.03315131, 4.03151657, 4.03357364,
4.01409774, 4.02742649, 4.03630451, 4.0399484 ,
4.03936324, 4.04654839, 4.0483281 , 4.04889985,
4.05170285, 4.03186587, 4.034091 , 4.02836794,
4.02060923, 4.02102296, 4.01439998, 4.03126013,
4.02632311, 4.02690732, 4.03358535, 4.03667749,
4.0314516 , 4.03188031, 4.0314285 , 4.0306712 ,
4.03018952, 4.0351546 , 4.04085326, 4.03599139,
4.03327408, 4.03159533, 4.03190789, 4.03191836,
4.02406664, 4.0326358 , 4.12642581, 3.77142806])
>>> params['freq'].size
124
Explanation
The function stopcrit_scargle_snr below is meant to return True or False depending on whether the stopping criterion is met or not. As the function's name suggests, the halt happens when the SNR value drops below crit_value and the function returns the tuple(True,value).
def stopcrit_scargle_snr(times,signal,modelfunc,allparams,pergram,crit_value,width=6.):
"""
Stop criterium based on signal-to-noise ratio.
"""
width = width/2.
argmax = np.argmax(pergram[1]) #1
ampls = pergram[1]
start = max(0,pergram[0][argmax]-width)
stop = min(pergram[0][argmax]+width,pergram[0][-1])
if start==0:
stop += width-pergram[0][argmax]
if stop==pergram[0][-1]:
start = pergram[0][-1]-pergram[0][argmax]+width
ampls = ampls[(start<=pergram[0]) & (pergram[0]<=stop)]
value = pergram[1][argmax]/ampls.mean()
return value<crit_value,value
The problem is, at each iteration, the frequency ν returned by iterative_prewhitening(prewhiteningorder_snr=True) is such that maximizes the SNR in the periodogram, while the frequency ν' used by stopcrit_scargle_snr to compute the SNR is such that maximizes the amplitude in the periodogram (see #1 above).
This implies that ν is not always equal to ν' and, therefore, the SNR in the output does not necessarily correspond to the associated frequency in the same output. For instance, the frequencies stored in params['freq'] in the example above do not necessarily relate to the SNR stored in params['stopcrit'].
Solution
We have to modify the function stopcrit_scargle_snr so that it calculates the SNR of the frequency selected by the prewhitening process and not always default to the frequency with the highest periodogram's amplitude.
Below is an example of such new code:
def NEW_stopcrit_scargle_snr(times,signal,modelfunc,allparams,pergram,crit_value,width=6.):
"""
Stop criterium based on signal-to-noise ratio.
"""
width = width/2.
freqs = pergram[0] # frequency grid
ampls = pergram[1] # periodogram's amplitudes
freq = allparams['freq'][-1] # Last frequency in the iterative prewhitening
ind = np.abs(freqs-freq).argmin() # Get grid's indice closest to the frequency
# Clip the frequency grid
start = max(0,freqs[ind]-width)
stop = min(freqs[ind]+width,freqs[-1])
if start==0:
stop += width-freqs[ind]
if stop==freqs[-1]:
start = freqs[-1]-freqs[ind]+width
clipped_ampls = ampls[(start<=freqs) & (freqs<=stop)]
value = ampls[ind]/clipped_ampls.mean()
return value<crit_value,value
Testing the solution
We will just repeat the code in section Example using NEW_stopcrit_scargle_snr instead of stopcrit_scargle_snr.
import ivs.timeseries.freqanalyse as fa
SNR_limit = 4.0
SNR_window = 1.0
NEW_StopCrit = (NEW_stopcrit_scargle_snr, SNR_limit, SNR_window)
params = fa.iterative_prewhitening(time, signal, prewhiteningorder_snr=True, stopcrit=NEW_StopCrit, maxiter=200)
the console output will look similar to
SNR = 22.626
SNR = 22.656
SNR = 22.660
SNR = 22.660
SNR = 13.446
SNR = 13.474
SNR = 13.474
SNR = 13.474
SNR = 14.645
SNR = 14.660
SNR = 14.661
SNR = 14.661
SNR = 13.013
SNR = 13.013
...
SNR = 4.046
SNR = 4.046
SNR = 3.892
SNR = 3.892
SNR = 3.893
This shows:
- the SNR does not go below the aimed SNR=4 but halts after crossing that threshold and
- the SNR are the same values stored in
params['stopcrit']
>>> params['stopcrit']
array([ 22.62772635, 13.44678306, 14.64631962, 13.0099008 ,
12.64298309, 13.81737578, 11.95378747, 11.73468068,
12.85877578, 14.37025633, 13.36801271, 11.49104876,
11.34068947, 10.75950652, 10.59325395, 10.5923864 ,
9.68459626, 9.04302943, 8.81164341, 9.03136256,
8.81812226, 9.14385704, 8.78057684, 8.55852621,
8.38600352, 8.1462341 , 8.13629635, 7.9063102 ,
7.82175271, 7.77632555, 7.68812351, 7.96437803,
7.6029366 , 7.4474554 , 7.0331079 , 6.83559477,
6.77586296, 6.67526519, 6.7685099 , 6.9593664 ,
6.67973248, 6.4832323 , 6.14747587, 6.39048214,
6.20770705, 6.01226832, 5.84869113, 5.82001952,
5.83489705, 6.07535406, 5.80517593, 5.73848364,
5.94638784, 5.66314269, 5.5250279 , 5.51133535,
5.41815014, 5.2881503 , 5.14268841, 4.8694191 ,
4.85398674, 4.8750638 , 4.78881205, 5.2572157 ,
4.76446398, 4.68501405, 4.73088918, 4.67893452,
4.63948586, 4.5006101 , 4.48703331, 4.48560261,
4.46868011, 4.43484652, 4.23184198, 4.19014372,
4.14076642, 4.08372717, 4.0494421 , 4.08648646,
4.14701003, 4.15841565, 4.00241717, 4.30104471,
4.01527914, 4.02158804, 4.03000781, 3.89200165])
>>> params['freq'].size
88
Note as well that this time the prewhitening returns 88 frequencies instead of 124.
Final note <--------------------------
Even if the parameter prewhiteningorder_snr=True is used, the extracted frequencies will not necessarily come sorted by SNR, as it can be seen in the output for params['stopcrit'] above. I think this happens because the noise level is calculated with the residuals after each iteration, meaning that subsequent frequencies may have their SNR increased because the windows function of the previous frequency has been removed.
Useful links
- Source code
iterative_prewhitening
- Source code
stopcrit_scargle_snr
lc_tic374944608.zip
I may have stumbled on the same bug Siemen and Dominic brought up during the STARBUCKS meetings and decided to post it here.
The bug shows up when the following two setups are used together in the prewhitening function
iterative_prewhitening:prewhiteningorder_snr=True.stopcrit_scargle_snr.Example
The best way to visualize this issue is to print the SNR value at each step of the prewhitening by inserting the line of code
below line 276 in
find_frequency(which is called byiterative_prewhitening). The change should look likeNow, load a light curve of choice or use the one attached here:
Finally, run the code below:
This will print the SNR of the frequencies at every stage of the prewhitening (including the process of refining the frequency grid). The console output will look similar to
This shows:
params['stopcrit']Explanation
The function
stopcrit_scargle_snrbelow is meant to returnTrueorFalsedepending on whether the stopping criterion is met or not. As the function's name suggests, the halt happens when the SNRvaluedrops belowcrit_valueand the function returns the tuple(True,value).The problem is, at each iteration, the frequency ν returned by
iterative_prewhitening(prewhiteningorder_snr=True)is such that maximizes the SNR in the periodogram, while the frequency ν' used bystopcrit_scargle_snrto compute the SNR is such that maximizes the amplitude in the periodogram (see#1above).This implies that ν is not always equal to ν' and, therefore, the SNR in the output does not necessarily correspond to the associated frequency in the same output. For instance, the frequencies stored in
params['freq']in the example above do not necessarily relate to the SNR stored inparams['stopcrit'].Solution
We have to modify the function
stopcrit_scargle_snrso that it calculates the SNR of the frequency selected by the prewhitening process and not always default to the frequency with the highest periodogram's amplitude.Below is an example of such new code:
Testing the solution
We will just repeat the code in section Example using
NEW_stopcrit_scargle_snrinstead ofstopcrit_scargle_snr.the console output will look similar to
This shows:
params['stopcrit']Note as well that this time the prewhitening returns 88 frequencies instead of 124.
Final note <--------------------------
Even if the parameter
prewhiteningorder_snr=Trueis used, the extracted frequencies will not necessarily come sorted by SNR, as it can be seen in the output forparams['stopcrit']above. I think this happens because the noise level is calculated with the residuals after each iteration, meaning that subsequent frequencies may have their SNR increased because the windows function of the previous frequency has been removed.Useful links
iterative_prewhiteningstopcrit_scargle_snrlc_tic374944608.zip