# Automate the lag node to find the optimal lag value through correlation value

Hi,
Is there a way to find the best lag between two correlated time series variables without using a loop?
Or has someone already created a node for this?

Thanks a lot,

Hello @boggy62

Welcome to the KNIME community!

Could you upload a small workflow with an example of your time series data and, even better, what you have tried so far with loops? This would help other KNIMErs work out a loop-free solution for you, if one is possible. I will try to help from there too.

Otherwise, you could have a look at how the following component is implemented; it computes the FFT using a Python node internally:

With a bit of work from there, you could estimate the time delay between your two time series using a classic FFT-based algorithm.
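As a rough illustration of the FFT-based idea, here is a minimal Python sketch using only NumPy. The function name `fft_lag` and the synthetic data are assumptions made for this example and are not taken from the component mentioned above. Note that it maximises the raw cross-covariance over all possible shifts, which is close to, but not identical to, picking the lag with the highest Pearson correlation.

```python
import numpy as np


def fft_lag(x, y):
    """Return the shift k (in samples) that maximises the cross-covariance,
    i.e. the k for which y[t] best matches x[t - k]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Remove the means so a constant offset does not dominate the result.
    x = x - x.mean()
    y = y - y.mean()
    # Zero-pad to at least len(x) + len(y) - 1 so the circular correlation
    # computed by the FFT equals the linear one.
    n = len(x) + len(y) - 1
    nfft = 1 << (n - 1).bit_length()
    cc = np.fft.irfft(np.fft.rfft(y, nfft) * np.conj(np.fft.rfft(x, nfft)), nfft)
    k = int(np.argmax(cc))
    # Indices past len(y) - 1 hold the negative lags (wrapped around).
    return k - nfft if k >= len(y) else k


# Hypothetical test data: y is x delayed by 5 samples.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.concatenate([np.zeros(5), x[:-5]])
print(fft_lag(x, y))
```

The sign convention matches pandas: a positive result `k` means that `x.shift(k)` lines up with `y`.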

There may already be a node or a workflow in KNIME that directly computes the time lag between series, but I’m not aware of one. People like Daniele Tonini, Maarit Widmann and Corey Weisinger, who wrote the following blog, may have a better answer to your question:

Time Series Analysis with Components

Time Series Analysis Workshop

I hope these hints are of help to you.

All the best,

Ael


Hi,
This is the kind of loop I am using to find the optimal correlation value:

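    # data is a pandas DataFrame with the columns "X" and "Y"
    for i in range(-10, 10):
        # shift X by i rows, then read off its correlation with Y
        data["X_shift"] = data["X"].shift(i)
        mat_corr = data.corr()
        print(i, "->", mat_corr.loc["X_shift", "Y"])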

Then I can visualise the correlation for each lag and try to find a regular pattern.

Hi @boggy62,

I think this can be done using the Parameter Optimization Loop Start and Parameter Optimization Loop End nodes. The parameter to optimize is the ‘lag’, which ranges from -10 to 10. Use this variable to control the Lag Column node, and use the correlation score as the objective function value in the loop end.
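For readers who think in code, the logic of that optimization loop can be sketched in Python roughly as follows. This is only an illustration of a brute-force search over the lag, not what the KNIME nodes do internally; the helper `best_lag`, the column names and the synthetic data are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def best_lag(data, x_col="X", y_col="Y", lags=range(-10, 11)):
    """Try every lag and return (lag with the highest correlation, all scores)."""
    scores = pd.Series(
        # Series.corr ignores the rows where shifting introduced missing values.
        {lag: data[x_col].shift(lag).corr(data[y_col]) for lag in lags},
        name="correlation",
    )
    scores.index.name = "lag"
    return int(scores.idxmax()), scores


# Hypothetical test data: Y is X delayed by 5 rows.
rng = np.random.default_rng(0)
data = pd.DataFrame({"X": rng.standard_normal(300)})
data["Y"] = data["X"].shift(5)

lag, scores = best_lag(data)
print(lag)
print(scores)
```

The correlation plays the role of the objective function value and the lag is the parameter being searched. If the two series may be negatively related, maximising `scores.abs()` instead would be one option.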

Cheers,
Simon


Hi,
I tried to build the workflow, but I am not able to select the correlation value from the Correlation node in the Parameter Optimization Loop End node. What did I do wrong? I have attached the Excel test file (built with an optimal lag value of 5) and the workflow.

Correlation Optimization.knwf (19.0 KB) test correlation 5 decalage.xlsx (42.0 KB)

Hi @boggy62,

You need to add a Table Row to Variable node so that the correlation value is available as a flow variable. You also need to set the column selection in the Correlation node as follows:

This ensures that only Data1 and the lagged column are ever included.

Here is the adapted workflow: Correlation Optimization_corrected.knwf (22.1 KB)

Hope this helps,
Simon


Hi SimonS,
Yes, it does help, as I was stuck. Thanks a lot.
