Time Series Cross Validation

I am building a model on data that is ordered by date. Cross validation with time series data is tricky because you don’t want to introduce leakage by predicting past events from future ones. I want to use cross validation, but I don’t think the X-Partitioner can handle time series cross validation. I want to do something like this, but I’m not sure what the most efficient way would be in KNIME:
[Image: fXZ6k]

Let me know if you have any ideas on how to do this. I am open to other time-series-friendly methods too. I would like to avoid creating a kludgy manual looping workflow if possible, but will do so if necessary.

Hey @bluenote, you’re right that there isn’t a prebuilt loop for cross validation with time series data. You can rig up a Counting Loop to do this pretty easily, though. Here’s an example where I combine the Counting Loop with a Math Formula node to control the Partitioning node, so that each iteration splits the data the way your graphic shows.
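For anyone who wants to check the arithmetic outside KNIME, here is a rough Python sketch of the split calculation. It assumes the graphic shows the usual expanding-window scheme (training rows always come before test rows, and the training window grows each iteration), and the function name and parameters are made up for illustration. The value of `train_end` is roughly what the Math Formula node would need to hand to the Partitioning node when it takes rows from the top without shuffling.

```python
def expanding_window_splits(n_rows, n_folds):
    """Yield (train_end, test_end) row positions for each fold.

    Rows [0, train_end) are used for training and rows
    [train_end, test_end) for testing, so the test block always
    comes after the training block in time.
    """
    if n_folds < 1 or n_rows < n_folds + 1:
        raise ValueError("need at least n_folds + 1 rows")
    block = n_rows // (n_folds + 1)  # size of each test block (assumed equal)
    for i in range(1, n_folds + 1):  # i plays the role of the loop counter
        train_end = block * i
        # The last fold absorbs any leftover rows.
        test_end = n_rows if i == n_folds else block * (i + 1)
        yield train_end, test_end


splits = list(expanding_window_splits(n_rows=100, n_folds=4))
for train_end, test_end in splits:
    print(f"train rows 0..{train_end - 1}, test rows {train_end}..{test_end - 1}")
```

The data has to be sorted by date before the loop for this to be leakage-free, since the split relies purely on row position.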

The X-Aggregator can then simply be replaced by a Numeric Scorer and a Loop End. I also added a Table Transposer to clean up the output shape, so that I get a row per partition instead of a column.
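As a loose Python analogy for the scoring part of the loop, the sketch below scores each fold and collects one row per partition, which is the shape the Table Transposer produces. The "model" is just the training mean as a placeholder, mean absolute error stands in for whichever statistic you read from the scorer, and all names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def score_folds(values, n_folds):
    """Return one result row per fold for an expanding-window split."""
    block = len(values) // (n_folds + 1)
    if n_folds < 1 or block == 0:
        raise ValueError("not enough rows for the requested number of folds")
    rows = []
    for i in range(1, n_folds + 1):
        train = values[: block * i]                 # everything before the test block
        test = values[block * i : block * (i + 1)]  # the next block in time
        prediction = sum(train) / len(train)        # placeholder model: training mean
        mae = mean_absolute_error(test, [prediction] * len(test))
        rows.append({"fold": i, "train_rows": len(train),
                     "test_rows": len(test), "mae": mae})
    return rows


series = [float(v) for v in range(1, 13)]  # toy data, already in date order
results = score_folds(series, n_folds=3)
for row in results:
    print(row)
```

Swap the placeholder prediction for your real learner and predictor; the per-fold rows can then be averaged or inspected individually.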

Depending on your use case, you’ll probably need to change the body of the loop to get your scoring set up, but the bulk of the workflow should apply to any cross validation problem. Hope this helps!
