Recursive Random Forest Forecast: Lag Columns not updating correctly

Hi everyone,

I’m working on a recursive time series forecasting workflow in KNIME using a Tree Ensemble Learner and Predictor (Random Forest model). The goal is to predict weekly sales per installer (Installer_ID) based on lagged values of previous sales (Sales(-1), (-2), (-3)), along with some external features (e.g., X_CSI, X_ENERGY_PRICE_INDEX).

My workflow includes:

  • Group Loop to process each Installer_ID separately
  • Recursive Loop to generate a multi-step forecast per installer
  • Tree Ensemble Predictor that is connected via model port (right input)
  • Lag Column nodes to shift the previous predictions into new lag positions (Prediction → Sales(-1) → Sales(-2) etc.)

Here’s the issue:
While the lag shifting using Lag Column works in the first iteration, from iteration 2 onwards, the workflow keeps repeating the same lag values and produces the same prediction repeatedly.
It looks like the most recent prediction is not actually inserted as a new Sales(-1) value, or it’s not carried forward correctly within the recursive loop.

I’ve tried renaming the columns, removing missing values, and filtering only the last row – but it doesn’t seem to dynamically update the lag structure as expected.

My goal is to update the lag values like this:

  • Prediction from current iteration → becomes Sales(-1)
  • Previous Sales(-1) → becomes Sales(-2)
  • Previous Sales(-2) → becomes Sales(-3)
    … and use these as input for the next iteration.
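To make the intended update concrete, here is a minimal sketch in plain Python (not KNIME nodes). The function name `shift_lags` and the sample numbers are made up for illustration only; the column names are the ones from my workflow.

```python
def shift_lags(lags, prediction):
    """Return the lag values for the next iteration.

    The new prediction becomes Sales(-1), and every existing lag
    moves one position further back. The oldest lag is dropped.
    """
    return {
        "Sales(-1)": prediction,
        "Sales(-2)": lags["Sales(-1)"],
        "Sales(-3)": lags["Sales(-2)"],
    }


# Illustrative values only, not taken from the real data set
lags = {"Sales(-1)": 120.0, "Sales(-2)": 110.0, "Sales(-3)": 100.0}
new_lags = shift_lags(lags, prediction=130.0)
print(new_lags)
```

The point is that the shifted values are built from the lags of the current iteration plus the new prediction, and this updated row is what the next iteration should receive.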

Does anyone know how I can ensure that the new prediction is correctly used as input in the next iteration, when using only Lag Column nodes?

I’m happy to share the full workflow if helpful.
Random_Forest_Recursive_Loop.knwf (926.8 KB)

Thanks a lot in advance!
CSI_Energy.xlsx (24.2 KB)

Hi KNIME Team,

Can you help me with the above-mentioned issue with the recursive loop in the Random Forest workflow?

Hi,

the initial post is 2 years old. Did you have a chance to work on it? Tbh, I don't have a clue what you are trying to do.
But in general, if you are using a recursive loop, you have to take care which data is sent back to the beginning of the loop for the next iteration. That's why the Recursive Loop End has two inputs:
first → result of the current iteration, which is collected
second → table/data which is sent back to the beginning of the loop
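
As a rough illustration of the two inputs, here is a small Python simulation. It is only a sketch under assumptions: `toy_model` is a made-up stand-in for the Tree Ensemble Predictor, and the flag `feed_back_updated` is hypothetical. It shows that if the table sent back through the second input does not contain the new prediction, every iteration sees the same lags and returns the same value.

```python
def toy_model(lags):
    # Hypothetical stand-in for the trained model: mean of the lags plus 1
    return sum(lags) / len(lags) + 1.0


def recursive_forecast(initial_lags, steps, feed_back_updated=True):
    lags = list(initial_lags)  # order: [Sales(-1), Sales(-2), Sales(-3)]
    collected = []             # corresponds to the first input (collected results)
    for _ in range(steps):
        pred = toy_model(lags)
        collected.append(pred)
        if feed_back_updated:
            # Corresponds to the second input: prediction becomes Sales(-1),
            # older lags shift back, the oldest one is dropped
            lags = [pred] + lags[:-1]
        # Otherwise the unchanged table goes back to the loop start
    return collected


good = recursive_forecast([3.0, 2.0, 1.0], steps=3)
stuck = recursive_forecast([3.0, 2.0, 1.0], steps=3, feed_back_updated=False)
print(good)
print(stuck)
```

In the `stuck` case the forecast repeats on every step, which matches the symptom described in the first post, so the table connected to the second input is the first thing I would check.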