Keras LSTM time-series regression prediction, how to use different datasets in the network


I am using a Keras LSTM to predict future target values (a regression problem, not classification). My current dataset has 270 rows, starting at t_0 and finishing at t_269; each row contains the current target value (the value I want to predict) and six other features measured at that time.

My goal is to predict how the target value is going to evolve over the next time step.

First of all, I normalized the data using the Normalizer node (min-max normalization between 0 and 1).
I created lags for the 7 columns (the target and the 6 other features), 14 lags each with a lag interval of 1. I then used the Column Aggregator node to create a list containing the 98 values (14 lags × 7 features), partitioned it into training and test sets, and fed it to my Keras network.
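In Python/numpy terms, this preprocessing amounts to roughly the following sketch (the random array just stands in for my normalized data; column 0 is assumed to be the target):

```python
import numpy as np

# Toy stand-in for the normalized dataset: 270 rows, 7 columns
# (column 0 = target, columns 1-6 = the other features).
data = np.random.rand(270, 7)

LAGS = 14  # number of lagged time steps per sample

# Each sample stacks the 14 previous rows; the label is the
# target value of the row right after that window.
X = np.stack([data[i:i + LAGS] for i in range(len(data) - LAGS)])
y = data[LAGS:, 0]

print(X.shape)  # (256, 14, 7)
print(y.shape)  # (256,)

# Flattening each (14, 7) window gives a 98-element vector, the
# analogue of the Column Aggregator list (the exact value order
# depends on how the columns are aggregated).
X_flat = X.reshape(len(X), -1)
print(X_flat.shape)  # (256, 98)
```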

So far my network is made from:

  • Keras Input Layer with input shape 1,98
  • Keras LSTM Layer, Hard Sigmoid Activation, 20 units
  • Keras Dense Layer, Linear activation function

Then for the Keras Network Learner I am using 50 epochs and a batch size of 32 with the Adam optimizer. I am not shuffling the data before each epoch because I would like the LSTM to find dependencies between the sequences.

I am still trying to tune the network, perhaps with different optimizers and activation functions, and considering different numbers of units for the LSTM layer.

Right now I am using only one of many available datasets, from the same experiment conducted in different locations. Basically, I have other datasets with 270 rows and 7 columns each (the target column and 6 features). What I would like to do is use the other datasets, let's say 5 more, to help train my network.

I still cannot figure out how to implement this, and how it would affect the input shape of the Keras Input Layer. Do I just append the datasets into one big dataset and work on that? But wouldn't this make the LSTM lose the sequential factor? Or is it enough to set the batch size of the Keras Network Learner to the number of rows provided by each dataset?

PS: Ultimately I will be forecasting the next 10 values for each dataset, because I know the next 10 values of each of the 6 features.

I hope I was clear enough explaining my problem.

Hi @adhamj90,

welcome to the KNIME forum :slight_smile:

Sounds like an interesting use case you are working on!

Do I understand your problem correctly: you want to predict the next value of the target column based on the last 14 values of the target column and the input features?

If yes, you might want to change the input shape to make use of the sequential factor. If I understand your use case correctly, the input shape should be "14, 7": 14 time steps, and for each time step a 7-dimensional input for your LSTM.
Does this make sense?
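To illustrate the difference between the two shapes (a numpy sketch; the numbers are arbitrary):

```python
import numpy as np

sample = np.arange(98.0)  # one training sample as a flat vector

# Shape "1, 98": the LSTM gets a single time step carrying all
# 98 values at once, so the recurrence never unrolls.
as_one_step = sample.reshape(1, 98)

# Shape "14, 7": the LSTM gets 14 time steps, each with the 7
# feature values of that point in time.
as_sequence = sample.reshape(14, 7)

print(as_one_step.shape)  # (1, 98)
print(as_sequence.shape)  # (14, 7)
print(as_sequence[0])     # the 7 values of the first time step
```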

Having this in mind, you can create training samples for each of your datasets and concatenate the different training samples.
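Concatenating the per-dataset samples is just a stack along the sample axis; a numpy sketch (the sample counts here are illustrative):

```python
import numpy as np

# Suppose each of 6 locations yields 256 windowed training
# samples of shape (14, 7).
datasets = [np.random.rand(256, 14, 7) for _ in range(6)]

# Samples are treated independently by the learner, so the
# datasets can simply be stacked along the sample axis.
X_train = np.concatenate(datasets, axis=0)
print(X_train.shape)  # (1536, 14, 7)
```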


PS: If you want to predict multiple time steps, you can use a recursive loop to use the predicted values to predict the next one.
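A sketch of that recursive idea in plain Python/numpy (`predict_next` is a hypothetical stand-in for the trained network; it assumes the next 10 feature values are known, as in your PS):

```python
import numpy as np

def predict_next(window):
    """Hypothetical stand-in for the trained model: maps a
    (14, 7) window to the next target value."""
    return window[:, 0].mean()

def forecast(window, future_features, steps=10):
    """Recursively predict `steps` target values, feeding each
    prediction back into the window. `future_features` holds
    the known (steps, 6) feature values of the coming steps."""
    window = window.copy()
    predictions = []
    for t in range(steps):
        y_hat = predict_next(window)
        predictions.append(y_hat)
        # Build the next row from the prediction plus the known
        # features, then slide the window forward by one step.
        next_row = np.concatenate([[y_hat], future_features[t]])
        window = np.vstack([window[1:], next_row])
    return predictions

preds = forecast(np.random.rand(14, 7), np.random.rand(10, 6))
print(len(preds))  # 10
```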


Hey Kathrin,
First of all, thanks for your reply and for the welcome :slight_smile:

Exactly! Each row contains 8 columns: a date, the target value, and the 6 other features, and each dataset has 270 rows.

Your answer tells me that I've understood the input shape incorrectly!
I found a couple of workflows using Keras LSTM layers, and the input shape in both was "1, x", where x is the number of elements in the list created using the Column Aggregator node (aggregating the lagged values and other features); that's why I used "1, 98" as the input shape.

Could you please be more specific about how to reshape the data to "14, 7"? How exactly should I use the Lag Column and Column Aggregator nodes?

Unfortunately, I’ve never worked on a multivariate time series use case. So I’m not 100% sure how the data has to be transformed in KNIME. I have an idea and will try it out tomorrow.

Do you mean by "14, 7" that I don't have to aggregate the columns, so I would have a table containing all 98 columns as input?
Is there a difference between doing this and creating a list of 98 elements with the shape "1, 98"?

If so, I still do not know exactly what to do with the other datasets. Do I simply append them to the input table of the learner? How can I tell the network that the current row is, for example, at the beginning of a sequence? Is that something I can achieve with the batch size option of the Keras Network Learner node?

Thank you so much for your time. I really appreciate it!

Thanks again! I’m looking forward to your reply :slightly_smiling_face: :slightly_smiling_face:

Hi @adhamj90,

recurrent neural networks and LSTMs are confusing in the beginning and I hope I can help you to understand things a bit better. I will try to answer your questions first. Afterwards I will let you know how you can structure your data to feed it into the network correctly.

Yes, there is a difference. If you feed a sequence of shape "1, 98" into an LSTM, you show all 98 input features to the LSTM layer at once and don't make use of the recurrence of LSTMs. If you feed in a sequence of shape "14, 7", then the first copy of the LSTM unit sees only the 7 input features of the first time step, the next copy sees the 7 input features of the next time step, and so on. Does this make sense?

The different rows are used independently of each other during training. This means that for each row the LSTM layer starts with newly initialised hidden states (therefore you can also shuffle your data to avoid overfitting). This is something that confused me in the beginning as well. So you can create the training sequences for each dataset and then concatenate them.

And now to the question: how can we feed data of shape "14, 7" into the Keras Network Learner node?
One way is to feed a vector of size 14 × 7 = 98 into the network, where the values have to appear in a certain order.
Let's make a smaller example. Say we have a multivariate time series with 3 features (x, y, and z) and we always use 4 time steps to make a prediction. In this case the input shape must be "4, 3", and the lagged values of one sample look like this:

x(t-4) y(t-4) z(t-4)
x(t-3) y(t-3) z(t-3)
x(t-2) y(t-2) z(t-2)
x(t-1) y(t-1) z(t-1)

In KNIME we therefore have to create a collection cell in which the values appear in a fixed order. The collection is filled into the declared shape row by row, so for the input shape "4, 3" (time steps first) it has to list all features of the oldest time step first, then all features of the next one:
x(t-4), y(t-4), z(t-4), x(t-3), y(t-3), z(t-3), x(t-2), y(t-2), z(t-2), x(t-1), y(t-1), z(t-1)

If you define the input shape "4, 3" in the Keras Input Layer node, KNIME Analytics Platform will automatically reshape the collection into the correct tensor.
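The key mechanic here is the row-major fill: the flat collection populates the declared shape row by row, so the value order and the shape have to agree. A numpy sketch with 12 numbered values:

```python
import numpy as np

# The 12 values of one collection cell, numbered in the order
# in which they appear in the list.
flat = np.arange(12)

# A row-major reshape fills each row completely before moving
# on, so the declared shape dictates which values end up in the
# same row (i.e. in the same LSTM time step).
print(flat.reshape(4, 3))
# [[ 0  1  2]
#  [ 3  4  5]
#  [ 6  7  8]
#  [ 9 10 11]]

print(flat.reshape(3, 4))
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
```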



Hi @adhamj90,

thank you for sharing your interesting use case!

It motivated me to build a small example workflow for multivariate time series analysis using the London bike sharing dataset from Kaggle.

I hope this helps you

PS: To execute the workflow you need to download the London bike sharing dataset from Kaggle.


Hello Kathrin,

Thanks a lot for your explanation.

This part made it much clearer to me how KNIME handles the input shape.

Hi Kathrin,

I tried to run your workflow ("Multivariate_time_Series_Predictions-kathrin") but it failed: it seems to hang in an infinite loop at the Keras Network Learner node. I reduced the size of the dataset but the problem remains. Do you know what may cause this issue? Thanks, Stephane