Time series forecasting with MLP

Hi,

as I understand it, forecasting (i.e., predicting future values of given data) can be accomplished in two steps in KNIME:

  1. Fitting a model to the available data: using the Learner and Predictor nodes on the split data (training and test sets)
  2. Forecasting by means of the fitted model: either using some newly available data (in case of classification) or the already available data set (in case of regression) 

The question is how, in step 2, the forecasting horizon (i.e., the number of periods to forecast) can be set for a time series forecast in KNIME:

  • With ARIMA nodes it is straightforward as the ARIMA Predictor has this parameter (“number of periods to forecast”)
  • However, when an MLP is used for time series forecasting, the MLP Predictor node does not have any parameter related to the forecasting horizon. (The same goes for the Regression Predictor node in the case of Linear or Polynomial Regression.)

I looked at the “Big Data, Smart Energy, and Predictive Analytics” white paper and the corresponding workflow (050010_Energy_Usage_Time_Series_Prediction) but could not find the answer.

Could anyone give me a hint?

Thanks in advance!

 

Hello Ag47,

Forecasting is a little bit more complicated.

If you have some inputs, for example if you want to predict the DAX index given the values of gold, oil, and the dollar, you use regression. This reveals the relationship between those inputs and your target, the DAX index (of course only if there is one ;-)).
In this case you of course need the inputs for the future dates in order to predict the index.

If you have only a target, for example you have the number of users on your website and you want to predict how many will visit your site in the next month, you have to generate your input.
Here I would order my training data (#users in the past) and assign a counter. To predict the next month I add 30 empty rows and let the counter continue there, so that the empty rows follow the training rows. Then I let the Regression Predictor do its job and I have a forecast.
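If it helps, the same counter idea looks roughly like this in a few lines of Python (just a sketch: the user counts and the 30-day horizon are made-up numbers for illustration, and in KNIME the counter column and the empty future rows would come from the corresponding nodes):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Training target: daily user counts from the past (made-up numbers)
users = np.array([120, 130, 128, 140, 150, 155, 160, 162, 170, 175], dtype=float)

# Generate the input: a counter 0..n-1 over the ordered training rows
counter = np.arange(len(users)).reshape(-1, 1)

model = LinearRegression()
model.fit(counter, users)

# "Add 30 empty rows": let the counter continue past the training data
future_counter = np.arange(len(users), len(users) + 30).reshape(-1, 1)
forecast = model.predict(future_counter)

print(forecast[:5])  # forecast for the first 5 future days
```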

But this is only a very high-level outline, not a detailed description.
Have a look at the workflows 001001 and 001002 to learn more about time series mining.

Also, if you provide some more information about your project and your data, I might be able to give more specific tips or strategies.

Best,
Ferry

Hi Ferry,

thanks for your reply!

I looked at the workflows 001001 and 001002, but those are about predicting on the training and test sets, nothing about forecasting.

I have only the target variable (no other inputs) and I was not able to put together a workflow in KNIME that uses MLP. In R it is quite simple to use a neural network to do forecasting, but I wanted to try it out in KNIME, as MLP apparently allows more control over the neural network settings.

I am not sure I fully understood your proposal to forecast with the counter, so if you could give an example or point me to an existing workflow, that would be really appreciated!

Btw, I have not seen any KNIME workflow that uses MLP to forecast; maybe I have not searched hard enough...

Thanks!

Ag47

Hello Ag47,

A Multilayer Perceptron is a neural network that maps a set of inputs onto a set of outputs. This is called prediction. You can see it as a mathematical function f(x,y,z) = (a,b,c).
So to forecast you need to give it some input.

Let's look at the problem as an MLP function. You are given a set of values Y. What is the input? Probably the order: your data is ordered by date/time, and you want to forecast the next couple of entries in this list.
So you are given f(0) … f(n) and you want f(n+1) … f(n+m). If your X is not given as a date or something similar, you need to generate an X, for example with the Counter Generator. Then you can feed your predictor with n+1 … n+m, which is a forecast.
See the attached example.

If you have a date/time given, things become interesting, because now you can split the date/time into its parts (year, month, week, day of week, day of year, etc.) and give your MLP some real input, which it can use to detect patterns. Then you just create the dates in the future, feed them into the predictor, and it gives you a beautiful forecast.
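Just to illustrate the date-part idea outside of KNIME, a rough Python sketch could look like the following (the hourly frequency, the column names, and the network size are only assumptions for the example; in KNIME the splitting would be done with the date/time extraction nodes):

```python
import pandas as pd
from sklearn.neural_network import MLPRegressor

# Hypothetical hourly series indexed by date/time
df = pd.DataFrame({
    "timestamp": pd.date_range("2015-01-01", periods=500, freq="H"),
    "load": range(500),  # placeholder target values
})

def date_parts(ts):
    """Split the date/time into parts the network can learn patterns from."""
    return pd.DataFrame({
        "year": ts.dt.year,
        "month": ts.dt.month,
        "day_of_week": ts.dt.dayofweek,
        "day_of_year": ts.dt.dayofyear,
        "hour": ts.dt.hour,
    })

X_train = date_parts(df["timestamp"])
y_train = df["load"]

mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)

# Create the dates in the future and feed them to the predictor
future = pd.Series(pd.date_range(df["timestamp"].iloc[-1] + pd.Timedelta(hours=1),
                                 periods=48, freq="H"))
forecast = mlp.predict(date_parts(future))
```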

I don't know much about R, but this is how I would do "forecasting" in KNIME.

If someone has a better solution feel free to step in.

Hi Ferry,

Thanks so much for the example!

I think I understood your approach.

It seems you are using the whole available data set for training the model (rows 0 to 4958) and the next 1000 empty rows as both the test and the forecast set, which means you are not really able to assess how closely the MLP model fits your data. But I think it can be refined so that part of the existing data is reserved for testing, e.g., splitting with the Row Splitter like this:

1. training: (rows 0 - 4000)

2. test and forecast: (rows 4001 - 4958) + (empty rows 0 - 999)

This way you can evaluate the model generated by the MLP by comparing the predictions for rows 4001 - 4958 to their original values.

And also I think the training part can be controlled better using the Lag Column and Partitioning nodes.

(The lag is used for autoregression: https://www.otexts.org/fpp/9/3)
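To illustrate what I mean, a rough pandas sketch of the lagged columns and the split could look like this (the file name, the column name, and the lag of 3 are only placeholders; in KNIME the lag columns would come from the Lag Column node):

```python
import pandas as pd

# Hypothetical input; in KNIME the Lag Column node would produce these columns
series = pd.read_csv("energy_usage.csv")["value"]  # assumed file/column names
df = pd.DataFrame({"value": series})

lag = 3  # autoregression order
for k in range(1, lag + 1):
    df[f"value_lag{k}"] = df["value"].shift(k)
df = df.dropna()  # the first `lag` rows have missing lag values

# Earlier rows for training, later rows held out to evaluate the fit
train = df.iloc[:4001]   # rows 0 - 4000
test = df.iloc[4001:]    # rows 4001 onwards

X_train, y_train = train.drop(columns="value"), train["value"]
X_test, y_test = test.drop(columns="value"), test["value"]
```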

I will build a workflow and come back with the findings.

Thanks for your help!

Ag47

Hi Ag47,

You are absolutely right, this was just a rough sketch to demonstrate my idea with the counter.

I am really excited and looking forward to hearing about your results and maybe seeing your workflow!

Best,
Ferry

Hi Ferry,

I started to tweak your workflow and found the following:

  • The Denormalizer node outputs values only in the [0,1] range; it looks as if the re-transformation did not work (the original data has values > 1 in many cases). I have seen this with other workflows as well.
  • I added the Lag Column node to create the autoregression scenario; as far as I know this is needed for time series forecasting with neural networks. However, the MLP Predictor throws the error “Execute failed: Input DataTable should not contain missing values.” (The MLP Learner has the “Ignore Missing Values” option set.) The situation seems similar to your original flow, i.e., the MLP Predictor also gets missing values as input there, yet it still executes successfully. I tried placing the Lag Column node in different positions in the flow, but that did not help.

Attached is the modified workflow; I would appreciate any help.

Thanks

Ag47

 

Hey Ag47,

Sorry for the late reply.

The Denormalizer works just fine, but of course you have to apply it to the data. In your example you denormalize your predictions, but not your actual data. I changed that and now I get the exact input results (after normalizing and denormalizing).
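The point is maybe easiest to see in a few lines of Python, with MinMaxScaler standing in for the Normalizer/Denormalizer node pair (the numbers are made up; it is only a sketch of the idea):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.array([[2.0], [5.0], [3.5], [8.0]])  # original data with values > 1

scaler = MinMaxScaler()                  # same role as the Normalizer node
normalized = scaler.fit_transform(values)

# ... train and predict on the normalized data ...
predictions_norm = normalized            # stand-in for the model output

# Apply the same (de)normalization model to BOTH the predictions and the actual data
predictions = scaler.inverse_transform(predictions_norm)
actual = scaler.inverse_transform(normalized)  # recovers the exact input values
```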

The MLP Learner can ignore the missing values, but the MLP Predictor (of course) cannot, as it needs inputs for the network to react to. (Have a look at this book if you want to learn more about how all of this works.)
You need to insert the last predictions into the lag columns.

Attached is a modified version of your workflow that uses your "based on the last predictions" approach.
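Written outside of KNIME, the "insert the last predictions into the lag columns" loop is roughly the following (just a sketch: `model` stands for any fitted regressor with a scikit-learn-style predict, `last_values` are the most recent observed lag values, and the lag order is assumed to match what the model was trained on):

```python
import numpy as np

def recursive_forecast(model, last_values, horizon):
    """Forecast `horizon` steps ahead by feeding each prediction back
    into the lag columns of the next step."""
    lags = list(last_values)   # most recent observed values, oldest first
    predictions = []
    for _ in range(horizon):
        x = np.array(lags[-len(last_values):]).reshape(1, -1)
        y_hat = float(model.predict(x)[0])
        predictions.append(y_hat)
        lags.append(y_hat)     # the prediction becomes a lag for the next step
    return predictions
```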

Best,
Ferry

Hi Ferry,

Thanks! That workflow looks a little bit complex, are you a KNIME power user? :)

After studying it let me ask 3 questions:

  • Is your last-prediction approach a commonly used technique? I have read about NNs but cannot remember seeing it.
  • Can this be generalized to any lag value? As I see it, only the lag needs to be set in the Lag Column node, along with the corresponding number of Rule Engine nodes.
  • How can you determine the proper network parameters (number of input neurons and hidden neurons/layers)? (As I understand it, the lag value determines the number of input neurons.) Can some kind of cross-validation be used for that purpose?

Thanks for your support!

Ag47

 

Hello Ag47,

In that case you don't want to see my normal workflows :-P
I actually work for KNIME.

  1. Actually I do not know how common that technique is, but it is the logical solution to the problem that an MLP needs an input.
  2. Yes, the number of previous values is basically only limited by the number of training records. The workflow probably needs some adjustment for more values, and it can definitely be solved more elegantly than with the Rule Engine pipeline.
  3. That is the key question for neural networks. You could read a couple of books about it; it is not an easy question. But in general the KNIME Analytics Platform offers the Parameter Optimization Loop to determine the best set of parameters for any learner. Have a look at the EXAMPLES workflow 04_Analytics/11_Optimization/03_Parameter_Optimization, and see the sketch below for the general idea.
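A rough sketch of the same idea in plain Python, using a grid search with time-ordered cross-validation folds (the candidate parameter values are arbitrary examples, and `X_train`/`y_train` stand for the lagged training data):

```python
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neural_network import MLPRegressor

# Candidate network parameters to search over (values are just examples)
param_grid = {
    "hidden_layer_sizes": [(5,), (10,), (10, 5)],
    "alpha": [1e-4, 1e-3, 1e-2],
}

search = GridSearchCV(
    MLPRegressor(max_iter=2000, random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=5),   # time-ordered folds instead of random CV
    scoring="neg_mean_squared_error",
)
# search.fit(X_train, y_train) would then pick the best parameter set,
# similar in spirit to KNIME's Parameter Optimization Loop.
```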

Best,
Ferry

Hi ag47,

I'm new to KNIME and I'm trying to build a workflow using Rprop and the MLP Predictor. Can you please help me with the "lag" thing? I know that using lagged values of the time series can help, but what can I do if I have 4 input variables, for example?

Thank you for your help

FF

Hi,

I used the “based on the last predictions” approach from ferry.abt with my own time series because it was exactly what I was looking for. Most time series prediction models are autoregression models. These predict values which already exist (since a neural network always needs an input) and just compare them with the real values. This may be a prediction, but it is far away from really looking into the future (forecasting). As ferry.abt reasoned, we need a prediction based on predicted values. However, I found some problems with that which I want to share here:

  • The biggest problem was that the predicted values began to run up against a fixed barrier just 2-3 points into the future.
  • I could not solve this problem by varying the hidden layers or the number of neurons.
  • The sampling rate is definitely a big factor here, and also how many points in the past the training uses.

I think for short-term prediction this is a nice model. However, I was not able to predict a big interval of thousands of points into the future. I think that must be possible in some way, because the model is trained for it and knows exactly which point comes next.

Greets