SARIMA Predictor (Labs) - DATA-INPUT-PORT missing?

Hi there.

I have a question concerning the “SARIMA Predictor” node.

I created a model in KNIME using the “SARIMA Learner” node. Now I would like to apply it using the “SARIMA Predictor” node:

SARIMA Predictor – KNIME Community Hub

But I am wondering why there is no input port for the new data. Did I get something wrong? Where can I bring in the new data to which I would like to apply the model?

Similar nodes like the “Random Forest Predictor” have two input ports, one for the model and one for the new data. So why the difference here, and how is the node supposed to be applied?

Greetings
Jakob

Not my area of expertise, but the SARIMA Predictor uses the model created by the Learner and projects it forward the selected number of periods. This works differently than a numeric model.
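To illustrate why a forecasting predictor can get by with only a model input, here is a minimal Python sketch. It is not the code behind the KNIME nodes: it uses a toy AR(1) model instead of SARIMA, and the names `ToyAR1Model`, `fit_ar1` and `forecast` are made up for this example. The point is only that the fitted model already carries its coefficients and the end of the training series, so projecting forward needs a number of periods and nothing else.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyAR1Model:
    """Hypothetical stand-in for a fitted time series model."""
    intercept: float
    phi: float         # autoregressive coefficient
    last_value: float  # last observation seen during training


def fit_ar1(series: List[float]) -> ToyAR1Model:
    """Least-squares fit of y[t] = intercept + phi * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    return ToyAR1Model(mean_y - phi * mean_x, phi, series[-1])


def forecast(model: ToyAR1Model, steps: int) -> List[float]:
    """Project forward from the stored state; no data input is required."""
    out = []
    prev = model.last_value
    for _ in range(steps):
        prev = model.intercept + model.phi * prev
        out.append(prev)
    return out


# The "learner" step sees the data ...
training = [10.0, 12.0, 13.0, 13.5, 13.75]
model = fit_ar1(training)

# ... the "predictor" step only needs the model and a horizon.
print(forecast(model, 3))
```

A classifier such as a random forest is different: it maps each new row of features to a prediction, so it needs those rows as a second input.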


Have not used this one myself, but what I always do is inspect other workflows available on the Hub that use the new node.

You can find them if you go back to the link you posted and scroll a fair bit down.

You can open any of the examples. I typically look for ones that were uploaded by KNIME, as they are usually easy to follow.

Hi @rfeigel and @MartinDDDD ,

thank you for your answers and your hints. Unfortunately I still have no clue how to solve this problem.

This is a screenshot of the workflow I created, with the model working fine:

image

Since I want to apply it to new data, I wrote it to my personal folder with the Model Writer.

Then I created a new workflow and read the model into it:

image

As you can see, there is no possibility to apply new data to the created model.

There must be something I’m getting wrong, since there is no input port for new data in the SARIMA Predictor. Here’s a screenshot of Corey Weisinger’s workflow. As you can see, there is no input port for additional data, only for the model:

I would have expected an input port like the one available in other predictor nodes:

Create a model with your new data. That’s the way it works.

I’m with rfeigel here.

The way I understand time series and SARIMA is that it picks up on the patterns / seasonality of your training data and then allows you to forecast x periods into the future.

So in comparison to decision trees you are not really applying a model to unseen data…

If you want to test how well SARIMA works or optimize its parameters, you can train a model on only a portion of your data, forecast until the end of your available data, and compare the forecast with the actual values. Once you are happy with the outcome, you train on all available data and then forecast the “unknown” future.
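The holdout procedure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: a seasonal naive forecaster (repeat the last season) stands in for SARIMA so the example stays self-contained, and the series, `season` and `holdout` values are invented.

```python
from typing import List


def seasonal_naive_forecast(history: List[float], steps: int, season: int) -> List[float]:
    """Stand-in forecaster: repeat the last full season of the history."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(steps)]


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented series with a season length of 4 and a slow upward drift.
series = [10, 20, 30, 40, 11, 21, 31, 41, 12, 22, 32, 42]
season = 4
holdout = 4

# 1) Train on a portion of the data and forecast up to the end of it.
train, test = series[:-holdout], series[-holdout:]
backtest = seasonal_naive_forecast(train, holdout, season)

# 2) Compare the forecast with the values that were held back.
mae = mean_absolute_error(test, backtest)
print("holdout MAE:", mae)

# 3) Once satisfied, refit on all data and forecast the unknown future.
future = seasonal_naive_forecast(series, 4, season)
print("forecast:", future)
```

With a real SARIMA model, steps 1 and 3 would each be a separate training run, and the error from step 2 is what you would compare across parameter settings.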

Hi @rfeigel and @MartinDDDD,

thanks again for your support.

I had already thought about creating a model with the new data as Plan B, but I was curious how it is supposed to be solved in a “professional” way.

Maybe this is already the way it is supposed to be, like you have written, “not applying a model to unseen data in SARIMA”.

And cross-validation is included in the model, of course.

Thank you very much for your help!

Greetings
Jakob

Hello @JakobJosef,

I wrote the script behind the SARIMA (Labs) nodes. SARIMA models belong to the family of autoregressive models. The model learns from the training data, computes the coefficients, and the final equation is used to create “forecasts”.

Part of the feedback that we (@Corey and I) got from some industry experts was to allow applying the coefficients to a new set of data. However, this method is the least preferred, because the dynamics of the new data are highly likely to differ from those of the data on which the coefficients were computed.

As mentioned above, the only workaround is to train a new model with the new data. Please feel free to share your ideas for improving the nodes from the new time series extension by posting on the forum, and we will incorporate them in future releases.
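A small Python sketch of that concern, under stated assumptions: it uses a toy AR(1) model rather than SARIMA, the two series are invented so that they follow different dynamics, and `fit_ar1` and `one_step_mae` are hypothetical helpers, not part of the KNIME extension.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y[t] = intercept + phi * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    return mean_y - phi * mean_x, phi


def one_step_mae(series: List[float], intercept: float, phi: float) -> float:
    """Mean absolute one-step-ahead error of fixed coefficients on a series."""
    errors = [
        abs(series[t] - (intercept + phi * series[t - 1]))
        for t in range(1, len(series))
    ]
    return sum(errors) / len(errors)


old = [10.0, 12.0, 13.0, 13.5, 13.75]  # follows y[t] = 7 + 0.5 * y[t-1]
new = [4.0, 6.0, 5.0, 5.5, 5.25]       # follows y[t] = 8 - 0.5 * y[t-1]

old_coeffs = fit_ar1(old)
new_coeffs = fit_ar1(new)

# Reusing the old coefficients on the new data versus refitting on it.
reused_mae = one_step_mae(new, *old_coeffs)
refit_mae = one_step_mae(new, *new_coeffs)
print("reused coefficients:", reused_mae)
print("refit on new data:  ", refit_mae)
```

In this constructed case the reused coefficients miss badly while the refit matches the new series, which is the situation the preference for retraining guards against. With real data the gap depends on how much the dynamics have actually changed.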

Thanks,

Best,
Ali

Hi @aliasghar_marvi,
wow, I’m so excited and honored to get feedback and help from the experts and architects of the KNIME nodes once again.

Thank you very much for your explanation. I already went on with the “workaround” (which in fact isn’t one) and results are good (and way better than the solution/results we had before). I was just curious about how it is supposed to be solved but I’m happy now since I know that this “workaround” is the right way.

I’m really excited about KNIME and the support one can get in this forum, thank you very much <3.

Of course I’ll try the new time series extension nodes.

Greetings
Jakob
