I have built a time series prediction model and I am currently struggling to apply the model and predict future values. By using linear regression, the model predicts the volume of tweets in a specific region. What I need is a volume prediction for a specific time period (e.g. 30 days).
Can you please provide support?
Welcome to the KNIME Forum! What does your training data look like, and what exactly do you want to achieve? Do you want to predict the number of tweets occurring within the next 30 days, or the number of tweets 30 days in the future? In either case, do you have all the features for this time frame?
Thank you for your reply. I want to predict the future starting from the last point of my test set. The performance of the model is OK, so what I need is to apply the model and forecast the number of tweets in the future.
Here is a snapshot of my training data.
It seems like you want to do rolling predictions, i.e. predict the future value based on the previous values. For this you need to use the Recursive Loop Start and Recursive Loop End nodes. I have created a simple example workflow:
In the first iteration you pass in a slice of the training data containing the last n rows, i.e. as many as you need for lagging. Inside the loop you append another row with the next timestamp and a missing value as the target. Then you can use the Lag Column node, make a prediction for the target in the last row, and write that prediction into the table that came into the loop. In the Recursive Loop End you pass back the table with the newly appended predicted row, and the next iteration does the same thing again. By setting the maximum number of iterations, you control how many days into the future you predict.
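If it is easier to follow as code, the loop logic above could be sketched in Python roughly like this. This is only an illustration of the idea, not what the KNIME nodes do internally: `predict_next` is a hypothetical stand-in for an already trained linear regression on lagged values, and the coefficients, intercept and sample history are made up.

```python
from datetime import date, timedelta


def predict_next(lags, coefficients, intercept):
    """Hypothetical stand-in for a trained linear regression on lagged values.

    lags[0] is the most recent value (lag 1), lags[1] is lag 2, and so on.
    """
    return intercept + sum(c * x for c, x in zip(coefficients, lags))


def rolling_forecast(history, coefficients, intercept, horizon):
    """Predict `horizon` days ahead, feeding each prediction back in as a lag.

    history is a list of (date, value) tuples, oldest first.
    """
    n_lags = len(coefficients)
    if len(history) < n_lags:
        raise ValueError("need at least as many history rows as lags")

    # First iteration: only the last n rows needed for lagging are passed in.
    table = list(history[-n_lags:])

    for _ in range(horizon):  # corresponds to the maximum number of iterations
        # Append a row for the next timestamp; its target is not known yet.
        next_day = table[-1][0] + timedelta(days=1)
        # Build the lag features from the most recent rows (newest first).
        lags = [value for _, value in reversed(table[-n_lags:])]
        # Predict the missing target and pass the extended table back.
        table.append((next_day, predict_next(lags, coefficients, intercept)))

    # Return only the newly predicted rows.
    return table[n_lags:]


# Made-up example: two lags, three days ahead.
sample_history = [(date(2020, 1, 1), 10.0), (date(2020, 1, 2), 12.0)]
forecast = rolling_forecast(sample_history, [1.0, 0.0], 1.0, horizon=3)
for day, value in forecast:
    print(day.isoformat(), value)
```

Note that every step after the first uses predicted values as inputs, so errors can accumulate the further out you forecast.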
I have the same work problem as MiladH, but I think I’m encountering different errors/warnings.
I’m new to KNIME (~1 month) and still learning the ropes. I have a dataset that includes electricity consumption (daily minimum kW and maximum kW values) as well as weather data (such as cooling degree days) for the last 2.5 years. I’m interested in predicting min kW (and max kW, in a separate analysis) using its own lagged values, as well as lagged max kW values, and current and lagged cooling degree days. Using the Linear Regression Learner, I want to train my model, see how well it predicts my hold-out sample, and then use it to forecast 2 years out of sample.
I’m following your Rolling Time Series Prediction workflow as much as possible (https://kni.me/w/nbPazdSVY_eVPAOe). I’ve sectioned off the last 30 rows in my Row Filter.
I’ve downloaded Anaconda3 and updated the Python settings within KNIME.
Are the errors that I’m getting because I’m creating my lags before my Recursive Loop Start?
Thanks for any and all help!
The error shown in the console does not really match the screenshot of your workflow. There seems to be no Python node in the workflow and the Regression Predictor is failing, but the message in the console comes from a Python Learner node. You need to install the package “statsmodels” into your Anaconda environment “py35_knime”. I think you can use the following command in the Anaconda console (unfortunately I have no Windows system to test it):
conda install -n py35_knime -c anaconda statsmodels
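To check whether the install worked, a small script like the following could be run with the Python interpreter of that environment. It is just a sketch: `is_installed` is a helper name made up for this example, and it only tells you whether the package can be found by that interpreter, not whether KNIME is configured to use the same environment.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the given top-level package can be found by this interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("statsmodels available:", is_installed("statsmodels"))
```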
Judging by your screenshot, you have several different subworkflows going on (as shown in the outline pane) and I suspect your error messages are being generated by one of those.
If Alexander’s suggestion doesn’t help, could you come up with a minimal example workflow, using the nodes of interest for this particular error only, and upload the workflow itself (instead of a screenshot)?