Hello, I’d like to use a recursive row loop process to forecast future daily sales. I’m including a workflow with 2017-2018 dates and random sales generated. I’d like to have a loop that creates 2019-01-01 thru 2019-12-31 using a five-part moving average process. Example: to generate 2019-01-01 sales data row, take the following prior sales rows and average them:
12/25/18
12/18/18
12/11/18
12/4/18
1/2/18
Can anyone build out this workflow quick from the attached starting data? Thanks!
I am not sure if I understood exactly what you want to do. I have looked into your workflow but I don’t get how you determine that 2019-01-01 should be generated by those rows:
Here is something you can try. There’s no looping involved, it’s just calculating an average based on the past 5 weeks of data. It’s probably not exactly what you want, but maybe it will help.
Thanks, Scott. This is what I started with as a first version for predicting future sales. For every single day of business, I took the past four weeks for that same day along with the day to predict from 52 weeks ago.
To use this five-week averaging method, I re-forecast 2018 dates using the averaging method and then did a date shift to move each date to a 2019 date. The problem with shifting rather than creating new rows is the earlier part of the 2018 year pulled down sales predictions for early 2019 too much. This is why I’d like to take the last four weeks of 2018 and generate 2019 rows forward rather than date shift.
Simon. Thanks for the replies. I replied to Scott in attempt to better explain the problem at hand. Looking to generate new 2019 sales forecast based on 2018 rather than date shift 2018 dates => 2019 dates which is the first version of what I’ve done.
Thanks for any input you can provide on creating new rows based on an average of prior rows (creating lag columns for each prior value to include in forecasting the future row.
I would like to point you to a few resources where they tried to predict future sales at a german retail chain on Kaggle. Maybe you might get some useful insights.
One thing to consider might be special things like (regional) holidays or events like easter that will not be at the same date every year, and the holidays might not only affect the day itself but the days before and after. You might want to take that into account.
Trying to predict future sales on Kaggle
Interview with winner
Impute Data with R’s Amelia
(I have an article about Amelia and time series imputation somewhere in the internal forum at the office, might look that one up)
I’d still like to see if anyone can solve this very rudimentary method for ‘forecasting’ prior diving into any R integrations in KNIME. Here’s an attempt to better explain the problem:
I’m looking to create a sales forecast by averaging together a few weeks of data.
Well, you asked for it. Might be a little bit over engineered … But I still wonder how to handle a loop if I have to reflect on something and ‘remember’ it - yes I can append something at the end of a loop, but sometimes my first idea is to store the intermediate results.
You should really check if you just want such a simple prediction, but anyway, here it comes.
I suggest for beginning to take a look at Moving Aggregation or Moving Average node which I used. It will give you only a “forecast” for next week but as well it will give you indication how good is it because you can compare it
I know it’s a simple forecast. What I’m actually doing is four recent weeks plus the forecast week or day last year. So an average of five weeks with a 7,14,21,28, and 364 lag. I plan to improve this forecasting method eventually but wanted to start with this as I’m also building a pipeline and basic BI reporting system at the same time.
I think both of your inputs are incredibly valuable and I look forward to playing around with them.