I have built a workflow that fits a time series model with an optimized artificial neural network. The goal is to predict each SKU's weekly sales by choosing the best model from 465 different ANN architectures. Predicting a single SKU takes approximately 3-4 minutes, and the main goal is to predict all 35,000 SKUs in less than 24 hours, so I cannot do it on my laptop.
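For scale, here is a quick back-of-envelope in Python (using ~3.5 minutes per SKU, the midpoint of my estimate):

```python
# Rough back-of-envelope: how much parallelism do 35,000 SKUs in 24 h need?
# Assumes ~3.5 minutes per SKU, as described above.
minutes_per_sku = 3.5
total_skus = 35_000

sequential_hours = total_skus * minutes_per_sku / 60
print(f"Sequential runtime: {sequential_hours:,.0f} h (~{sequential_hours / 24:.0f} days)")

workers_needed = sequential_hours / 24
print(f"Parallel workers needed to finish in 24 h: ~{workers_needed:.0f}")
```

That comes out to roughly 2,000 hours sequentially, i.e. around 85 machines (or cores) working in parallel just to hit the 24-hour window.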
I was chatting with a friend and he told me that I need to use Docker (the whale) and then run all the SKUs in the cloud. The problem is that I have no idea how to do it.
I’m probably not understanding your problem correctly, but is there a way to do this without training an NN for each individual SKU?
If you were able to reduce the number of possible NN architectures down to a single model that performs reasonably well for all SKUs - or even a small group of possible NNs - then you could train those models once, export them to PMML, and then run predictions for your 35,000 SKUs in a separate deployment workflow (something similar to https://www.knime.com/nodeguide/applications/churn-prediction/deploying-the-churn-predictor).
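To sketch what the scoring half of that deployment could look like outside KNIME (an assumption on my part, using the pypmml package and a made-up sales_model.pmml with two lag features; your exported model and column names will differ):

```python
# Minimal sketch: load a PMML model exported from the training workflow and
# score a batch of rows with it. Assumes the `pypmml` package and a
# hypothetical 'sales_model.pmml' whose inputs are 'lag_1' and 'lag_2'.
import pandas as pd
from pypmml import Model

model = Model.load("sales_model.pmml")  # trained once, reused for every prediction

# Hypothetical batch of feature rows (one per SKU/week to score)
batch = pd.DataFrame({
    "lag_1": [120.0, 45.0, 300.0],
    "lag_2": [115.0, 50.0, 290.0],
})

predictions = model.predict(batch)  # scoring only; no training happens here
print(predictions)
```

The point is that scoring an already-trained model takes milliseconds per row, so the 24-hour budget stops being a problem if you can avoid retraining per SKU.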
Also, are you doing any dimensionality reduction on your data? That might cut the training time of each candidate model across all those iterations.
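As a generic illustration (scikit-learn, with random stand-in data rather than your actual features), something like PCA can shrink the inputs before the architecture search:

```python
# Sketch: reduce feature dimensionality with PCA before feeding the network.
# Assumes scikit-learn and a hypothetical feature matrix X (rows = weeks).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))   # 200 weeks, 50 engineered features

pca = PCA(n_components=0.95)     # keep components explaining 95% of variance
X_reduced = pca.fit_transform(X)

print(f"{X.shape[1]} features -> {X_reduced.shape[1]} components")
```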
Thanks for your answer. My client wants one model per SKU; it is not possible to reduce or cluster the SKUs in any way. I checked the different ANN architectures to see if some of them could be discarded, but the winners span the entire range. Unfortunately, there is no single architecture that works for all SKUs.
Another way to reduce the time would be to create a powerful virtual machine, but the work still has to be parallelized. So now I want to try Docker, which I have never used before.
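To make the parallelization idea concrete, this is the kind of fan-out I have in mind (a toy Python sketch; train_and_predict is a placeholder name standing in for the real per-SKU workflow):

```python
# Toy sketch of per-SKU fan-out across CPU cores. The real work (the KNIME
# workflow / ANN architecture search) is stubbed out by `train_and_predict`.
from concurrent.futures import ProcessPoolExecutor
import time

def train_and_predict(sku: str) -> tuple[str, float]:
    """Placeholder for the 3-4 minute per-SKU model search + forecast."""
    time.sleep(0.01)              # stand-in for the real training time
    return sku, 42.0              # dummy forecast value

if __name__ == "__main__":
    skus = [f"SKU-{i:05d}" for i in range(100)]   # would be 35,000 in reality
    with ProcessPoolExecutor() as pool:           # one worker per CPU core
        for sku, forecast in pool.map(train_and_predict, skus):
            pass                                   # collect/write results here
```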
This approach can be implemented using both AWS and Azure. We have some additional resources available to help you configure your cluster if you decide to go this route - check out the blog post and let me know.
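The usual pattern is to make each container responsible for one slice of the SKU list, selected via environment variables. A hypothetical sketch (SHARD_INDEX and SHARD_COUNT are names I made up for illustration, not something from the blog post):

```python
# Sketch: each container/VM in the cluster processes one shard of the SKU list.
# SHARD_INDEX / SHARD_COUNT are hypothetical env vars you would set per container,
# e.g. docker run -e SHARD_INDEX=3 -e SHARD_COUNT=100 my-forecast-image
import os

shard_index = int(os.environ.get("SHARD_INDEX", "0"))
shard_count = int(os.environ.get("SHARD_COUNT", "1"))

all_skus = [f"SKU-{i:05d}" for i in range(35_000)]   # in practice, read from a file/DB
my_skus = all_skus[shard_index::shard_count]          # every shard_count-th SKU

print(f"Shard {shard_index}/{shard_count} handles {len(my_skus)} SKUs")
# ...run the per-SKU training/prediction over `my_skus` here...
```

With 100 shards of ~350 SKUs each at 3-4 minutes per SKU, every container finishes in roughly 20 hours, which fits inside your 24-hour window.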