I was watching a great webinar, "Integrated Deployment: How to Move Data Science into Production", on KNIME TV, available on YouTube (https://www.youtube.com/watch?v=MJq154i4O-0).
In it, the presenter, Michael Berthold, demonstrates a workflow that he mentions is available on the KNIME Hub (https://youtu.be/MJq154i4O-0?t=723), but somehow I am not able to find it.
I am currently working on a similar kind of problem (especially the monitoring and retraining of the model, and deploying the chosen model). The above-mentioned workflow would be very useful to me.
Can someone share the workflow? It would be very helpful.
P.S.: Sorry, I am not sure this forum is suitable for this kind of question; please do let me know. I will refrain from posting such questions in the future, if required.
Right now there are three different posts that go into detail about how everything works, with links to several example workflows.
With respect to the particular component highlighted in Michael’s video - the one about monitoring and re-training - I believe that may not be published yet. Let me double check internally and see what I can find out.
Hi @Wizard_dk,
At the moment we do not have a Component available for the monitoring part yet.
We are working on it, but I cannot promise a timeline.
Sorry about that. If you use Integrated Deployment you should be able to re-execute pieces of workflows on demand. Something like:
1. Train your model on the training set (generic Learner node).
2. Score the model on the test set (generic Predictor node).
3. Capture the scoring (the generic Predictor node) with Integrated Deployment (Capture Workflow Start and Capture Workflow End nodes).
4. Deploy the captured scoring workflow as a REST API on KNIME Server via Integrated Deployment (Deploy node).
5. Capture everything from point 1 to point 4 (Learner node + captured Predictor node + Deploy node) with Integrated Deployment (Capture Workflow Start and Capture Workflow End nodes).
6. In a separate workflow, query for new data for which you have ground truth (maybe from a frequently updated database with a timestamp column).
7. Check whether performance has dropped below a threshold you decide on.
8. If it has, execute the workflow captured in point 5 to retrain the model and redeploy it.
The cool part is that with point 8, by calling that captured workflow, you set off a chain reaction that repeats steps 1 to 8, and the whole process restarts.
It might be a bit tricky to wrap your head around, but it should do the job; there is a rough sketch of the logic below.
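To make the control flow concrete, here is a rough Python sketch of steps 6 to 8. This is not KNIME code (in KNIME each of these pieces is a node), and the endpoint URLs, credentials, and helper names are placeholders I made up; check the KNIME Server documentation for the real REST paths of a deployed workflow.

```python
import requests

THRESHOLD = 0.85  # step 7: the performance threshold you decide on


def query_new_labelled_data():
    """Step 6: fetch new rows whose ground truth is already known,
    e.g. from a frequently updated database with a timestamp column.
    Stubbed out here."""
    raise NotImplementedError


def score_with_deployed_workflow(rows):
    """Call the scoring workflow deployed as a REST API in step 4.
    The URL is a placeholder, not a real KNIME Server path."""
    resp = requests.post(
        "https://my-knime-server/rest/scoring",  # hypothetical endpoint
        json={"rows": rows},
        auth=("user", "password"),
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["predictions"]


def trigger_retraining():
    """Step 8: re-execute the captured train + capture + deploy
    workflow from point 5, which retrains and redeploys the model."""
    resp = requests.post(
        "https://my-knime-server/rest/retrain",  # hypothetical endpoint
        auth=("user", "password"),
        timeout=600,
    )
    resp.raise_for_status()


def monitoring_cycle():
    rows, truth = query_new_labelled_data()
    preds = score_with_deployed_workflow(rows)
    accuracy = sum(p == t for p, t in zip(preds, truth)) / len(truth)
    if accuracy < THRESHOLD:   # step 7
        trigger_retraining()   # step 8: the chain reaction restarts
```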
I took for granted that you have access to a KNIME Server. If you don't, it should still work locally.
Just replace the Deploy node with a Workflow Writer node.
Let me know if you have any questions.
Thanks, Paolo.
I am following your Integrated Deployment Blog Series.
I am doing something similar to what is suggested here, like comparing the model performance metric against a defined threshold and plotting the metric on a line plot.
Things are smooth for now; I will let you know if I need any help.
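For anyone curious, the line plot I mention is roughly this in plain Python with matplotlib (the history values and names are made up for illustration; inside KNIME the plotting is just a node):

```python
import matplotlib.pyplot as plt

# Hypothetical monitoring history: one (date, accuracy) pair per scoring run.
history = [
    ("2021-01-04", 0.91),
    ("2021-01-11", 0.89),
    ("2021-01-18", 0.86),
    ("2021-01-25", 0.83),  # drops below the threshold here
]
THRESHOLD = 0.85  # same threshold used to trigger retraining

dates = [d for d, _ in history]
scores = [s for _, s in history]

plt.plot(dates, scores, marker="o", label="accuracy")
plt.axhline(THRESHOLD, linestyle="--", color="red", label="retrain threshold")
plt.ylabel("accuracy on new labelled data")
plt.title("Model performance over time")
plt.legend()
plt.show()
```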
Hi @paolotamag,
Good to hear that the model monitoring component is on its way!
I was also able to implement something similar to your suggestion: scoring from the captured scoring pipeline, monitoring on top of it, and showing a graph of model performance over time, plus retraining the model on demand from the captured training pipeline and deploying the new model (model selection) as the user wishes.
XAI looks cool. I have tried these things in Python; it is good to have them in KNIME. I will definitely use them in the future!
Hello there, we just published new Verified Components for monitoring a classification model. They work for both binary and multiclass classification, but the model needs to be captured within the production workflow via Integrated Deployment.
Until we can load the external deployment workflow as an Integrated Deployment connection, you have to recapture the deployment workflow within the same workflow using the components. If you used the AutoML component, that should be super easy. More information is in the component descriptions and example workflow. A blog post should also be out soon.