I’ve been using some of the components already published on the KNIME Hub, such as Automated Machine Learning, in my workflow.
When looking into the inner workflows, I find that some of the nodes used don’t relate to my dataset. Before I reinvent the wheel for my workflow, I wanted to ask whether there is any explanation, in a blog post for example, of why the workflow was built the way it was. It would be great to understand why some components were used and what larger purpose they serve.
Are there particular nodes in the component that you feel don’t apply to what you’re doing, or that are confusing for some reason? Maybe with a bit more input, I can ask the component developer for his opinion.
Thanks for the quick response. I have a workflow that checks whether a champion exists for a dataset. If it does not, the workflow creates the champion; if it does, the workflow creates a new challenger and compares it with the champion. I was able to deploy it to the server, but I am seeing errors when executing it. One problem is that the dropdown for the target column and the list of models to run on the dataset are not prompted when the workflow is executed on the server.
To include that prompt, I was trying to decouple the Quickforms selection from the Automated Machine Learning component by creating a separate component for the prompt and then passing the web selection as input to the rest of the pipeline. I was able to create the selection list component with the respective variable names, but I am not able to pass them on further. I tried to reverse engineer one of the model pipelines so that I could mimic it, and now my canvas has expanded to its limits.
When I expand the workflows, the variable values flow through the pipeline, but they do not flow through once I create components. The workflow is exactly the same as in the Automated Machine Learning component; all I have done is take out the web prompts, but that is somehow causing an issue.
The point of this exercise is ultimately to automate the champion/challenger comparison and deploy it on the server for my POC. Any pointers here?
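To make the champion/challenger logic described above concrete, here is a rough sketch in Python rather than KNIME nodes. Everything in it is hypothetical and for illustration only: the in-memory registry, the function name update_champion, the dataset name and the scores. It also assumes that a higher score is better and that both models are scored on the same hold-out data.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical record for a model: (model name, score on a shared hold-out set).
Champion = Tuple[str, float]


def update_champion(
    registry: Dict[str, Champion],
    dataset: str,
    train_and_score: Callable[[], Champion],
) -> Champion:
    """Run one champion/challenger round and return the resulting champion."""
    candidate = train_and_score()  # newly trained model and its score
    current: Optional[Champion] = registry.get(dataset)
    if current is None:
        # No champion exists yet, so the new model becomes the champion.
        registry[dataset] = candidate
    elif candidate[1] > current[1]:
        # The challenger beats the champion (assumes higher score is better).
        registry[dataset] = candidate
    return registry[dataset]


# Illustrative usage with made-up model names and scores.
registry: Dict[str, Champion] = {}
first = update_champion(registry, "churn", lambda: ("model_v1", 0.81))
second = update_champion(registry, "churn", lambda: ("model_v2", 0.78))
third = update_champion(registry, "churn", lambda: ("model_v3", 0.85))
```

In this sketch the first call creates the champion, the second challenger loses and is discarded, and the third challenger replaces the champion.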
Don’t use that workflow and learn how to do it properly yourself.
For example, the component claims to do parameter optimization, but it does so in a very limited way (1 or 2 parameters) and incorrectly (no cross-validation). There is also no feature selection at all. You will need to add that yourself, but you can’t without editing the workflow, because feature selection must not be done before the data is split into training and test sets, and so forth.
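As a rough illustration of the order of operations meant here, the following is a minimal pure-Python sketch, not the component's logic. The toy data, the k-nearest-neighbour classifier, the mean-gap feature selection rule and all function names are assumptions chosen only to keep the example self-contained. The point is the structure: split first, select features and tune the parameter using training folds only, and touch the test set once at the end.

```python
import random
from statistics import mean


def make_data(n, seed=0):
    """Toy data (an assumption): feature 0 is informative, feature 1 is noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append(([label * 2.0 + rng.gauss(0, 1), rng.gauss(0, 1)], label))
    return rows


def select_feature(train):
    """Pick the feature whose class means differ most, using training rows only."""
    def gap(j):
        m1 = mean(x[j] for x, y in train if y == 1)
        m0 = mean(x[j] for x, y in train if y == 0)
        return abs(m1 - m0)
    return max(range(len(train[0][0])), key=gap)


def knn_predict(train, j, k, x):
    """Majority vote among the k training rows nearest to x on feature j."""
    nearest = sorted(train, key=lambda row: abs(row[0][j] - x[j]))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, test, k):
    """Fit on train (including feature selection) and score on test."""
    j = select_feature(train)  # never sees the rows in `test`
    hits = sum(knn_predict(train, j, k, x) == y for x, y in test)
    return hits / len(test)


def cv_score(rows, k, folds=5):
    """Mean validation accuracy over `folds` folds of the training data."""
    scores = []
    for f in range(folds):
        val = rows[f::folds]
        tr = [r for i, r in enumerate(rows) if i % folds != f]
        scores.append(accuracy(tr, val, k))
    return mean(scores)


data = make_data(200)
train, test = data[:160], data[160:]  # split BEFORE any selection or tuning
candidates = [1, 5, 15]               # odd values of k avoid tied votes
best_k = max(candidates, key=lambda k: cv_score(train, k))
test_accuracy = accuracy(train, test, best_k)  # test set used exactly once
```

Because select_feature is called inside accuracy, it is refit on every training fold, so no information from the validation fold or the test set leaks into the selection.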
That component is not official yet. It is in fact published on my personal public space rather than in the official KNIME Components repository. Components published by KNIME are verified and well documented. I apologize that we have not done this yet for the component you are using, but it is definitely on my TODO list. I definitely need to add cross-validation to the parameter optimization step.
We will post more content next year about components to automate machine learning. For now, I would use that component as a blueprint or prototype that you can customize to your data and workflow as needed. If you have more precise questions about the inner mechanics of the component, I will be happy to answer them.
@sramesh, to create an entire AutoML application that runs on your server, I would suggest using this workflow instead, called Guided Automation: https://kni.me/w/eAGfGtEAIr-1iYR-
This workflow is quite well documented. Scroll down under the workflow on the hub to find the links.
There we took care to create all the views necessary to set up the process from the WebPortal.
You could of course also create them again from scratch using the component.
To do that, leave the component closed without expanding it, and create an additional component containing the interactive WebPortal views, which feeds flow variables with the settings into my component. If you double-click the component, you will find a flow variable tab in its dialog.
By the way, there is also another component I made that you can check out to do AutoML. In this component, the selection of the best model is also automated. If you want to see a workflow where this other component and more are used, you can find it here.
In addition to what @paolotamag posted (Thanks Paolo!), I wanted to add a bit more about the issue you ran across with using flow variables and components together. This post should help shed some light on that: