AutoML (automatic machine learning) functionality?

I am a user of R and of RapidMiner who is considering learning KNIME. I would like to know if KNIME has AutoML (automatic machine learning) functionality. R has various such packages, including h2o.automl through the H2O library, among many others. One of the most attractive features of RapidMiner is its GUI “Auto Model”, a sophisticated wizard-type interface that helps guide the selection of features and then automatically optimizes various advanced models for regression, classification or clustering tasks. However, RapidMiner’s Auto Model is only available in its commercial versions. (It also has a Turbo Model module for assisted data preparation.)

My question is: does KNIME have any similar functionality? I am mostly interested in KNIME for its pure open source licence (unlike RapidMiner’s “delayed source” model), so I would mostly be interested in non-commercial KNIME functionality. However, I would appreciate knowing if any commercial KNIME extensions provide such functionality.

I am aware that KNIME connects to open source platforms like R, but I am asking if there is anything that is included in KNIME’s fully graphical user interface. (My interest in KNIME is not just for myself; I am also interested in training associates for whom R might be too much to handle.)

2 Likes

Hi Tripartio

I am happy you are reaching out here in the forum looking for automated machine learning with KNIME.

> I would like to know if KNIME has AutoML (automatic machine learning) functionality.

Yes, it does.

> does KNIME have any similar functionality?

We took quite a different approach from a built-in UI, but the results are the same.

What KNIME offers is a couple of blueprint workflows you can download, use and customize.
The main workflow can be found here:

Guided Automation: https://hub.knime.com/knime/workflows/*eAGfGtEAIr-1iYR-
To download it, however, get the entire folder from the example server:
50_Applications/36_Guided_Analytics_for_ML_Automation

In this workflow you will see Wrapped Metanodes. When you right-click one of those nodes and open its view, a browser window pops up, even in the free KNIME Analytics Platform. You can interact with these web pages to set up the automated ML process (e.g. interactively selecting the target column or filtering out columns). Just remember to click “Apply” and “Close” in the bottom right corner to save the settings and push them to the next nodes in the workflow.

If you want to have more info about this topic, you can find all the links in the following thread!

Cheers
Paolo

6 Likes

That’s great. I watched the webinar–this is indeed exactly what I was looking for. I’ll be digging more into it.

2 Likes

Hello there,
this week a new Verified Component just came out on AutoML.

Check it out here:

An example workflow here:

Blog posts on the topic will be published in the Integrated Deployment Blog Series soon:

https://www.knime.com/integrated-deployment

Cheers
Paolo

6 Likes

Hi @paolotamag, this is a very interesting node. I wonder if there is a way to use it with precompiled train/test/validation sets. For example, in the cheminformatics space it would be useful to split data by scaffold or time to get a better picture of expected performance. Is there any functionality in the node to do something like this?

Also, just a beginner question on the node. Does it only show the top 4 models? I had them all checked but only got 4 model outputs in the view to compare.

Thanks so much!
Jason

2 Likes

Hi @j_ochoada,
the only way to get a custom train/test split is to edit the component.
To do that, right-click it and choose: Component > Disconnect Link
After that, Ctrl + double-click the Component, or right-click it and choose: Component > Open.
Once you are inside the Component, you can find the Partitioning node inside the AutoML DataPrep nested Component and edit the splitting criteria as necessary. You could use a categorical column, provided from outside the whole AutoML Component, that flags which partition each row should go into, and discard that column afterwards. No coding is needed: simply add two nodes in place of a few others.
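
For readers who think in code, the idea of a flag column can be sketched in plain Python. This is only an illustration of the logic, not what the Component does internally: the column name `partition`, the helper `split_by_flag`, and the sample rows are all hypothetical.

```python
from collections import defaultdict


def split_by_flag(rows, flag_column="partition"):
    """Group rows by the value of flag_column, dropping that column afterwards."""
    partitions = defaultdict(list)
    for row in rows:
        # Copy the row without the flag so the helper column never reaches the learner.
        features = {key: value for key, value in row.items() if key != flag_column}
        partitions[row[flag_column]].append(features)
    return dict(partitions)


# Hypothetical rows, flagged upstream (for example by scaffold or by date).
rows = [
    {"id": 1, "activity": 0.7, "partition": "train"},
    {"id": 2, "activity": 0.1, "partition": "train"},
    {"id": 3, "activity": 0.9, "partition": "test"},
]

parts = split_by_flag(rows)
train, test = parts["train"], parts["test"]
print(len(train), len(test))
```

In the workflow itself, the same two steps (split on the flag, then drop the flag column) would be done with nodes rather than code.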

The AutoML Component trains all the requested models. If, for whatever reason, a trained model always predicts one class, or it simply failed to train, that model is reported in a flow variable at the output of the Component and is not listed in the view. The output of the Component only shows the best model.
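
As a rough sketch of that filtering logic (an assumption about the behaviour described above, not the Component's actual implementation), a model could be treated as unusable when its predictions are empty or contain a single class. The names `is_degenerate` and `select_best`, and the result structure, are hypothetical.

```python
def is_degenerate(predictions):
    """True if the model produced no predictions or always predicts one class."""
    return len(set(predictions)) <= 1


def select_best(results):
    """Pick the best usable model.

    results maps a model name to a (predictions, score) pair;
    this structure is an assumption made for the example.
    """
    usable = {
        name: score
        for name, (predictions, score) in results.items()
        if not is_degenerate(predictions)
    }
    # Models left out of the comparison, analogous to the flow variable report.
    skipped = sorted(name for name in results if name not in usable)
    best = max(usable, key=usable.get) if usable else None
    return best, skipped


# Hypothetical results for three requested models.
results = {
    "Naive Bayes": (["a", "a", "a"], 0.50),
    "Random Forest": (["a", "b", "a"], 0.90),
    "Logistic Regression": (["b", "a", "a"], 0.80),
}

best, skipped = select_best(results)
print(best, skipped)
```

This would explain seeing fewer models in the view than were checked: the skipped ones never enter the comparison.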

2 Likes

Blog post here:

https://www.knime.com/blog/integrated-deployment-blog-series-episode-3-automated-machine-learning

4 Likes

Thank you! The design and color threw me off and I didn’t realize it could be treated as a standard metanode!

Thanks again for your assistance!
Jason

2 Likes

Great! A more detailed blog on the difference between metanode and component here:

https://www.knime.com/blog/metanode-or-component

Cheers
Paolo

1 Like

Thanks! I went searching because when I made changes I couldn’t figure out how to pass new variables into the components. I learned a lot of good stuff! Can my changes be submitted to the Hub for others to use? Can normal users contribute?

Thanks,
Jason

2 Likes

Hello @j_ochoada,

Every user is able to upload workflows to the KNIME Hub. That’s the idea :wink:

Br.
Ivan

2 Likes

Have you checked out the new XAI View?

Workflows:

  1. https://kni.me/w/5xfwkuVsF6Uz8hMC
  2. https://kni.me/w/JZbuUdhGKZBEpdoK

1 Like

@paolotamag I have not! It looks cool and I want to check it out.

I am currently struggling with AutoML in 4.2.2 on Linux. The scorer doesn’t seem to be rendering its interactive view. The Naive Bayes model seems to be broken as well… :frowning:

This is the only output I get with the Java view, and I am just getting a gray screen. It worked in 4.2.1 before I updated.

Starting ChromeDriver 83.0.4103.39 (ccbf011cb2d2b19b506d844400483861342c20cd-refs/branch-heads/4103@{#416}) on port 26507
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
Oct 02, 2020 4:51:23 PM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C

Has anyone else seen this?

Hi @j_ochoada, we are currently investigating this! Thanks for the heads up! I will reply in here once I have any news!

1 Like

We are moving the conversation on the blank view in this other thread:

1 Like