Export Knime ML to Python Pickle

Hello,

I am new to KNIME. I created an ML workflow in KNIME. My next step is to export the trained model to a Python pickle file. I have read through the documentation, but I cannot find the steps to export my KNIME ML node to a Python pickle file.

Is it possible to create a Python pickle file from a Knime ML node?

Thanks,

Alex

@TE499OP If you build the model in the KNIME Python environment (the Python nodes), you can store it as a pickle file.
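A minimal sketch of what the pickle step looks like in plain Python, independent of KNIME. The `SimpleLinearRegressor` class, the `save_model`/`load_model` helpers and the file name `model.pkl` are made up for illustration; in practice the object would usually be an estimator from your ML library of choice, and it is saved with the same `pickle.dump` call.

```python
import os
import pickle
import tempfile


class SimpleLinearRegressor:
    """Hypothetical stand-in for a real estimator; fits y = intercept + slope * x."""

    def fit(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, xs):
        return [self.intercept + self.slope * x for x in xs]


def save_model(model, path):
    # Pickle files are binary, so open in "wb" mode.
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # Only unpickle files you trust: loading a pickle can execute code.
    with open(path, "rb") as f:
        return pickle.load(f)


# Train on a tiny toy data set (y = 2x + 1), save, and load again.
model = SimpleLinearRegressor().fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
save_model(model, path)
restored = load_model(path)
```

Note that a pickle file only stores the object's state, so the class definition (or the library that provides it) has to be importable wherever the file is loaded.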

Other than that, you could use PMML, or the H2O.ai MOJO format, which I like very much, to easily transfer models between KNIME, Python, R and Big Data systems.

You might also be able to use KNIME Weka models in Python, although I have never tried that.

I created the ML model using a Knime Random Forest node. Would I need to build a new knime workflow using a Knime Python node? If so, is there any way to import the Random Forest settings from my original workflow?

I don’t think there is. You will have to find a format that can be used in Python. You can use the KNIME Python nodes to run your favourite Python ML library and use KNIME as an interface, or you can build models in KNIME with nodes that support the PMML, MOJO or maybe Weka formats.

There might also be some specialised deep learning formats (think ONNX) that are interchangeable between KNIME and Python.

If you just want random forest I would suggest the H2O implementation.

I appreciate your help.

My goal is to create a regression model and a pickle file. I have the data set from my original KNIME workflow. Using that data set, I am trying to create a new model in one of the PMML, MOJO or maybe Weka formats. I tried the H2O implementation, but I cannot find a way to connect my SQL data source to H2O.

Any ideas?

Thanks,

You can load your data into KNIME and build the model there. Alternatively, you could use a system such as a Cloudera big data cluster that supports H2O's Sparkling Water environment, which KNIME supports as well.

@mlauber71,

Off-topic, and out of curiosity: could you please tell me why you prefer the H2O random forest over Python or the basic KNIME node?

Many thanks.
Regards,
Samir

Using H2O (especially the AutoML functions) with R/RStudio or Python (or encapsulated in a KNIME node) gives you more output and control, such as:

  • details about the chosen model
  • variable importance
  • advanced configuration settings through code
  • hyperparameter grid search for GBM or other model types in code

The good news is that all of these options work with KNIME, and thanks to the MOJO format you can exchange the results between KNIME, R, Python and Big Data systems.
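To illustrate the grid search idea from the list above, here is a rough sketch in plain Python. It does not use the H2O API; `grid_search`, `toy_validation_error` and the parameter names `max_depth` and `learn_rate` are only illustrative, and the scoring function stands in for "train a model with these settings and return its validation error".

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination; return (best_params, best_score).

    Lower scores are treated as better (e.g. a validation error).
    """
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params):
    # Hypothetical stand-in for training a model and measuring its error.
    return (params["max_depth"] - 5) ** 2 + (params["learn_rate"] - 0.1) ** 2


grid = {"max_depth": [3, 5, 7], "learn_rate": [0.01, 0.1, 0.3]}
best_params, best_score = grid_search(grid, toy_validation_error)
print(best_params, best_score)
```

The number of combinations grows multiplicatively with each added parameter, so in practice the grid is usually kept small or sampled randomly.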

Many thanks @mlauber71,

Thank you for your guidance.

Regards