In Python, you can export a trained model as a pickle (PKL) file. In KNIME, when using an analytics node, the trained model is exposed on a blue (PMML) or gray (generic model) output port.
However, some models can only be exported with the PMML Writer and others only with the Model Writer. Is there a single node that can export both?
Additionally, I want to export a model trained with an ML node via the PMML Writer or the Model Writer, then import the file into my personal Jupyter notebook and use it for prediction. Or can a model created in KNIME only be used in KNIME?
A lengthy discussion about exchanging models between KNIME and Python can be found here:
In my experience, the best model format for interchange between KNIME, Python, R, and even Spark/big-data systems is H2O.ai MOJO:
KNIME has started to support scikit-learn (sklearn) nodes, though I have not tried to interchange their models with Python (via pickle, for example):
This seems to be the only change since we last had this discussion. The format of a model depends very much on the underlying system and packages. KNIME as a platform tries to bring them together, but I don't think there is a universal solution or a format that covers everything. For deep learning there is ONNX, but my experience there is limited.
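As a minimal sketch of what a pickle-based exchange would look like (the file name `model.pkl` and the stand-in model object are placeholders): a fitted scikit-learn estimator pickles the same way as the plain dict used here, but this only works if both sides run compatible Python and package versions.

```python
import pickle

# Stand-in for a fitted model; a scikit-learn estimator pickles the same way,
# provided the reading side has the same sklearn version installed.
model = {"coef": [0.4, -1.2], "intercept": 0.1}

# Export -- in KNIME this could run inside a Python Script node.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Import -- e.g. in a local Jupyter notebook.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored == model)  # True
```

Pickle is Python-specific, so unlike PMML or MOJO it does not help with R or other systems.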
Does this mean that files exported by the PMML Writer can be used in other systems (R, Python) because PMML is a standard, while files exported by the Model Writer cannot, because that format is a KNIME-specific extension?
Also, why is the distinction made that some models can only be exported with the PMML Writer and others only with the Model Writer?
@JaeHwanChoi the format and its interoperability depend very much on the issuer of the model system. PMML is supposed to be a standard for a group of models, though I have found it does not always work on all platforms and does not cover some advanced models like XGBoost.
There is no such thing as a universal model format, especially not one that works across all environments and operating systems. It will always depend on the environment used. KNIME is a platform on top of these systems and makes using them easier.
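For the cases where PMML does work, a KNIME-exported PMML file can be read in Python with a third-party scoring engine. The sketch below assumes `pypmml` (one such engine, not something KNIME prescribes); it needs `pip install pypmml` and a Java runtime, and the path and rows are placeholders.

```python
def score_pmml(pmml_path, rows):
    """Score a list of feature dicts with a PMML model, e.g. one written
    by KNIME's PMML Writer. Assumes the third-party `pypmml` package
    (pip install pypmml) and a Java runtime on this machine."""
    from pypmml import Model  # imported lazily so the function is optional

    model = Model.fromFile(pmml_path)        # load the exported .pmml file
    return [model.predict(row) for row in rows]
```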
As I said, my best experience with interoperability is with the H2O.ai MOJO format: you can use it seamlessly with KNIME, R, and Python.
In this collection you will find several examples of interaction between KNIME and, in particular, Python.
So, to exchange models interoperably with R and Python, KNIME converts the trained model with the "H2O Model to MOJO" node, exports it with the "H2O MOJO Writer" node, and the model is then read in Python using H2O-related code?
I’m asking because I saw a related example workflow, but it doesn’t contain the process I want.
Even though I exported the model from KNIME as a MOJO, I can’t seem to use it with h2o in a Jupyter notebook on my personal PC unless I have h2o installed. I have installed the package, but do I need the actual paid or demo version of H2O to use the H2O model in Jupyter?
@JaeHwanChoi you can use the free version. You will have to have the h2o package installed and need to start an H2O process in the background.
Here is a notebook (I am planning to write a blog about it):
You can import a MOJO model from KNIME at any point. Or you can import a model you created in a Jupyter notebook into KNIME. Or you can just use the Python node to run a model created in Python inside KNIME and read the results.
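As a minimal sketch of that round trip (file names are placeholders): the free, open-source `h2o` Python package is all that is needed. `h2o.init()` simply starts a local H2O Java process in the background; no commercial license or remote server subscription is involved, though a Java runtime must be installed.

```python
def score_with_mojo(mojo_path, csv_path):
    """Load a MOJO exported from KNIME (H2O MOJO Writer) and score a CSV.
    Assumes `pip install h2o` (free, open source) plus a local Java runtime."""
    import h2o  # imported lazily; open-source package, no license key needed

    h2o.init()                          # starts a local H2O JVM in the background
    model = h2o.import_mojo(mojo_path)  # read the KNIME-exported MOJO file
    frame = h2o.import_file(csv_path)   # data to score
    return model.predict(frame)         # H2OFrame of predictions
```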
If you want sample code in R, there is an example in this workflow group.
If I need to start the H2O process in the background, does that mean I need to obtain and run a free demo of H2O so I can run H2O-related Python code in Jupyter on my local PC?
In other words, what I want to do is convert the model generated by H2O AutoML in KNIME into a MOJO, export it with the MOJO Writer, and then import the exported MOJO via the h2o package in my personal Jupyter notebook for forecasting.
The current problem is that I exported the model from KNIME, but after installing the h2o package in my personal Jupyter, I get an error at the h2o.init() call because I don’t have a free version of the actual H2O available.
A typical partner project wants to deploy KNIME-generated models to specific storage and then import those models into their own Python for forecasting, so there is constant demand for model compatibility between KNIME and Python.
To summarize the discussion so far: the only model format compatible between KNIME and Python is H2O MOJO; models exported with the "Model Writer" use a KNIME-specific format and can only be read back in KNIME; and the models that support PMML are limited.
So, if you exported to H2O MOJO, you need to subscribe to the actual commercial or demo version of H2O to use it in your personal Jupyter and run H2O-related code.
Please confirm whether the above summary is correct, @mlauber71!
The sample workflow above shows the interaction between KNIME and several Python-based modeling packages, including XGBoost and LightGBM. I assume setting this up and working with it will give you additional insights.
When I install and run conda, I see a message like the one below. It says I can’t access the server, but don’t I need a license for commercial H2O (like the demo version) to access the H2O server after all?
“Deploy” as in using the KNIME-specific deployment nodes? Or “just” deploy as a regular workflow on the Hub that uses those models?
A regular workflow could also use standard Python nodes, and then there would not be any compatibility issue, I assume?
edit: I have tried h2o in Python using Colab before. It was free, as @mlauber71 said.
With the bundled Python version one can use the sklearn package with the KNIME Python nodes. My impression is that the first task would be to make a plan for which software and systems should be used in which environment, and also what skill levels will be needed by the people using it.