Error when importing KNIME's Model file in Jupyter

JaeHwanChoi · July 17, 2023, 5:12am

Hello KNIME support and users.

We are currently working on a KNIME project with another company and we are facing the following issue.

I would like to inquire about the pkl model interaction between KNIME and Jupyter notebook.

I exported the ML model information created from the Pickled object port of KNIME’s Python Script through the “Model Writer” node.

After that, I tried to load the file with model extension with pickle.load in Jupyter notebook, but it failed.

Conversely, when I tried to load a file with model extension created by pickle.dump in Jupyter notebook through the “Model Reader” node in KNIME, I got the following error message:

Execute failed: File does not specify a valid port object model.

It is possible to export a model created in KNIME and import the model. It is also possible to pickle.dump & load a model created in Python.

However, it does not seem to be possible to import a model created in KNIME into Jupyter, or to import a model created in Jupyter into KNIME.

To summarize the problem:

How can I read, in Jupyter notebook, a file exported from KNIME with the “Model Writer” node?
Is there a way to import files with model extension created in Jupyter notebook via the “Model Reader” node in KNIME?

However, the “Python Object Reader” node is not available in the current project because it is marked as legacy.

Is there any way to solve the above issues?
Your help will be greatly appreciated.

gab1one · July 17, 2023, 6:52am

Hi @JaeHwanChoi,

The model writer writes objects in a KNIME specific data format meant to be read by Model Reader nodes. The python object reader / writer nodes were created to allow writing / reading of data in a format that can be understood by external python processes.

Take a look at this workflow for a workaround if you can’t use these nodes yet:

Python + KNIME expert @mlauber71 explains there how to read + write data inside python scripting nodes.

best,
Gabriel

JaeHwanChoi · July 17, 2023, 7:37am

Thank you for your response.

So you’re saying that the “Model Writer” node is a KNIME-specific format that can’t interact with Jupyter?

If so, I can conclude that models exported to Model Writer cannot be read by Jupyter, and vice versa, models created in Jupyter cannot be read by Model Reader.

I’ve read your example, but it seems to be a different case from my current problem. Is there any other documentation or workaround?

Best,
Choi

mlauber71 · July 17, 2023, 8:05am

@JaeHwanChoi the model writer will zip a given model to store it. Underneath there has to be a model that the package you used can read (KNIME ML to Python - #2 by mlauber71). If you want to use it with Jupyter notebooks this has to work in both environments.

You could try storing the model simply as a pickle and use knime as an interface for some python models. Here are examples from several versions of knime and python:
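A quick way to see why pickle.load fails on such a file is to check whether it is a zip archive rather than a raw pickle. This is only a diagnostic sketch: the helper name `describe_model_file` and the entry name `payload.bin` are made up for the demo, and the real internal layout of a Model Writer file is not shown here.

```python
import pickle
import tempfile
import zipfile
from pathlib import Path


def describe_model_file(path):
    """Return ('zip', [entry names]) for a zip archive, else ('other', [])."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return "zip", archive.namelist()
    return "other", []


# Demo files: a zip container (hypothetical entry name) and a plain pickle.
tmp_dir = Path(tempfile.mkdtemp())

zip_path = tmp_dir / "container_example.model"
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr("payload.bin", b"placeholder bytes, not a real model")

pkl_path = tmp_dir / "plain_example.pkl"
with pkl_path.open("wb") as fh:
    pickle.dump({"weights": [0.5, -1.2]}, fh)

zip_result = describe_model_file(zip_path)
pkl_result = describe_model_file(pkl_path)
```

If your exported file is reported as a zip, the entry names give a first hint about what is wrapped inside; pickle.load on the container itself is not expected to work.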

There are other frameworks you could use. Knime started to generically support Sklearn for certain models:

If you want to interact between knime and python maybe the H2O.ai framework is for you. You can interchange models via MOJO files.

You might derive further inspiration from this collection:

JaeHwanChoi · July 17, 2023, 8:25am

Are the nodes that say legacy going to be deprecated soon?

I need to use them in an official KNIME project, will the nodes disappear or stop working if the KNIME version is upgraded?

mlauber71 · July 17, 2023, 9:05am

Legacy nodes will continue to work for an indefinite time. They will no longer be promoted and will be removed from the active node repository after a while. But they will continue to function.

For legacy database nodes there is a migration tool.

JaeHwanChoi · July 17, 2023, 9:20am

Thank you for providing the above examples.

However, my question remains unanswered.

What I ultimately want to do is to save the information of an ML model generated by the Pickled Object output port of the “Python Script” node in KNIME to a specific repository with the “Model Writer” node.

After that, I want to load the model extension in the repository in Jupyter and this is where I am getting the error.

I can’t use H2O Mojo, because the analysis must use only algorithms that I have written within the Python Script.

I really need to read the files derived by Model Writer in Jupyter, so I would like to find a way around this.

I apologize for the continued questions.

bwilhelm · July 17, 2023, 11:59am

Hi @JaeHwanChoi,

If you really want to read the output of the “Model Writer” node you need to find out the exact binary output format written by the node and re-implement the de-serialization in Python. You can look up the code of the “Model Writer” node here and the Python port object here.

However, I don’t think that this is a good approach. Instead, I recommend using a “Python Script” node as the model writer. Use a little script that pickles your model and directly writes it to a file on the disk.
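A minimal sketch of that idea, assuming the model is an ordinary picklable Python object. The helper names `save_model` and `load_model`, the file name, and the dictionary standing in for a trained model are placeholders; inside a Python Script node you would pass your own model object and a path that your Jupyter environment can also reach.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Pickle `model` to `path` and return the path that was written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)
    return path


def load_model(path):
    """Load a model written by save_model. Only unpickle files you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Stand-in for a trained model (placeholder for the object you would
# otherwise send to the pickled object output port).
model = {"weights": [0.5, -1.2], "bias": 0.1}

model_path = save_model(model, Path(tempfile.gettempdir()) / "knime_model_example.pkl")

# The same call can be used from a Jupyter notebook on the written file.
restored = load_model(model_path)
```

For real models, the class definitions and package versions need to be available in both Python environments, otherwise unpickling can fail even though the file itself is a valid pickle.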

JaeHwanChoi · July 18, 2023, 1:56am

Hi @bwilhelm,

So it looks like I have to follow a bad approach to interact with the “Model Writer” node.

I’m currently in the middle of a project, so it’s very difficult to look at the github you shared in detail.

That said, I don’t see a solution other than exporting to your local PC or a specific disk space as code within a Python Script.

This would work on a personal PC to save to a personal disk, but if the workflow is running on KNIME Server and the repository where the models are stored requires an account and password to access, such as MinIO, it is not feasible to apply it as code within Python Script. (MinIO has a Python package, so I need to test if it works when running on KNIME Server.)

I need to go through this similar process with Pyspark, which also seems very difficult.

I’d be grateful to hear if I’m wrong about any of the above.

JaeHwanChoi · July 18, 2023, 2:03am

Thank you for your response. @sethhollen !!

When you say KNIME Model Exporter node, do you mean the one provided by NodePit?

If the above is correct, it seems that the PMML-related nodes can only export to PMML the models that come out of the output port of KNIME’s own ML Learner nodes.

I have to deal with files exported through the Pickled Object port of a Python Script, so it is not possible to convert them to PMML form.

I would be grateful if you can tell me if I am wrong about this.

mlauber71 · July 18, 2023, 3:31am

If the algorithms are written in Python, why would you not write the result as a pickle object (or any format your Python model supports)?

To have it in knime ‘style’ you could employ an individual node:

Maybe you can provide a sample workflow or a screenshot of your task so we better understand the challenge. What part will be in knime and what part in Python. If you use knime server you will have to make sure the Python on the server contains the same packages and setup as the desktop version.

If you want to use PMML that is also an option but you will be limited to models that would support this format.

Then, just as a remark: Knime can work with Jupyter notebooks directly.

JaeHwanChoi · July 18, 2023, 4:37am

The reason I didn’t export the results as a Pickle within the Python script is that I wanted to connect with a Generic S3 Connector to export the files to my client’s MinIO.

In order to put the files into the customer’s MinIO, we need keys such as account information, so we used the “Model Writer” which can access the MinIO using the connector.

As an alternative, we plan to test if we can access MinIO with KNIME Python Script via Python’s MinIO package.
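A rough sketch of what that test could look like, assuming the `minio` package's `Minio` client and its `put_object(bucket_name, object_name, data, length)` method. The helper name `upload_pickled_model`, the endpoint, the bucket and the object name are placeholders, and how the credentials reach the script on KNIME Server is left open.

```python
import io
import pickle


def upload_pickled_model(client, bucket, object_name, model):
    """Pickle `model` in memory and upload it through client.put_object.

    `client` is expected to behave like minio.Minio (assumption).
    Returns the number of bytes uploaded.
    """
    payload = pickle.dumps(model)
    client.put_object(bucket, object_name, io.BytesIO(payload), len(payload))
    return len(payload)


# Assumed usage with the real package (not executed here; all values are
# placeholders):
#
#   from minio import Minio
#   client = Minio("minio.example.com:9000",
#                  access_key="...", secret_key="...", secure=True)
#   upload_pickled_model(client, "models", "my_model.pkl", model)
```

Passing the client in as an argument keeps the upload logic independent of where the access key and secret key come from, so the same function can be tried locally first and on the server afterwards.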

mlauber71 · July 18, 2023, 5:47am

@JaeHwanChoi one option could be to write the pickle file and then use KNIME transfer file nodes to upload/transfer the file
