Getting started with Python?

bassman · October 1, 2020, 10:58pm

What is the best way to get started with KNIME and Python? I found the “KNIME Python Integration” extension - and installed it - but the PDF documentation seems out-of-date: File/Preferences on my KNIME 4.2.2 does not look like what is documented: https://docs.knime.com/2018-12/python_installation_guide/index.html

I found an example that works, but it seems based on placing the scripts in nodes: https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_python_if_then

I have a fair amount of Python code that I’ve been running with PyCharm, but I’d like to use KNIME as a “front-end”, and combine my Python scripts/modules with other KNIME functionality.

The documentation is confusing me. I’d really like to leverage KNIME to boost my Python (and SQL Server) work, but I’m worried that KNIME Python integration may be a “work in progress”…

mlauber71 · October 1, 2020, 11:08pm

Well, the idea behind that example was to follow the KNIME pattern of using nodes, but it would work just as well with a single Python snippet.

You could check out other examples with python that just use the node to hold the code and provide some flow variables.

https://hub.knime.com/search?q=kn_example_python&type=Workflow

Then there are these samples using KNIME as a wrapper for python (and R).

I don’t think the Python integration is a “work in progress”. You would want to check out the latest documentation on Python and KNIME:

https://docs.knime.com/latest/python_installation_guide/index.html

beginner · October 2, 2020, 4:29am

The Python integration requires you to put your Python code into the corresponding Python node. Of course, if you have your own modules and have installed them into the corresponding environment, you can use them like any other module, but you still need to write Python code in the node to call them. Writing your own modules does make sense, though, if the code is complex enough that you want all the proper tools available.
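
To illustrate the point about calling your own modules from a node, here is a minimal sketch. It assumes the Python Script node of KNIME 4.x, which hands the incoming data to the script as a pandas DataFrame named `input_table` and expects the result in `output_table`. The helper `add_total` and the column names `price` and `qty` are hypothetical; in practice the function would live in your own module, installed into the environment KNIME uses, and be imported rather than defined inline.

```python
import pandas as pd


# Hypothetical helper: in a real setup this would sit in your own module
# (developed and tested in PyCharm) and be imported inside the node.
def add_total(df: pd.DataFrame, price_col: str = "price", qty_col: str = "qty") -> pd.DataFrame:
    """Return a copy of df with a 'total' column (price * quantity)."""
    result = df.copy()
    result["total"] = result[price_col] * result[qty_col]
    return result


# Inside the KNIME node, `input_table` is provided by KNIME.
# Outside KNIME (e.g. in PyCharm) we create a small stand-in so the
# same script can be run and debugged on its own.
if "input_table" not in globals():
    input_table = pd.DataFrame({"price": [2.0, 3.5], "qty": [3, 2]})

# The node only contains the thin "glue" call to the module code.
output_table = add_total(input_table)
```

Keeping the node body this thin means the real logic stays in a module you can still edit and test outside KNIME.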

As a warning, or at least for information: to some extent, yes, the integration is a work in progress. It works fine. The real issue currently is that all data is serialized back and forth between KNIME and Python. This is slow and often takes longer than the actual data manipulation, so it should only be done when the task can’t be done easily in KNIME itself (loops are slow in KNIME, and a Python (or R) snippet is the only place where you can do complex “whole table actions”, e.g. pandas.apply stuff).
It is a work in progress because KNIME is planning to move all memory off-heap, probably using Arrow (like Nvidia RAPIDS). That way, no serialization will be needed anymore. That will be a tremendous step and a tremendous win (i.e. “work in progress” in a good sense: it is only getting better, and it is already worth investing in).
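
As a rough sketch of what a “whole table action” can look like in pandas, the snippet below computes each row's share of its group total, which needs the whole column at once. The column names `group` and `amount` and the sample data are made up for illustration; inside a KNIME node the DataFrame would come from `input_table` instead.

```python
import pandas as pd

# Made-up sample data standing in for a KNIME input table.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "amount": [1, 3, 4],
    }
)

# Sum of 'amount' per group, broadcast back to every row of that group.
group_total = df.groupby("group")["amount"].transform("sum")

# Each row's share of its group's total.
df["share"] = df["amount"] / group_total
```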

mlauber71 · October 2, 2020, 5:41am

I agree. Sometimes it might be an option to use parquet or a local database to transfer data from KNIME to python and back.

Sometimes this could also help if there are problems with certain columns or formats.

docminus2 · October 2, 2020, 6:40am

Thanks for those examples, I will look into them.
Otherwise, I have written my temporary data as CSV, called a Python shell externally, and read the file output from Python back into KNIME.
Same idea I guess.
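
A minimal sketch of the Python side of such a CSV hand-off, using only the standard library. The file names, the `name` column, and the `process_csv` function are hypothetical; in a real workflow KNIME would write the input file and read the output file, and the script would be started externally.

```python
import csv
import tempfile
from pathlib import Path


def process_csv(in_path: Path, out_path: Path) -> int:
    """Read rows from in_path, add an upper-cased 'name' column, write to out_path.

    Returns the number of data rows written.
    """
    with in_path.open(newline="", encoding="utf-8") as src, \
            out_path.open("w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["name_upper"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        count = 0
        for row in reader:
            row["name_upper"] = row["name"].upper()
            writer.writerow(row)
            count += 1
    return count


# Stand-in for the file KNIME would have written before calling Python.
workdir = Path(tempfile.mkdtemp())
in_file = workdir / "knime_out.csv"
out_file = workdir / "python_out.csv"
in_file.write_text("id,name\n1,alpha\n2,beta\n", encoding="utf-8")

rows_written = process_csv(in_file, out_file)
```

Note that everything read from CSV arrives as text, so column types have to be restored on the way back into KNIME.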

christian.birkhold · October 2, 2020, 6:40am

Hello. How did you find this link? The latest version of this guide can be found here: KNIME Python Integration Guide

mlauber71 · October 2, 2020, 6:53am

Yes. The benefit of parquet or SQLite is that it would preserve the data types without effort.
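
To show what “preserve the data types” means in practice, here is a small sketch using Python's built-in `sqlite3` module. The table name `measurements`, its columns, and the file name are made up for illustration; on the KNIME side the same file would be written or read with the database nodes.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical exchange file shared between KNIME and Python.
db_path = Path(tempfile.mkdtemp()) / "exchange.sqlite"

# Writer side: create a typed table and insert a few rows.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE measurements (id INTEGER, value REAL, label TEXT)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [(1, 0.5, "a"), (2, 1.25, "b")],
)
conn.commit()
conn.close()

# Reader side: values come back as int / float / str, not as plain text.
conn = sqlite3.connect(db_path)
rows = conn.execute(
    "SELECT id, value, label FROM measurements ORDER BY id"
).fetchall()
conn.close()

types = [type(v).__name__ for v in rows[0]]
```

With a CSV file, all three columns would come back as strings and need to be converted again.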

bassman · October 2, 2020, 8:40am

Aha! That matches what I saw in the v4.2.2 KNIME I have. I will try the documentation at that URL: https://docs.knime.com/latest/python_installation_guide/index.html

I was looking at old documentation: thanks!!!

system · April 2, 2021, 8:40pm

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.