Quick Questions about Python based KNIME Extension

Hi, KNIMEr, :coffee:

I tried creating a new Python based KNIME Extension for the first time today, and my initial impression is quite positive. However, I have some questions that I’m still confused about even after reading the documentation. I’m wondering if anyone can give me some hints:

  1. After I change my_extension.py, is there an easy way to reload the extension? Restarting KAP every time is quite painful.

  2. I noticed that the development of Python nodes depends on “KNIME Python Extension Development (Labs)”. The word (Labs) makes me a bit concerned that this is just experimental. Will there be significant breaking changes in the future? Does the KNIME team have a development roadmap for Python extensions? Knowing this would help developers decide whether to jump into this development now or wait and see.

  3. I noticed that conda_env_path needs to be configured in config.yml. I’m wondering, if I don’t have specific pkgs and just want to use the official default Bundled environment, how should I specify conda_env_path? This way, my extension only needs to rely on the official bundled environment and can be used cross-platform.

Thanks!

Hi @HaveF,

glad to hear you have a positive impression!

  1. See the documentation - no restart of KAP is required for:
  • Changes to the execute and configure runtime logic.
  • Changes to existing parameters e.g. changing the label argument.
  • However, other changes, such as adding a node or changing a node description, require a restart of the Analytics Platform to take effect.
  2. Pinging @carstenhaubold for additional thoughts, but I am confident there will be no significant breaking changes in the near future.
  3. Yes, that is possible! If you want to use the bundled environment, adjust your knime.yml as shown in lines 3, 9 and 10 here.

Let me know whether that helps!

Best
Steffen

Hi @HaveF,

Great that you’re building your own Python extension!

Regarding 2.) we should get rid of the “Labs” label. There are so many Python based extensions now that we will not change anything major with the API any more. We will add new features, but in a backwards compatible way. I’d say it is safe to go ahead – and I’ll make sure we drop the “Labs” label soon.

Best,
Carsten

@steffen_KNIME Thank you for the link. I think I indeed missed that part of the documentation you mentioned. As for the sklearn extension, haha, I forgot, this is a great example for learning Python extensions! Thank you very much!

@carstenhaubold Thank you for your addition, this confirmation is much more reassuring.
Additionally, I’d like to mention a problem I encountered during installation (just a guess, I’m not very sure).
When I directly use
conda create -n my_python_env python=3.11 knime-python-base=5.4 knime-extension=5.4 -c knime -c conda-forge
to install the dependencies into a fresh environment, the speed is acceptable. However, if an environment already exists, for example with Python 3.11 and many other packages, and I then use
conda install knime-python-base=5.4 knime-extension=5.4 -c knime -c conda-forge
to install the dependencies, it is very, very slow. I suspect the main reason is knime-extension, because it only has tar packages related to py39, unlike knime-python-base, which has tar and conda packages for various versions. This might cause conda to spend a very long time checking when installing knime-extension. Or maybe the slowness is because knime-extension does not have an xxxx.conda installation package like knime-python-base does; I don’t know.
Again, I want to emphasize that this is just a guess.

Btw, why are these packages not listed or uploaded on the pip website? Is there some reason we can only use conda, like binaries for various platforms? Sometimes I just want to install packages quickly, ignoring some compatibility issues… If there is no special reason, I suggest adding a pip installation mode. Also, pip is easier to use in CI/CD than conda.

Thanks for the feedback!

Well, when you install a fresh environment, conda only needs to resolve the dependencies (make sure there are no conflicts in dependency version ranges etc.) of the packages in that command. If you already have an environment with many installed packages and add knime-python-base, there are obviously more dependencies to consider, making it slower.

You make a good point though, the knime-extension package obviously needs to be available for newer Python versions, too. I’ll take care of that ASAP.

About the knime-extension package: this only contains the API so that autocompletion works well when you build your extension. It is not needed for the extension itself to run.

The knime-python-base package, on the other hand, is a so-called metapackage: it doesn’t have any content of its own, but lists all dependencies like Python, PyArrow, Pandas and Py4J that we need in the Python environment to be able to talk to KNIME. This is just a way for us to make sure the environment satisfies KNIME’s needs.
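
As a rough illustration of what "the environment satisfies KNIME's needs" means in practice, the sketch below checks whether the packages named above can be imported in the current environment. The import names `pyarrow`, `pandas` and `py4j` are assumed to match those packages, and `missing_modules` is a hypothetical helper, not part of any KNIME API. The metapackage also constrains versions, which this check ignores.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the dependencies mentioned in the thread.
    required = ["pyarrow", "pandas", "py4j"]
    missing = missing_modules(required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Run it with the Python interpreter of the environment you want to inspect; an empty result only says the modules are present, not that their versions are the ones KNIME expects.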

If you really want to set up your environment using pip, you can simply check the contents of the knime-python-base metapackage (e.g. conda search -c knime knime-python-base=5.4 --info) and install all those packages in your pip environment. However, when bundling extensions you need to provide a yaml for a conda environment because we use this when packaging the environment (the reason is that conda allows for installing C-based Python extensions in binary form, whereas pip often resorts to compiling the extension, which would be prone to causing more issues for users who install the extension).
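
A minimal sketch of that pip route, under stated assumptions: the helper below reads the `dependencies:` section of `conda search --info` output and rewrites each entry as a pip-style requirement string. The name `conda_info_to_pip` and the sample text are hypothetical; the version numbers are placeholders, not the real contents of knime-python-base. It assumes each dependency is printed as a `- name spec` line. Conda and PyPI do not always agree on package names or version syntax, so the result is a starting point to review by hand.

```python
def conda_info_to_pip(info_text, skip=("python",)):
    """Turn the 'dependencies:' section of `conda search --info` output
    into pip-style requirement strings (best effort)."""
    requirements = []
    in_dependencies = False
    for line in info_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("dependencies:"):
            in_dependencies = True
            continue
        if not in_dependencies:
            continue
        if not stripped.startswith("- "):
            # Any line that is not a list entry ends the dependency list.
            in_dependencies = False
            continue
        parts = stripped[2:].split()
        name = parts[0]
        spec = parts[1] if len(parts) > 1 else ""
        if name in skip:
            # pip cannot install the interpreter itself.
            continue
        if spec and spec[0].isdigit():
            # A bare conda version such as "14.0.1" becomes an exact pin.
            spec = "==" + spec
        requirements.append(name + spec)
    return requirements


# Made-up sample in the assumed output layout; versions are placeholders.
SAMPLE = """\
knime-python-base 5.4.0 py311_0
-------------------------------
dependencies:
  - python 3.11.*
  - pandas >=2.0
  - py4j
  - pyarrow 14.0.1
"""

if __name__ == "__main__":
    print("\n".join(conda_info_to_pip(SAMPLE)))
```

For the sample above this prints `pandas>=2.0`, `py4j` and `pyarrow==14.0.1`, one per line, which could be saved as a requirements file for `pip install -r`.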

Best,
Carsten

(Just double checked, the dependencies of the knime-extension package are “Python >= 3.9”, but the name is misleading as it contains py39)

However, when bundling extensions you need to provide a yaml for a conda environment because we use this when packaging the environment (the reason is that conda allows for installing C-based Python extensions in binary form, whereas pip often resorts to compiling the extension, which would be prone to causing more issues for users who install the extension).

Makes sense. Thanks for the detailed explanation! @carstenhaubold :beers:

Forget about the slowness; sometimes it is hard to know where the problem comes from: pkgs, deps, networking, etc. For now, I’m not sure whether it is my network problem.

Thanks again!