How to Create a Single Update Site for Multiple Pure-Python Extensions?

Hi KNIMErs!

When I currently bundle my pure-Python extension for KAP 5.1 and add it as a local update site, this is what I see in the Install KNIME Extensions... dialog:

KNIME & Extensions
|_ ...
KNIME Big Data Extensions
|_ ...
...
[My extension]
|_ [My extension]
...
More extensions

As you can see in the excerpt of the extension tree above, my extension automatically appears in a separate section (Great!) and has the same name as the section that it is grouped under (Not so great!). Their shared name seems to be taken from the description line of the knime.yml file.

Now, what I am after is rather something like the following to better accommodate future extensions of my own:

KNIME & Extensions
|_ ...
KNIME Big Data Extensions
|_ ...
...
[My organization]
|_ [My extension 1]
|_ [My extension 2]
...
More extensions

Here go my two questions:

  1. How can I bundle multiple pure-Python extensions into a single update site?
  2. How can I define where they appear in the extension tree?

Cheers!

Maybe @gab1one has an idea how to put them together.

Hi @DerMaxdorfer,

You want to create a composite update site. We have an example here:

In your case, you need to define the update sites created by the individual extension builds as repositories in the pom.xml file and then arrange them using the category.xml file.
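
To illustrate the category.xml part, here is a minimal Python sketch that writes such a file for several features grouped under one category. The element names follow the usual Eclipse/Tycho category.xml layout; the feature IDs, the category ID and the label are hypothetical placeholders, and using version "0.0.0" as "take whatever version is available" is an assumption that should be checked against the linked example.

```python
import xml.etree.ElementTree as ET


def build_category_xml(category_id, label, feature_ids):
    """Return category.xml content that puts all features into one category."""
    site = ET.Element("site")
    for feature_id in feature_ids:
        # Assumption: "0.0.0" acts as a placeholder for the available version.
        feature = ET.SubElement(site, "feature", id=feature_id, version="0.0.0")
        ET.SubElement(feature, "category", name=category_id)
    # The category label is what should show up as the section name in the dialog.
    category = ET.SubElement(site, "category-def", name=category_id, label=label)
    ET.SubElement(category, "description").text = label
    ET.indent(site)  # pretty-printing, requires Python 3.9+
    body = ET.tostring(site, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


if __name__ == "__main__":
    # Hypothetical IDs following the [group_id].features.[name] pattern.
    print(build_category_xml(
        "my_organization",
        "My organization",
        ["org.example.features.extension1", "org.example.features.extension2"],
    ))
```

The output can be saved as category.xml next to the pom.xml of the composite project.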

best,
Gabriel

Thanks for the example, @gab1one!

I understand your .xml files, but I am not clear on where to put them or how to merge multiple extensions into a single update site.

My bundled pure-Python extension has this folder structure:

[my extension]
|_ features
   |_ [group_id].features.[name]_1.0.0.202309150750.jar
|_ plugins
   |_ [my extension].channel.bin.linux.x86_64_1.0.0.202309150750.jar
   |_ ...
|_ artifacts.jar
|_ artifacts.xml.xz
   |_ artifacts.xml
|_ content.jar
|_ content.xml.xz
   |_ content.xml
|_ p2.index

What should the combined folder structure look like?

Option 1

Combine extensions as-is in a common parent folder:

[my composite update site]
|_ [my extension 1]
   |_ ...
|_ [my extension 2]
   |_ ...
|_ category.xml
|_ pom.xml

Option 2

Copy features and plugins into joint folders:

[my composite update site]
|_ features
   |_ [group_id].features.[name 1]_1.0.0.202309150750.jar
   |_ [group_id].features.[name 2]_1.0.0.202309150750.jar
|_ plugins
   |_ [my extension 1].channel.bin.linux.x86_64_1.0.0.202309150750.jar
   |_ [my extension 2].channel.bin.linux.x86_64_1.0.0.202309150750.jar
   |_ ...
|_ artifacts.jar
|_ artifacts.xml.xz
   |_ artifacts.xml
|_ content.jar
|_ content.xml.xz
   |_ content.xml
|_ p2.index
|_ category.xml
|_ pom.xml

If it is option 2, I assume that the files artifacts.jar, artifacts.xml, content.jar, content.xml, p2.index, which are generated automatically during bundling, require some modifications too. How should I go about this?

Hi @DerMaxdorfer,

The idea behind the provided example is as follows: you create your custom extensions as before; let's call them extension1 and extension2.
For distribution, you now create a third project (composite) based on the linked example and configure it so that it knows about extension1 and extension2:

Now you can create the composite update site by running mvn package in the composite project. This will pull the features and put them into a joint update site.

I see what you are getting at, @gab1one. The Java world really is not my home turf, so I have got a few more questions if you do not mind. :grimacing:

  1. In the properties node of pom.xml, you specify a version for KNIME and Tycho. Your example is for KAP 5.1 and the Tycho version that it uses. I assume that this will change in future KAP releases. Is there a way to see for myself which Tycho version my KAP requires?

  2. Where do I create the composite project, and from where do I run the command mvn package? Do I need to install any additional software on my Windows machine, or can I simply point the knime-extension-bundling conda package to the location of my category.xml and pom.xml?

  3. Will the joint update site contain full copies of the features of extension1 and extension2 or only references to them? Put differently, will it be enough if I give my end users access to the joint update site only, or do they need to be able to access the original extensions as well because the “pulling in” happens every time an installation is triggered?

Hi @DerMaxdorfer,

  1. The Tycho version is not that important. Tycho is a Maven plugin for building Eclipse-based software with Maven. In this case, we only use the update site generation portion of it, which should work for the foreseeable future.

  2. You need to have Maven and Java installed on your computer to run this. You can create the project anywhere you want; it is not related to the knime-extension-bundling conda package.

  3. Yes, it will contain everything needed to install your extensions.

best,
Gabriel
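
One way to double-check point 3 after a build is to list the feature JARs that actually ended up in the generated site. Here is a minimal Python sketch, assuming the site uses the same features folder layout as the single-extension update site shown earlier in this thread; the path in the usage part is hypothetical and depends on where the build writes its output.

```python
from pathlib import Path


def list_feature_jars(site_dir):
    """Return the sorted file names of all JARs in the site's features folder."""
    features_dir = Path(site_dir) / "features"
    return sorted(path.name for path in features_dir.glob("*.jar"))


if __name__ == "__main__":
    # Hypothetical location of the generated composite update site.
    for name in list_feature_jars("composite/target/repository"):
        print(name)
```

If the feature JARs of both extension1 and extension2 are listed, the site carries full copies rather than references.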

Thanks, @gab1one! I will try it out once I have gotten Java and Maven installed on my company’s laptop. This will take some time unfortunately. I will let you know here if it worked for me.

Hi @gab1one!

While still waiting on the installation, one thing has kept me thinking:

In your pom.xml, you specify the download URLs of your update sites’ repos. These URLs could easily be replaced with the one of a public GitHub resource, right?

Unfortunately, I am not allowed to have public repos in my corporate environment, but I really do not want to manually download my extensions to my local machine before I can bundle them in a composite update site. I therefore created this one-liner that can be executed from the Windows command line:

FOR /F "usebackq tokens=*" %A in (`curl -L -H "Accept: application/vnd.github+json" -H "Authorization: Bearer <TOKEN>" https://api.github.com/repos/<OWNER>/<REPO>/releases/latest | jq '.assets[] | select(.name == "<EXTENSION NAME>.zip").id'`) DO curl -OJL -H "Accept: application/octet-stream" -H "Authorization: Bearer <TOKEN>" https://api.github.com/repos/<OWNER>/<REPO>/releases/assets/%A

Here are a few more details about it:

  • My extension’s update site lives in a non-public GitHub repo as a release asset.
  • The command consists of two cURL requests against the GitHub API: the first one to get its asset ID from the latest release, the second one to download said asset to the current directory.
  • The first cURL call requires jq to be installed so that the returned JSON can be filtered for the asset ID by the asset name.
  • The FOR DO loop is necessary to squeeze it all into a one-liner.
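
For readers who prefer a script over the one-liner, the same two requests can be sketched in Python with the standard library only. This is a rough equivalent under assumptions, not a tested replacement: the function names are made up for illustration, and owner, repo, asset name and token are placeholders just like in the command above.

```python
import json
import shutil
import urllib.request

API = "https://api.github.com/repos/{owner}/{repo}/releases"


def find_asset_id(release, asset_name):
    """Return the ID of the asset named asset_name in a release dict, or None."""
    for asset in release.get("assets", []):
        if asset.get("name") == asset_name:
            return asset["id"]
    return None


def open_authorized(url, token, accept):
    """Open url with a bearer token that is not forwarded on redirects."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    # Asset downloads may redirect to another host; keep the token off that hop.
    request.add_unredirected_header("Authorization", f"Bearer {token}")
    return urllib.request.urlopen(request)


def download_latest_asset(owner, repo, asset_name, token, target_path):
    """Download the named asset of the latest release to target_path."""
    base = API.format(owner=owner, repo=repo)
    # Request 1: metadata of the latest release, filtered for the asset ID.
    with open_authorized(f"{base}/latest", token, "application/vnd.github+json") as response:
        release = json.load(response)
    asset_id = find_asset_id(release, asset_name)
    if asset_id is None:
        raise LookupError(f"No asset named {asset_name!r} in the latest release")
    # Request 2: the binary content of that asset.
    with open_authorized(f"{base}/assets/{asset_id}", token, "application/octet-stream") as response:
        with open(target_path, "wb") as target:
            shutil.copyfileobj(response, target)
```

The helper find_asset_id plays the role of the jq filter, so jq is not needed in this variant.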

My question is the following:
How can I run this command from within the pom.xml? I read about the exec-maven-plugin, but this seems to go into the <plugins> section, i.e., I cannot simply substitute the <url> tag of each <repository> with it.

If it is even possible, could you please extend your example to accommodate this use case too?

If you recommend a different approach, feel free to share it!