I’d like to report a critical issue I’ve encountered in KNIME Analytics Platform 5.4.4. I have been running the same workflow unchanged for some time, and after upgrading from 5.4.3 to 5.4.4, I observed that it now produces a different output—despite no modifications to the workflow or its settings.
@umutcankurt you mean the deprecated joiner no longer adds a duplicate column to the result? Maybe you can build a workflow that can demonstrate the behaviour.
Thanks so much for reporting! We take this sort of problem quite seriously.
I glanced over the changelog, but nothing obviously stands out to me that could cause this.
I understand that sharing the workflow might not be possible, and that building a smaller example you can share might be too time-consuming, so I would be glad if you could answer a few questions about the workflow.
Is the “join back” Joiner the first and only node that contains different output data? It looks like it has lost the duplicate column and gained a lot more output rows.
How does its configuration look?
Can you show the input tables to the Joiner? I suspect one of the tables might be empty, so there is no duplicate column anymore.
To provide more detail, please see the points below:
1. The section of the workflow I shared is a standard component that I have used unchanged in hundreds of workflows for at least four years.
2. I tested on Ubuntu across two different servers: both ran KNIME 5.4.3 without any issues.
3. On the server running 5.4.4, however:
   a. I first upgraded from 5.4.3 to 5.4.4 and immediately encountered the error.
   b. I then re-installed 5.4.3 (fresh download from the KNIME site, without applying any updates) on the same machine, and the workflow executed perfectly.
4. I have run the exact same workflow (same data, same configuration) on both versions side by side: the problem only occurs under 5.4.4.
5. It is not related to data volume or row count, since all of my active workflows on 5.4.3 process identical datasets flawlessly.
It seems I wrote my message confusingly. I don’t doubt that you see a different behavior when changing only the AP version (on the exact same workflow and data).
I wanted to reproduce the problem locally in order to debug it. For that, I would ideally need a reproduction workflow (as mlauber71 suggested), or enough information from the answers to my two questions to recreate it myself:
Is the “join back” Joiner the first and only node that contains different output data? It looks like it has lost the duplicate column and gained a lot more output rows. How does its configuration look?
Can you show the input tables to the Joiner? I suspect one of the tables might be empty, so there is no duplicate column anymore.
Answering these questions would help me to create a workflow on my end to reproduce the problem.
Unfortunately, with only the screenshot I am not able to reproduce the problem in AP 5.4.4. The Joiner (deprecated) node still works as in 5.4.3 when I load the exact same workflow with the same input data (some test data based on your two Joiner outputs from the screenshot). It is not obvious what the upstream input data looks like or how the upstream nodes could affect the Joiner (deprecated) inputs.
I can also offer a short call where you could show me the affected portion of the workflow in order to find out what may be causing this problem.
It turns out that a bugfix (internal reference UIEXT-2375) for the “twin list” element used in the Column Filter dialog (see screenshot) also fixed a long-standing bug: the “Enforce exclusion” parameter was not honored if a column was present in both the exclusion and inclusion lists. As a result, such a column was output instead of being excluded.
In your case, the Column Filter (selected in the screenshot) has “Enforce exclusion” set and a list of excluded columns that includes “Europe”. At the same time, its inclusion list is overwritten via a flow variable, called var, to contain the “Europe” column (in your original screenshot it was “Africa”). What is going on is less obvious because your node instance is very old and “Enforce exclusion”/“Enforce inclusion” have since been replaced by the “meta column” “Any unknown columns”.
In AP <= 5.4.3, “Enforce exclusion” is ignored for the column “Europe” because it appears in both the exclusion and inclusion lists, so the column is output.
In AP >= 5.4.4, “Enforce exclusion” is honored even if the same column is in both lists.
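To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is only a toy model of the description above, not KNIME's actual implementation: the function `filter_columns`, its `legacy` flag, and the column names “Country” and “Asia” are made up for illustration; only “Europe” and “Africa” come from this thread.

```python
def filter_columns(input_columns, includes, excludes, legacy):
    """Toy model of a column filter with "Enforce exclusion" set.

    legacy=True  mimics the behavior described for AP <= 5.4.3:
                 a column listed in BOTH lists is still output.
    legacy=False mimics the behavior described for AP >= 5.4.4:
                 an excluded column is dropped even if it is also included.
    """
    include_set = set(includes)
    exclude_set = set(excludes)
    kept = []
    for col in input_columns:
        if col in exclude_set:
            # Explicitly excluded; in the legacy model the inclusion list
            # "rescues" a column that appears in both lists.
            if legacy and col in include_set:
                kept.append(col)
            continue
        # "Enforce exclusion": everything that is not excluded passes through.
        kept.append(col)
    return kept


# Hypothetical input table; "Country" and "Asia" are assumed names.
columns = ["Country", "Europe", "Africa", "Asia"]
excludes = ["Europe", "Africa", "Asia"]   # configured in the dialog
includes = ["Europe"]                     # overwritten via flow variable var

old_result = filter_columns(columns, includes, excludes, legacy=True)
new_result = filter_columns(columns, includes, excludes, legacy=False)
print(old_result)
print(new_result)
```

In this model the legacy call keeps “Europe” while the fixed call drops it, which would explain a column going missing further downstream, for example at the Joiner.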
We will add a backwards-compatibility mode for existing nodes (internal reference UIEXT-2761) to restore the buggy, but expected “legacy” behavior. New nodes using this dialog element will exhibit the fixed behavior.
In the meantime, you could use one of two workarounds:
make sure that the included column does not appear in the “Excludes” list
reconfigure the Column Filter by putting “Any unknown columns” into the exclusion list and overwriting the “included_names” via flow variable var again: