Sometimes I don’t like the way KNIME changes a node’s settings in response to how upstream nodes change its input data. I know this is how KNIME works, and it would be difficult to decide where to keep this behavior and where not.
For example, I don’t like what happens when I open the configuration of a Row Filter node while the input data no longer contains the column the node was configured to filter by. It simply switches itself to another column. Pretty confusing.
By contrast, I like the way the Column Filter behaves in this case. It simply marks the missing columns in red, as you can see here:
Now let’s consider a complex workflow full of logic, nodes, and components. And let’s say we have a Joiner node joining two tables with tens of columns. At design time, we decided to filter the output and chose a number of columns to keep. So we unchecked the Always include all columns option and picked a column or two to include. Let’s have a look at a simplified example:
Let’s imagine our logic is based on joining and filtering, so we have lots of such joiners. At some point, we might rename the table columns. Capitalizing them is a good example. Btw, sticking strictly to either lower or upper case throughout the whole workflow has worked well for me. Well, let’s change the case and look at the node’s configuration. This seems OK. It’s wrong, but we know what to fix:
But this is a problem. We have no clue. The node has lost all its Column Selection settings, and we have no idea how to configure it again:
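To make the difference concrete, here is a tiny plain-Python analogy. It is not KNIME code and says nothing about how KNIME is implemented internally; the function names `resolve_silently` and `resolve_with_report` are made up for illustration. The first one forgets configured columns that are no longer in the input, the second one keeps them visible as missing, which is roughly what the red markers do for you.

```python
def resolve_silently(configured, available):
    """Keep only configured columns that still exist; forget the rest."""
    available_set = set(available)
    return [name for name in configured if name in available_set]


def resolve_with_report(configured, available):
    """Return (kept, missing) so the lost column names stay visible."""
    available_set = set(available)
    kept = [name for name in configured if name in available_set]
    missing = [name for name in configured if name not in available_set]
    return kept, missing


# Hypothetical example: the columns were renamed to upper case upstream.
configured = ["id", "name"]
available = ["ID", "NAME", "CITY"]

print(resolve_silently(configured, available))  # prints []
kept, missing = resolve_with_report(configured, available)
print("still missing:", missing)  # prints still missing: ['id', 'name']
```

With the second variant you at least know which names to look for after the rename. With the first one, the original selection is simply gone.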
So my conclusion is: never filter columns in a Joiner node. Always include all columns and use Column Filter nodes for column filtering instead, unless you can always re-create your workflow from memory.
- Never rename anything. Well, some people act this way. Keep away from them.
- Back up your workflows using a VCS, or make backup copies of your workflow prior to any refactoring.
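If you don’t have a VCS at hand, a backup copy can be as simple as copying the workflow folder. Below is a minimal sketch, assuming the workflow is stored as an ordinary directory in your workspace on disk and that it is closed in KNIME while you copy it. The function name `backup_workflow` and the paths in the usage comment are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_workflow(workflow_dir, backup_root):
    """Copy a workflow folder to <backup_root>/<name>-<timestamp>."""
    src = Path(workflow_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"Workflow folder not found: {src}")

    # Timestamp in the folder name keeps earlier backups side by side.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"

    # copytree creates dest (and missing parents) and fails if dest exists.
    shutil.copytree(src, dest)
    return dest


# Usage (hypothetical paths, adjust to your own workspace):
# backup_workflow("knime-workspace/MyWorkflow", "knime-backups")
```

Run it before you start renaming columns, so you can open the old copy and look up how each Joiner was configured.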