Via this forum I found out how I can make use of a Flow Variable in combination with the GROUP BY node. The example workflow that I found allows me to ‘group by’ my dataset on only one ‘variable’ column.
If you have a fixed number of grouping columns, you can simply extend your Table Creator node prior to the loop start with new columns containing the grouping column names. See modified workflow: Groupby with flow var.knwf 1.knwf (14.0 KB)
One additional and hopefully last question.
In the settings of ‘your’ flow I see the following:
In your flow variables “0” and “1” appear in the folder grouByColumns->IncList.
Somehow I don’t manage to get that available in my workflow. What I get is this:
I included my workflow. Please find attached.
Many thanks for getting me going!
It depends on the preselection in the node settings. If you’ve selected 1 group column you’ll get 1 field in the Flow Variables tab, if you select 2 group columns you’ll get 2 fields, etc.
I hope that in the future this setting will become more dynamic.
Thank you for your responses. Even after doing what you said, I still don’t get the multiple selection possibilities in the Flow Variables tab. Please see below.
Could you please advise? There must be something that I am not doing correctly…
The strange thing is that when I load the example workflow from @ipazin this feature is working. But when I use it for my own purposes/on my own dataset, the functionality gets lost. Also when I build it from scratch in exactly the same way. Very strange.
I received an example workflow from @ipazin in which he uses a GROUP BY node in combination with Flow Variables. In the tab ‘Flow Variables’ the number of possible columns to select (the variables) moves along with the total columns you drag into the “Groups” tab within the GROUP BY node. This would be normal.
However, when I do the same in my KNIME workflow, the number of columns to select as a variable under “IncList” does not change. So I can’t use multiple columns as variables.
I am using version 4.1.0. Do you perhaps have any idea what is going wrong here?
Many thanks for reading and for your reaction in advance!
Regarding the Flow Variables tab: the setting you have corresponds to KNIME version 4.1 and higher. The workflow I sent you is the one I downloaded from your original post and modified. You said you found that example, so my guess is that this workflow was built prior to KNIME 4.1. Why 4.1? Because this version introduced new flow variable types, including a collection type that is now available in the GroupBy node for both included and excluded columns. This actually makes it easier to control the GroupBy node dynamically. Unfortunately, this is not the case when using a loop, as Table Row to Variable Loop Start doesn’t yet support the new flow variable types.
Now to the workaround, which isn’t too bad. Using Chunk Loop Start, you process the table row by row and create a flow variable of Collection type inside the loop using the Table Row to Variable node.
In each iteration the GroupBy node outputs a different table, because the grouping columns differ, so you need to check the Allow changing table specifications option in the Loop End node.
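For readers who find it easier to follow in code, here is a rough Python analogue of what the loop does. It is only a sketch and not KNIME code: the data, the column names (region, product, sales) and the helper group_and_sum are all made up for illustration. Each entry in grouping_sets plays the role of one row handled by Chunk Loop Start, i.e. one collection of grouping column names.

```python
from collections import defaultdict

# Hypothetical input table as a list of row dicts (illustrative names only).
rows = [
    {"region": "N", "product": "a", "sales": 10},
    {"region": "N", "product": "b", "sales": 20},
    {"region": "S", "product": "a", "sales": 30},
    {"region": "S", "product": "a", "sales": 40},
]

# One entry per loop iteration: the collection of grouping column names,
# comparable to the Collection-type flow variable created inside the loop.
grouping_sets = [["region"], ["region", "product"]]


def group_and_sum(rows, group_columns, value_column):
    """Group rows by the given columns and sum value_column per group."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[column] for column in group_columns)
        totals[key] += row[value_column]
    # Rebuild one output row per group: grouping columns plus the aggregate.
    return [
        {**dict(zip(group_columns, key)), value_column: total}
        for key, total in totals.items()
    ]


# Each iteration yields a table with a different set of columns, which is
# why the results are collected per iteration instead of being stacked
# into a single table with one fixed schema.
results = [group_and_sum(rows, columns, "sales") for columns in grouping_sets]
```

The first result has the columns region and sales, the second has region, product and sales. That difference in output columns between iterations is the reason the Loop End node has to accept changing table specifications.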