Ungrouping multiple columns

#1

Dear knime users,

I have a hierarchical organized list, I like to create a workfow which allows me to switch between the hierarchical levels. The grouping works fine to me, but not the ungrouping.

Example:
ID1___ID2___ID3___ID4…
G1____M1___a_____Z
G1____M1___a_____Y
G1____M1___b_____X
G1____M2___b_____X

GroupBy ID2 - for Aggregation “Concatenate” and “;” as delimiter.
This yields:
ID1_________ID2___ID3____ID4
G1;G1;G1____M1___a;a;b__Z;Y;X

Alternatively I used the GroupBy with “Unique concatenate”, this yields
ID1___ID2___ID3____ID4
G1____M1___a;b____Z;Y;X

I guess with the “Unique concatenate” information for the ungrouping gets lost?

However, now I want to reverse the workfow to yield the initial format.
Since the ungrouping has to be done with many columns, the “Cell Splitter” “Ungroup” approach did not work for me.

Any suggestions?

Kind regards

Kiro

0 Likes

#2

Hi there Kiro,

Instead of Concatenate you can use List Aggregation and then you can use Ungroup node to revert to your original data format. If you need to use Concatenate then you will have to use Cell Splitter node for each column on which you performed Concatenate aggregation to create list column. After that you can use use Ungroup node.

Yes.

Try it out and if any question feel free to ask.

Br,
Ivan

1 Like

#3

Hi @ipazin,
thank you, that helped me!
Unfortunately, it causes trouble at later parts of the workflow - output to Excel.
The workflow I have at the moment is:

The output of the grouped list (middle excel sheet) now contains only the Group column (defined in the GroupBy Node).
Approaches to change the list format to string like "Split Collection Column” - “Column combiner” as described here:


do not work for me since I have many columns and I do not want to have additional columns.
Also, I want the data aggregated like in the list format, so that I can reverse the workflow (also when data are exported, altered and imported back to knime) but now I need a way to export the data. :wink:

So in summary; “concatanate” (GroupBy) works for me to reduce the list, but than reversing the workflow does not work and “List” (GroupBy) allows me to reverse the workflow but than I can not export the aggregated list.

Any ideas?

Kind regards and thanks a lot!

Kiro

0 Likes

#4

Hi there Kiro,

not quite sure I understand what you are trying to accomplish but why not separating Middle and Last Excel sheet branches? Add another GroupBy node that would go from File Reader to Reference Column Resorter (Node 1067). And also if result of GroupBy and Ungroup nodes yield same data as File Reader do yuo need them?

Br,
Ivan

0 Likes

#5

Hi @ipazin,

yes, sounds weird what I try to accomplish :wink:
The point is, that I have an software output (File reader) which I like to condense to make it more useful and human readable for using the data outside the software e.g. in an excel sheet.
However, if someone adds data into specific columns into the excel sheet, these additional data should be easily transferable back to the software, therefore I need a workflow which is reversable (like list aggregation via GroupBy) but also can be exported.
Hope it makes sense now :grimacing:

Kind regards and thank you for your support!

Kiro

0 Likes

#6

Hi there Kiro,

sry for delay on this topic. It got buried away :slight_smile:

Anyways have you managed to have some progress with it? To be honest still doesn’t make sense too me but that doesn’t mean it makes no sense for you to do it :smiley:

Can you try with some numbers like input data, output data, and what happens if something is added to Excel Sheet. That might help me understand and thus help.

Br,
Ivan

0 Likes

#7

Hey @ipazin,

thank you for remembering!
I had a solution, but I was not happy with it. Upon your request I started to re-think it and I found a solution, I can live with. But not sure if is the optimal or most elegant solution.

So here again my problem and how I solved it:
Maybe this time I will be able to explain what I am doing ;))

The Original List (Table 1) contains several levels of hierarchy (in the example H1-H3),
Each level contains several Values or Identifier Columns (in the example ID1-ID3).
The first level, H1, contains a very rough grouping, the second level, H2, is interesting for
the human user, the third level, H3, is important for the software.

Software output (Table 1)

Reduction of the software output to the second hierarchic level (Table 2)

This is the desired „human readable“ Output. Unfortunately the columns aggregated via GroupBy – List can not be exported.
However, since the H3 is not so important for the „human readable“ reference table, it is acceptable to not have the data from H3.
So this is the output I get, when I export Table 2 to excel:

Table 3
grafik

This Output might be manually edited e.g. replacing B to J in the “H2_ID2” column.
However, it seems as long as one column e.g. H2_ID1 remains unaltered, this output can be combined with Table 1 with the joiner node. By selecting in the joiner node which column shall be used from which source (all “List Columns” from Table 1, all other columns from Table 3), the new List is equal to Table 1 besides the manual edit and can be imported back to the software.

Kind regards

Kiro

0 Likes

#8

Hi Kiro,

tnx for detailed explanation. Think I got it and seems to me you did manage it well :wink:

What I can add is that you can take a look (if you haven’t already) at the reporting tool integrated within KNIME called BIRT. Maybe there you will find more possibilities if needed for these kind of reports.

Br,
Ivan

1 Like

#9

Hi @ipazin,

thank you for supervising it!

And thanks for the link, looks really cool!

Kind regards

Kiro

1 Like