Using groupby to split columns

Hi there,

I have just started using KNIME and I came upon the following problem in my workflow:

I would like to group certain replicates in a column and make individual columns out of these groups.

For example:

Compound / fold change
A / 1.1
A / 1.3
A / 1.2
B / 1.7
B / 1.8
C / 0.8
C / 0.7
C / 0.3

should become something like this:

A / B / C
1.1 / 1.7 / 0.8
1.3 / 1.8 / 0.7
1.2 / – / 0.3

I tried using groupby node, but I can only aggregate the values and not sort them into new columns. How would you approach this task?

Thank you in advance.

Hi,

Please find a workable, loop-based flow attached below. If your tables aren’t too large this should perform OK.

Cheers,
E

Groups to cols.knwf (12.8 KB)

1 Like

Hi there @z_thiel,

welcome to KNIME Community!

You can try following as well. In GroupBy node choose List as Aggregation method. Then use Split Collection Column node to have values in separate columns. Using RowID node you can have you Compound names transferred to row ids and follow all that with Transpose node :wink:

GroupAndSplit

Br,
Ivan

3 Likes

That’s it! Thank you very much!

@Ergonomist: Your worklflow was also helpful on a reduced dataset, but I have around 10 million entries in my final dataset, so it might be a bit too slow.

3 Likes

Very elegant, thanks!

1 Like

Hi there,

glad it helped!

Br,
Ivan

I may risk looking like a smartass here, but why not use the Pivoting / Unpivoting nodes? They are the most underused nodes in the whole repository I think…

image

KNIME_project.knwf (8.1 KB)

5 Likes

Nice one @Aswin! :+1:
I think people don’t understand how to use it properly and get confused by configuration and 3 output ports after that. (Me included as I always have to play with it a bit before I use it :smiley:)
Br,
Ivan

1 Like

@ipazin I only ever use the first output :smile:

1 Like

@Aswin so ignoring other output ports does the trick :smiley:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.