I have been looking and trying for a while now but I cannot transform the data in the way I am aiming for.
Below is a visualisation of it.
I’d like to point out it would report duplicates (ie. C has two times 1). I’d be happy with “fillers” if columns cannot be of different size, as long as they can be identified and filtered downstream.
Any input, node suggestion or keyword to name this kind of operation would be appreciated.
This can be done using a -Pivoting- node:
Please let us know if you need any further help to configure the -Pivoting- node.
Thanks a lot Ael, it is exactly the operation I was after.
I plugged the node into my workflow and with a bit of work I got it right.
By any chance, do you know if there is a way to skip the grouping and maintain raw values?
In my case I am interested in looking at the spread directly in addition to SD or other metrics.
If not, I surely can craft a unique tag per record and use that to avoid grouping.
Glad it helped.
Sorry I didn’t understand what you mean by that. Could you please show us the final table you would like to obtain please ?
You can add a “helper column” e.g with the rank node and group on this.
hope that helps
It might be my configuration of the Pivoting node and I need more experience with it.
My understanding of it so far is that eventually it will process values similarly to the "Group By node.
In my case keeping the raw values rather than descriptive stats about them (mean/median + SD/MAD) would be helpful.
ie. For A the pivoted table would report “Grouped value” of A instead of 1 - 2 - 3.
Daniel just below (thanks!) introduced a Rank node which I believe will provide that extra variable to keep all values separated.
There will be a single value for each rank in every column, therefore grouping by Column and Ranking would preserve raw values.
Apologies if my explanation is not crystal clear, I find the overall transformation I am going for a bit peculiar and can’t put precise words on it.
Though the Pivot node is the key, with some workarounds it got me where I needed - thank you!
Thanks Daniel! I did not know this Rank node, it sure is very handy.
Would you mind sharing the configuration you use for the pivoting node?
I am trying to replicate the result but I am struggling to visualise the impact of: groups - pivot - aggregation.
No problem at all @C-Rey and thanks a lot for clarifying what you wanted to add as information. Indeed, the configuration -Pivoting- node is not obvious. The -Groupby- node is much more intuitive and every time I have the choice among using the -Pivoting- or the -Grouping- nodes, definitely I use the latter.
Please get back in touch if you need further information. Glad to further help.
Hi I currently have KNIME not open but I grouped by the rank (created with the rank node) used the A-B-C Column as pivot fields ( = new columns) and use First as the aggregation type. The ranking was done for each group (A-B-C Column. The group can be specified within the rank node)
hope that helps. Let me know
best regards and take care
Ah yes, I did not see how much control I had over the Ranking node.
I think I have replicated your results, however I feel it might not handle duplicates values very well:
Column C reports a single 7 as equal values will share the same rank within a group.
As a workaround, I have found the RowID node which can inject the RowID into a column (Rank) - thus creating a unique tag for each value:
On larger dataset it will create a massive amount of missing values. It is not the most elegant solution but it puts the data in the exact format I need for the next step. If time allows it I’ll see if I can deal with those but it is not an issue: PRISM is the destination for this data and it can be set to skip empty cells.
Thanks for your help!
I first used the Counter Node which achieves the same result and then switched to Rank for this use case.
Glad that you find the right adjustment for your use case
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.