I have event data that I would like to aggregate. The original data is in the form:
C1 C2 C3
T1 P1 V10
T1 P2 V20
T1 P3 V30
T1 P1 V11
T1 P2 V21
T1 P3 V31
T2 ... ...
I would like to obtain a new column for each value of column 2 (C2). This column should contain the value in C3, aggregated by C1. Each value of C1 may use a different aggregation function. Here is an example:
C1 P1 P2 P3
T1 V10 V20 V31
I tried the "Pivoting" node, but it seems you can't chose an aggregation function for each value of C2. I then tried to decompose the problem with the node "One2Many". Columns are correctly created but their values are set to 1 (instead of value in C3).
Can anyone help?
so how many values does C1 have?
If there are only a few, I would use a row filter for each value of c1, then use the pivoting only on the subsample and finally concatenate all of the results together.
C1 is my timestamp column. So it may contain a distinct value for each hour, for example.
24 seems too much for individual row filters (at least for my taste).
I think it would work if you create a table with the C1 values and the required aggregation functions as parameters for pivoting or GroupBy and create a loop through those.
In the loop you can apply the row filter to keep only the required rows and the GroupBy can perform the aggregation.
Others might have better ideas though.
In fact C1 is a timestamp, so I give example of hours, but I mean date-hour (and I could add minutes). Let's forget about the aggregation at C1 level (I can do this in a later node).
My main concern is to map a variable domain from a column C2 into separated columns having the corresponding value of C3. Here is an example:
I would like to have the following:
Is there a node for such manipulation? All I can obtain is the following using One2Many node:
Thanks in advance for your help.
It resembles to my Pivot node. You might give it a try. (The Merge node or the Sorter might help to rearrange to the proper order.)
Disclaimer: I developed HiTS. Pivot and Merge are in the KNIME Utilities feature.
I found the way to do it: I added an ID column. Then I use the PIVOT node with my new ID column as group column, C2 as pivot column and C3 in the options (aggregation settings) using "First" as aggregation method.
Thanks for your help!