Multiple groupby

Hi there,

I have another group by problem. I have two tables which can be represented as ;

Table 1:

Col1, Col2, Col3
ID1, Test1, 0.3
ID1, Test2, 0.2
ID1, Test3, 1.0

Table2:
Col1, Col2, Col3
ID1, Test1, 0.3
ID1,Test3, 1.0

How can Iidentify and return back the missing value from table2, that is found in table 1 i.e. ID1, Test2, 0.2 , as the other rows are found in both tables.

Thanks,

Stanage.

Hi, depends which column you want to group. In the easiest case, you referring to column 'Col2'!?! To filter based on this column, you have to move this column as Row ID for both tables (handle duplicates) and then apply the Reference Row Filter, which will remove all duplicate rows from the first table. Do the same again for the second table by using the same node but switching table1 and table2. In both case, you get the rows which are missing in one or the other table. To put them into one table use the Concatenate node.
Regards, Thomas

Hi Gabriel,

Sorry it's a bit more complicated than I described.

So in the two tables;

Table 1:

Col1, Col2, Col3
ID1, Test1, 0.3
ID1, Test2, 0.2
ID1, Test3, 1.0
ID2, Test1, 34
ID2, Test2, 56
ID2, Test3, 12
ID2, Test4. 45

Table2:
Col1, Col2, Col3
ID1, Test1, 0.3
ID1,Test3, 1.0
ID2, Test1, 34
ID2, Test4. 45

The column, col2 contents are not unique, I need to first group by col1, then by col2, then compare, to find the following missing values from table1;

ID1, Test2, 0.2
ID2, Test2, 56
ID2, Test3, 12

So I need to group by col1, col2 to compare the value in col3

Stanage.

Hi, that's why I said in the easiest case :) In your example, I don't really see the point where you need to group your column(s). I would concatenate Col1 and Col2 using the Java Snippet node (e.g.: $Col1$ + "-" + $Col2$), move this column to the RowID (remove it), and then apply the Reference Row Filter (exclude option). The problem here is that you need to filter based on the combination of both columns, I will also send you the sample flow by email. Best, Thomas

Hi Gabriel.

Ok, just another level of complication, the concatenated string col1 + col2 is not unique in table1, and so when you try the join using the joiner, then you do not get complete table1, table2 alignment. The reason for this is that you can have the following situation;

Table1:

Col1, col2, col3
ID1, Test1, 23
ID1, Test1, 45
ID1, Test2, 78
ID1, Test2, 64

etc.

Any ideas?

Stanage

Hi again, with KNIME 2.0 the Reference Row Filter allows to filter based on columns (which are in comparison to the RowIDs don't need to be unique). Cheers, Thomas