Working with Sub Groups

All,

I'd like to be able to split my data into groups based on values in a specific row, and then analyze each group independently.  I can use GroupBy to calculate basic stats, but beyond the basics there isn't much I know how to do.

Use Case 1 - I have a 16x24 array. For each row, I'd like to be able to delete the max and min values and then compute the mean.  Similarly for columns.

Use Case 2 - I have an array with 32,000 rows and 4 columns: Plate, Row, Col, Value.  I'd like to be able to compute the mean across Plate for all rows, cols and values, e.g. the mean of all values in Row1, Col1 across all plates, the mean of all values in Row1 across all plates, etc.

I know how to do this in another popular tool, but I'd like to do this in KNIME.

 

Jay

Hi Jay,

I am not sure about the second problem, but I think the first one can be solved by using the following nodes:

  • Create Collection Column
  • Ungroup
  • GroupBy
  • Math Formula (or Java Snippet)

The trick is to ungroup the numeric data columns first, and then group them again with different aggregation methods, such as mean, min, max and count. Those statistical values can then be used to compute the new mean without the min and max values: (mean * count - min - max) / (count - 2). Sorry, we don't have a single node for this problem; only the scripting nodes like Java Snippet, Perl, and of course R will allow you to do the operation in one step.
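If it helps to check the numbers outside the workflow, here is a rough Python sketch of the arithmetic behind that last step. It is not KNIME code; the function names and the sample values are made up for illustration, and it assumes that exactly one min and one max value are dropped per row.

```python
def trimmed_mean_from_stats(mean, minimum, maximum, count):
    """Mean after dropping one min and one max, using only aggregated stats.

    Hypothetical helper: mirrors what a formula step could compute from
    the mean, min, max and count produced by a grouping step.
    """
    if count <= 2:
        raise ValueError("need more than two values to drop min and max")
    total = mean * count  # recover the sum from mean and count
    return (total - minimum - maximum) / (count - 2)


def trimmed_mean(values):
    """Direct version: drop one min and one max, then average the rest."""
    if len(values) <= 2:
        raise ValueError("need more than two values to drop min and max")
    return (sum(values) - min(values) - max(values)) / (len(values) - 2)


# Made-up sample row, standing in for one row of the 16x24 array.
row = [3.0, 9.0, 4.0, 1.0, 8.0]

from_stats = trimmed_mean_from_stats(sum(row) / len(row), min(row), max(row), len(row))
direct = trimmed_mean(row)

# Made-up 3x4 grid: the same function works per row and per column.
grid = [
    [1.0, 2.0, 3.0, 10.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 9.0, 2.0],
]
row_means = [trimmed_mean(r) for r in grid]
col_means = [trimmed_mean(list(c)) for c in zip(*grid)]  # zip(*grid) transposes
```

Both routes give the same result for the sample row, which is the point of the trick: the grouping step only has to deliver mean, min, max and count, and the formula step does the rest.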

Hope this helps.

Regards, Thomas