Hi, I'm having trouble in calculating an average.
I know it's simple using the mathformula node and the formula "average", but my problem is that I won't have always the same columns, I mean sometimes I might have 20, sometimes 24, etc.. and not always the same names on it, so what I'm trying to do is to give Knime the instruction of calculating an average from the last x-number of columns, regardless its names..
Does someone has an idea on how I can manage to do this in Knime?
Thanks in advance for your help, any comments will be helpful,
The "Column Aggregator" node allows you to select columns either manually, with wildcards or Regular Expressions, or by Type (integer, double, etc.). Would this meet your needs?
I think this might meet my needs but I just need to know about the syntax I should use in the Pattern field when using a regular expression... can I add here a math rule as if I were in a math formula node? example if I want to have the column which has the biggest value I should just write "max" or how does this works?
Before the aggregation you can use the Extract Column Header node to rename your last x columns in a consistent way, which makes it easier to do a Wildcard or RegEx selection in the Column Aggregator node. For example, if you rename your last x columns as XYZ_01, XYZ_02, ... XYZ_x you can use the RegEx pattern XYZ_[0-9]+ to match them all.
What is the criteria according to which you select how many columns you are going to aggregate (in other words the value of x)? That will determine how many columns you will need to rename before sending the table through the aggregation node.
I think I may have a suitable approach:
- Run "extract table spec"
- Sort descending on column index
- Use the sampling node to select "Top X rows"
- Transpose the result
- Use the "Reference Column Filter" (or "... Splitter") with the result as the bottom input, and the original table as the top input (make sure that "ensure compatibility" is unchecked)
- Apply your "Column Aggregator" on the resulting, filtered table with an '*' wildcard
Hope this helps.