I'm trying to do some basic Math on values in a column. Say for example I have a data set:
RowID Value1 Value2
I need to find the difference between adjacent values in the "Value2" Column i.e 34-32=2. I understand Column Math is straightforward, but whats the best way to go about doing Row wise Math?
Thanks a lot!
Not as easy as the standard column calculations!!
You could transpose the table with "Transpose" node, and now your rows are columns! So you can use the Maths node like you would normally.
Then simply transpose back again.
Alternatively if you may have more than 2 rows, the above procedure becomes more difficult, instead you could use the Delegating Loop Start and Delegating Loop End nodes. This involves Variables and Loops so I hope you are familiar with using these. The plan is to pass the information to the next row for the calculation to be done.
First generate a new column called "Subtraction column" and fill it with zeros. THis is easily done with the Maths node, just use 0 as the Maths expression, then;
Add a Delegating Loop Start.
Now use a Row Splitter in this loop, passing the first row at the top out port only, the rest go to the bottom port. Then with the one row data from the first outport you convert your original number column to a variable with "TableRow To Variable", and then use an "inject variables" node to pass the variable into, the data in-port to this inject variables node should come from the second thread of data earlier (i.e. all but the 1st row which is the second out port of the row splitter). This is so that the number from the first row can now be in the next row. Going back to this top data stream (the one with just one row), now do a Maths node and do the calculation of "Subtraction Column"-"Original Number Column" and append a new column called "Result", attach this to the "Delegating Loop End" node at the top port (Data from the top port is collated together at the end).
Now going back to the second data stream after the "Inject Variable" node, we want this variable as the subtraction column. So first use a column filter to remove the original "Subtraction Column" which just has zeros in it (we have this column of zeros for the very first row in the table so the Maths calculation will not fail earlier), now use a Maths node to generate a new "Subtraction Column" with the same name, use the variable as the Maths Expression. Hey presto we now have the data from the first row on the second (and any other rows but dont worry about this). Now pass all this to the Delegating Loop End second port (the second port gets recycled back to the loop start for processing now with the data from the first row and this second row now becomes the NEW first row if you know what I means).
This should do it, apologies its a little complicated.
I am also facing similar situation. Thank you for your solution. I appreciate your both ideas: the transpose and the looping. I have done this in another way i.e.
when the Lag Column appends the shifted version, then the operation simply becomes the difference of two cells in the same row.
Having said that, I still believe that we should have better approach for computing multiple (neighbouring) rows. Something similar to GroupBy but not exactly the "grouping" operations.
Do you know about the Moving Aggregation node?
With a window length of 2 and the option Range you get the difference between the previous and the current row.
Thanks for your quick reply, and more interestingly, the idea of using Moving Aggregation for this purpose.
KNIME is amazing :-)
I too like moving aggregation of window size 2, but range does not give you the difference but the absolute difference (sign/direction is lost). Lag column and math node substraction works but is a hassle if you have multiple columns. I have the feeling I am overseeing something - feature request or any hints are appreciated, thank you!
the way I do this is use the Moving Aggregation node on a window length of 2, pick for the same column the "First" and the "Last", this gives me two more columns for each row, with the first and the last, and then send the results to a Math Formula to do the calculation. If there are a lot of multiple columns I use a column list loop to cycle through each. If I need to do a calculation on say the previous 10 number, then I use the moving aggregation and select list to create the list of numbers, and use a java snippet to read in the list for each row and do the calculation.
Instead of Transpose, there is also Pivot/Unpivot ...
As I have 1M+ rows (time series data), so neither transpose nor pivot/unpivot is a viable option for me. The closest I get to a knime like solution is to work with a column list loop and some regex rename and laging. I also added some quickform nodes for convenience. Together with a wrapped meta node and the “save as template” option this works well for me until I have something better (way of accessing a previous value in math node or similar): Multi Column Manipulation.knwf (102.0 KB)
I think Lag Column node could be useful for the task.