Calculation with values previous row (generated by a formula) and current row

i am supposed to generate the last 4 columns (referring as result columns E to H) using the data from the first 4 columns (referring as data columns A to D).
The first row in the result column is copied from the first row of the data column.
The formulas for column 5 and 6 (grouped_create_date and grouped_id) is as below from excel.
Formula is cell E5 = =IF(AND(AND($C2=$C3,$D2=$D3),$A3-$E2<30),$E2,$A3)

Formula is cel E6 = =IF(AND(AND($C2=$C3,$D2=$D3),$A3-$E2<30),$F2,$B3)

Cell E(n) and Cell F(n) is a copy from Cell C(n) and Cell D(n).

Data is as below.

create_date id accountid country grouped_create_date grouped_id accountid country
6/2/2016 OPP1 A1 Australia 6/2/2016 OPP1 A1 Australia
5/25/2017 OPP2 A1 Australia 5/25/2017 OPP2 A1 Australia
5/26/2017 OPP3 A1 Australia 5/25/2017 OPP2 A1 Australia
5/26/2017 OPP4 A1 Australia 5/25/2017 OPP2 A1 Australia
9/22/2017 OPP5 A2 Netherlands 9/22/2017 OPP5 A2 Netherlands
9/22/2017 OPP6 A2 Netherlands 9/22/2017 OPP5 A2 Netherlands
12/27/2017 OPP7 A2 Netherlands 12/27/2017 OPP7 A2 Netherlands
2/16/2016 OPP8 A3 China 2/16/2016 OPP8 A3 China
2/17/2016 OPP9 A3 China 2/16/2016 OPP8 A3 China
3/14/2016 OPP10 A3 China 2/16/2016 OPP8 A3 China
4/11/2016 OPP11 A3 China 4/11/2016 OPP11 A3 China
4/25/2016 OPP12 A3 China 4/11/2016 OPP11 A3 China
6/2/2016 OPP13 A3 China 6/2/2016 OPP13 A3 China
3/2/2016 OPP14 A4 Australia 3/2/2016 OPP14 A4 Australia
3/2/2016 OPP15 A4 Australia 3/2/2016 OPP14 A4 Australia
3/16/2016 OPP16 A4 Australia 3/2/2016 OPP14 A4 Australia
5/19/2016 OPP17 A4 Australia 5/19/2016 OPP17 A4 Australia
6/24/2016 OPP18 A4 Australia 6/24/2016 OPP18 A4 Australia

I am able to use the lag column and rule engine node. To generate the nth row in the result column (5 and 6), how do I store the variables from (n-1)th row so that I can use in the formula above?

I saw a post on Java snippet variable (understood that to be a substitute of lag column). Saw on recursive loop but I am not sure how to apply here.

Appreciate your time.

Thanks,
Shalini

Hi,

To get data from the rows (N) and (N+1) into the same row, you only need Moving Aggregation node with appropriate settings - see attached workflow. Then you are able to apply any rule/math operation.

Martin K.

image

Moving_Aggregation.knwf (3.7 KB)

2 Likes

Hi @shalinirs,

If I understood your problem correctly, you can either calculate the values using the current and the previous row values (which is a bit tricky to do, but possible, so please let me know if another option doesn’t work for you for some reason) or using the current row and the next row values.

For example in your formula you’re using the values from the current row (C2, D2…) and from the next row (D3, A3…).

In the latter case you can use the Lag Column node with step = 1 or as @Martin_K suggested, the Moving Aggregation node with window length = 2, Forward window type and the “Last” aggregation method.

Best,
Anna

1 Like

Hi,

I tried the moving aggregation node. However, I needed to calculate the values in Column 5 (grouped_create_date) for every row taking values from the current and previous row (which I did through the Lag column) and taking the result from previous row in Column 5.
I managed to do with a python node that iterates over a pandas dataframe and updates the values for the result column depending on the current and previous row value.