Calculate difference to previous row

Hi everyone,
I have to calculate the difference between the value of a field and that of the field in the previous row in the same column.

My data is an event log, I want to calculate differences between timestamps and the data is sorted by the timestamp column in ascending order.

In addition, I am looking for a function that works with filters or drill downs in the data set, as the events are categorised and I need to calculate the differences not simply between the rows but calculate the timestamp differences of events in the same group of categories.

Below table illustrates the data structure.
My target value is in Column E, with the expected categorised timestamp-differences. I assume there will be a NULL value whenever there is no previous row available.

As I have a set of close to 2 mill rows with more than 1,000 possible groups / category-combinations, manual splitting of the data by these groups and applying the function thereafter would not be practical.

Can anyone help?
Many thanks in advance,
Thomas

COL A	COL B	COL C	COL D	COL E
ID	timestamp	Cluster1	Cluster2	TimeInterval

ROW 1 1 19.10.2015 20:12:00 A x [NULL]
ROW 2 2 19.10.2015 20:16:00 A x B2-B1
ROW 3 3 19.10.2015 20:17:00 A x B3-B2
ROW 4 4 19.10.2015 20:22:00 B x [NULL]
ROW 5 5 19.10.2015 20:23:00 B x B5-B4
ROW 6 6 19.10.2015 20:25:00 B z [NULL]
ROW 7 7 19.10.2015 20:27:00 B z B7-B6
ROW 8 8 19.10.2015 20:29:00 B x B8-B5
ROW 9 9 19.10.2015 20:30:00 A z [NULL]
ROW 10 10 19.10.2015 20:32:00 A z B10-B9
ROW 11 11 19.10.2015 20:33:00 A z B11-B10
ROW 12 12 19.10.2015 20:34:00 A z B12-B11
ROW 13 13 19.10.2015 20:34:00 A z B13-B12
ROW 14 14 19.10.2015 20:38:00 A z B14-B13
ROW 15 15 19.10.2015 20:38:00 A x B15-B3
ROW 16 16 19.10.2015 20:39:00 A x B16-B15
ROW 17 17 19.10.2015 20:41:00 A x B17-B16
ROW 18 18 19.10.2015 20:42:00 B z B18-B7
ROW 19 19 19.10.2015 20:44:00 B z B19-B18
ROW 20 20 19.10.2015 20:46:00 A y [NULL]
ROW 21 21 19.10.2015 20:52:00 A y B21-B20

1 Like

Hi @ThG2020 and welcome to KNIME Forum

I would suggest to use a the Lag Column node within a GroupLoop, see date_time_diff_group_loop.knwf (43.3 KB)


gr. Hans

4 Likes

Hi Hans, many thanks, this was precisely what I was looking for. I had not considered the loop functions, but this does the trick!
Kind regards
Thomas

3 Likes

Glad I could help. Happy KNIMEing

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.