Column Expression: Difference between timestamps + Identify last row

Hi everyone,

I’ve been trying to calculate the difference between timestamps and to identify the last row in the table using the Column expression node, but couldn’t find a way. Any tips how to solve this?

I have a column that contains timestamps (Date-Time) and wanted to use the Column Expression Node to calculate the difference between the timestamp in the current row and the succeeding row.

Date-Time Rank NEW
2024-09-10T17:48:53 1
2024-09-10T17:48:54 2
2024-09-10T17:48:55 3
2024-09-10T17:48:56 4
2024-09-10T17:48:57 5
2024-09-10T17:48:58 6
2024-09-10T17:48:59 7
2024-09-10T17:49 8 8
2024-09-10T17:53:01 9
2024-09-10T17:53:02 10
2024-09-10T17:53:03 11
2024-09-10T17:53:04 12 12

The goal is to use the difference between the timestamps to decide whether the value from column Rank should be assigned to column NEW.

If the difference is bigger than 1 second, the value of Rank from the current row should be written in NEW, otherwise NEW is Null. Since there is no succeeding row after the last row, by default the value of Rank should also be assigned to NEW.

if (
(column(“data_collect_timestamp”) - column(“data_collect_timestamp”, +1)) < -1
)
{column(“rank”)}
else if (
current row = last row
)
{column(“rank”)}
else {toNull(“”)}

Have a nice weekend.

Best,
Eren

Hi @Er3n

What’s the logic behind rank 8 though? :slight_smile: 17:48:59 to 17:49:00 seems like 1 second to me just like all of the other rows.

2 Likes

@ArjenEX thanks for your quick reply!

I want to calculate the difference between the current and the succeeding row. So in case of row 8, it’s the difference between row 8 (2024-09-10T17:49) and row 9 (2024-09-10T17:53:01), which is roughly 4min. Thus exceeding 1 second, and so the rank from the current row (row 8) should be written into column NEW.

Edit: Actually the difference would be minus 4min

Hi,
I don’t manage to calculate date&time differences in column expression node (and I don’t know why).
But you can use Lag Column and then use Date&Time Difference. This gives something like that :
image
Use Row filter to eliminate the last row.
Then the expression in Column Expression node :

if (column("date&time diff",+1) > 1)
        {column("rank")}
else
    if (equals(rowIndex(),rowCount()))
        {column("rank")}
       else {toNull("")}

will gives the result you want (including the question of the last row).
image

Hope this could be useful.
Best,
Joel

5 Likes

Hi @Er3n,

Sorry, was too focused on the seconds instead of also looking at the minutes :wink: The solution from @JPollet is a great alternative.

If you still want to go the CE route, you can use:

if ((new Date(column("data_collect_timestamp",1)) - new Date(column("data_collect_timestamp"))) / 1000 > 60 || (new Date(column("data_collect_timestamp",1)) - new Date(column("data_collect_timestamp"))) / 1000 == 0 ) {
    column("rank")
}  else {
   null
}
2 Likes

@JPollet @ArjenEX thanks a lot for your help! Thanks to you two it works now :slight_smile:

@ArjenEX is new Date a function within the Column Expression Node? Don’t completely understand that part of your code:

new Date(column(“data_collect_timestamp”,1))

1 Like

I think new Date is JavaScript, which you can use in column expressions :).

The if statement is also JS… you can even write your own JS functions and have them run in column expressions

1 Like

Yes, the functions that are listed in the node is just what KNIME has prepared for you :slight_smile: As mentioned by Martin, it’s build on JS so you can utilize those capabilities as well. new Date is a bit of a workaround to actually make the calculation work.

2 Likes

@MartinDDDD @ArjenEX thanks a lot for the explanation! Enjoy the rest of your weekend :slight_smile:

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.