Best way to calculate

Hello,

I am trying to calculate the time difference between Ship Date to Status Date, but I want calculate the difference for each group of equipment ID by ship date below. So, calculate the time difference if the ship date and equipment ID are the same, but as soon as both of those change, then start calculating the time difference again. I know that I need to use a loop of some sort, but I cant seem to find the right now.

Here is the example data:

Car ID|Status date|Status Code|Ship date

|ACFX099426|2019-10-09T11:18|R|2019-10-08T01:25|
|ACFX099426|2019-10-12T01:10|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T01:26|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T01:56|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T03:11|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T04:05|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T04:23|A|2019-10-08T01:25|
|ACFX099426|2019-10-12T04:24|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T04:50|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T06:37|A|2019-10-08T01:25|
|ACFX099426|2019-10-12T06:45|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T07:42|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T08:18|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T08:30|A|2019-10-08T01:25|
|ACFX099426|2019-10-12T08:48|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T09:13|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T09:45|A|2019-10-08T01:25|
|ACFX099426|2019-10-12T09:46|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T09:49|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T10:29|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T11:14|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T11:50|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T11:53|A|2019-10-08T01:25|
|ACFX099426|2019-10-12T11:54|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T12:38|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T12:48|P|2019-10-08T01:25|
|ACFX099426|2019-10-12T12:58|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T01:05|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T02:06|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T02:36|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T04:41|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T04:53|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T05:33|P|2019-10-08T01:25|
|ACFX099426|2019-10-13T05:43|A|2019-10-08T01:25|
|ACFX099426|2019-10-13T12:37|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T05:27|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T05:42|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T06:11|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T06:52|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T08:25|A|2019-10-08T01:25|
|ACFX099426|2019-10-14T08:29|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T09:08|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T10:11|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T11:12|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T11:27|A|2019-10-08T01:25|
|ACFX099426|2019-10-14T11:28|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T11:49|P|2019-10-08T01:25|
|ACFX099426|2019-10-14T11:58|P|2019-10-08T01:25|
|ACFX099426|2019-10-15T02:15|A|2019-10-08T01:25|
|ACFX099426|2019-10-15T02:19|H|2019-10-08T01:25|
|ACFX099426|2019-10-15T10:57|9|2019-10-08T01:25|
|ACFX099426|2019-12-19T05:24|R|2019-10-08T01:25|
|ACFX099426|2019-12-21T11:31|J|2019-10-08T01:25|
|ACFX099426|2020-01-07T03:23|R|2019-10-08T01:25|
|CRGX029170|2019-10-15T03:36|R|2019-10-08T01:30|
|CRGX007491|2019-10-18T02:12|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T02:25|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T03:11|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T04:27|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T05:11|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T06:25|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T06:40|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T06:47|A|2019-10-08T01:30|
|CRGX007491|2019-10-18T06:50|J|2019-10-08T01:30|
|CRGX007491|2019-10-18T07:49|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T08:30|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T10:15|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T11:55|P|2019-10-08T01:30|
|CRGX007491|2019-10-18T12:27|A|2019-10-08T01:30|
|CRGX007491|2019-10-19T01:01|R|2019-10-08T01:30|
|CRGX007491|2019-10-19T04:37|R|2019-10-08T01:30|
|CRGX007491|2019-10-19T04:37|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T05:07|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T06:54|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T07:26|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T07:44|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T08:58|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T09:49|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T10:48|P|2019-10-08T01:30|
|CRGX007491|2019-10-19T11:05|A|2019-10-08T01:30|
|CRGX007491|2019-10-19T11:11|H|2019-10-08T01:30|

Hello @Einayyar,

if I got it right and you are interested in difference between two dates that are in same row you can use Date&Time Difference node:

Br,
Ivan

1 Like

Thank you for that information, I am using that node but I want the time difference to stop calculating once the Equipment ID and ship date change and then start calculating again. I hope that makes sense.

Hello @Einayyar,

not really. Can you say what is expected output in above example?

Br,
Ivan

1 Like

It seems to me that this could be accomplished by using the Date&Time Difference node inside of a a GroupBy loop. You’d group by Equipment ID and ship date.

1 Like

What I would expect for a result is the ACFX car mentioned above to take the difference between the Ship date to the status date, then every row after that calculate the difference between each of the status dates from the row above all the way until the ACFX car # changes to the CRGX car.

So row one would be a difference of a day then row two would be a difference of 16 mins. Row three would be a difference of 30 mins, etc. Then when the CRGX comes with a different ship date change, the difference calculation starts over again. CRGX29170 would be difference of 7 days. Then CRGX7491 would be a difference of ~50 minutes etc.

1 Like

Thank you for that. I will try it again. Last time it did that, it only gave me one result.

This gave me what I needed, but only did it for one car so I ended up with only 119 rows when I had 6802938 to start with. Any thoughts on getting around that?

Hi @Einayyar , it is still not clear what you are trying to achieve.

Based on what you explained, from what I understand, the car ID should not matter then. You just need to know what’s the difference between the column Status date and the column Ship date of each row, correct?

Like you said, first row will give you around 1 day, and second row will give you 16mins etc. It sounds like you want to just treat each row separately on their own. In that case, you don’t need a loop. You just need to use the Date&Time Difference node, and Knime will automatically process all the rows for you.

You just need to decide what’s the granularity of the difference (days, hours, minutes, weeks, etc). You should go to the lowest granularity that you need. For example, if you want to see days, hours and minutes, then you should get all the differences in minutes, and then you can convert the results into hours and days. If you go higher than minute, for example hours, you will get 0 hour for the second row as 16 minutes is less than an hour.

EDIT: Just to add… If you want to know the total of the differences per Car ID, you still need to do the above, and then you can do a groupby on the Car ID, and apply a SUM on the differences. Again, you might need to do the sum as minutes first, and then convert to hours/days/weeks after.

1 Like

I am sorry I am not being clear. The attached is what is what I would expect. Not that a 0 appears once a new equipment ID and ship date appear. I would do a time differential to get the duration but I would mark it as comparing the status date before and the difference between the two. Once the equipment ID and ship date changes it shouldn’t compare to the row above it.Knime Example.xlsx (28.4 KB)

Hello,

I was able to figure this out. I just used the group by node and I was (in error) thinking that it was only applying the calculation by the first equipment ID, but with the end loop node it was then applying to all groupings.

I will try to be more clear the next time I need assistance. Thank you all!

1 Like

Hi @Einayyar , thank you for the file.

So, from what I see, you just need the column equipment_number and status_date, correct? The column ship_date is not used, is that correct?

You just want the difference between each row of the status_date column, and in hours and minutes for granularity, and that, restarting for each equipment_number, is that correct? That’s what I see your column E doing based on the formula.

The best node to use here is the Lag Column node.

Here’s something I put together that looks like this:
image

And here are the results:

Note: Don’t mind the 1900-01… in the date&time diff column, it’s how Knime is interpreting the “time” for > 24:00 from your file - it basically sees it as an invalid time, anything greater than or equal to 24:00 is an invalid time, although I tried to enforce it to read it as a string.

The one I generated is the next column, the “New date&time diff”, which you can compare with it.

It gives the same results.

Here’s the workflow: Calculate date difference between each row of CarID.knwf (43.3 KB)

EDIT: After I posted my workflow, I saw what you posted, and I was curious about how you were calculating the difference between rows, as it is most of the time not possible within a node. I found out that the Date&Time Difference has an option to calculate between the current row and the previous row! That’s a nice feature. This means that I don’t need to do a Lag for the status_date anymore. I want to keep the one on the CarID because it will help me detect a change in the CarID.

Updated workflow looks like this now:
image

image

And here’s the updated workflow: Calculate date difference between each row of CarID.knwf (41.8 KB)

I would say ditch the loop. It’s unnecessary here. You would want to use a Group Loop if you are doing things per group, like sum for each group for example. Here, you just want to initialize the first difference to 0, so you just want to detect where that change is happening, you don’t need a loop to do this. The loop is much slower and is not needed.

1 Like

Thank you so much Bruno! I will trop the loop node then and give that a shot.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.