I am relatively new to Knime and I am faced with what I consider to be a very difficult task.
We have a machine park that reports interim statuses to us for each order number. There can therefore be any number of interim statuses (journal entries) for an order number.
These entries consist of the order number, a time stamp and the location of the machine.
I have now been given the task of determining the time differences between the respective entries. To make it more difficult, the difference should be determined taking into account the shift times and excluding Sundays and public holidays.
The shift times and public holidays can vary from location to location.
I therefore have three tables available as initial data:
Machine feedback
Public holidays
Shift times
In my previous workflow I have got as far as determining Sundays and public holidays, but I am not sure whether my approach is the right one. Since the machine’s data set only consists of the order number and time stamp, I had to transform these and move the time stamp of the previous entry down by one in order to be able to create a difference later. I then create a time series for each order number from the start and end dates. The time series allows me to easily check the holidays for each individual day.
I’m currently at a loss when it comes to calculating the time differences, taking into account the holidays and service times. There can be several possibilities for the time differences.
Time differences within one day
Time differences that extend over several days, and then come across Sundays and holidays + different service times
I would appreciate any help in solving my problem and am making my initial workflow including data available.
As a result, the time in minutes until the next journal entry was issued should be output for each order number line.
I downloaded your Workflow and had a look on the data. Unfortunately I’m not very familiar with machine time calculations but maybe this helps.
Do I get it right, that you want to calculate the Timedifference between the rows for each group of Column1 and Column3?
E.g. number of seconds between 2024-08-16T16:00 and 2024-08-14T12:53?
But as the machines do not run on public holidays or sundays you need to substract these days?
I created a new branch in your workflow. Maybe this makes sense.
First I calculated the overall time difference between the Start/End dates and then I check row by row if theres any sunday or public holiday in between these dates. If yes I calculate the number of seconds I need to substract to get the correct runtime.
If other corrections are needed, like max number of working hours per day, this can be added of course as well.
First of all, WOW and thank you for your fast reply & help.
The partial solution for holidays or Sundays with the loops is very impressive and shows me that I need to work more closely with loops!
Do you think this approach will also work when it comes to keeping to shift times? Unfortunately, these are even more complex to calculate, which is why I have included a calculation example as an image.
I would have made the following considerations: Calculation for the first day:
Difference between the start time and the end of the shift. Calculation for the days in between:
Only the hours of the shift can be calculated here, e.g. from 10 a.m. to 4 p.m. = 6 hours for one day. Unless it is of course a Sunday or public holiday Calculation for the last day:
Difference between the start and end time of the shift
If a start or end time is not within the shift times, the beginning of the shift should be used for the start time or the end of the shift should be used for the end time.
I don’t know how to thank you for your help.
Thank you so much for your time and effort.
I have optimized a few small things in the workflow and have therefore uploaded the workflow again as an attachment. Maybe someone has the same problems as me .
Optimized:
Group By after deleting the Sunday/public holiday.
Otherwise there were problems with the correct assignment of start and end days
an upstream IF switch node + Same Day.
The Same Day option is used when feedback is received on the same day and the logic for start + end day is not sufficient.
I’m glad that I could help you. Next time when you are in Stuttgart/Germany you can buy me a beer
The GroupBy node is to remove duplicates and order the data or? I would use the “remove duplicates” and “sort” nodes instead. So the “code” is easier to read.