I’m new to Knime and I was wondering if someone can give me some hints on how to achieve the below:
So I have the ID column which can have multiple records of the same ID, representing the statuses that it’s been through. I am trying to measure the time spent by each ID in each status (number of days).
There are a couple of conditions:
For each occurrence of an ID, determine the no. of days by calculating the date difference using next row: Value[Index + 1] - Value[Index].
For example 1st occurrence of ID 1111: [Date Entered (row 4)] - [Date Entered (row 3)]
If it’s only 1 ID record (e.g. row 2) or it’s the last occurrence of an ID (e.g. row 5), check if Cleared Date exists and calculate [Cleared Date] - [Date Entered]. If the Cleared Date is missing, use the current date for calculation instead.
This is part of a longer analysis which I’ve made in Python… and now I’m required to convert it all to a Knime workflow so I just got stuck on this part.
Programmatically it seems easy, but I can’t wrap my head around Knime nodes to achieve this, so I’d be thankful for any suggestions.
There are probably more elegant ways to do this, but here’s what I did step by step:
Use a Date&Time Input node to generate the current date as a flow variable
Read in data
Convert Dates to Date&Time format
Replace any empty cells in the Cleared Date column with the current date, make sure that the resulting column is in Date&Time format
Start a Group loop, grouping on ID
Invert the order of the data, which allows for use of the previous row for calculating the difference between the values in the Date Entered using a Date&Time difference node. I couldn’t find a way to access a the next row for the calculation, so this was a workaround
If there’s only a single row, or if it’s the first row (which was originally the last row before the inversion), then the result of the Date difference calculation will be a missing value. I use another Date&Time difference node that calculates Cleared Date - Date Entered for each row
Merge the columns from the previous 2 steps, giving the results of Step 6 priority.
Clean up any negative signs
End the loop, concatenate all the results, and re-sort
I’ve tested the workflow now and it works wonderfully. Thank you very much for your time and effort.
Regarding point 6,
I was digging meanwhile and I’ve seen that it’s doable with a Java snippet node, but I’m required to avoid scripting in this (no python scripting nodes, no nothing ugh…) so your approach is clever, I like it.
As for elegance, I couldn’t care less haha the way you did it is more than fine, I’m frustrated enough by the requirements on this, so elegance is my last problem right now.