KNIME people,
I have a dataset (image above) where I need to be able to determine if there is an overlapping of events based on multiple conditions. What I currently do to obtain my results is manually selecting all items of only one “company”, then selecting all occurrences of one same “short description” and after applying those two filters, sorting by “sys_created_on” from oldest to newest. With this first part, I have everything set up to start marking those lines that overlap (see the example marked in green).
Correct exercise: (marked in green) since the company is always “Public Storage” AND Short description is always “Host::PSA026647_PM_0015::Down” the second occurrence started at 02:21 while the first occurrence ended at 03:08, there is overlapping. This is what I need to mark automatically.
Incorrect exercise: (marked in yellow) since the second occurrence happened at 00:39 while the previous occurrence was at 04:36 but “company” are different and “short description” are different from each other as well.
Summary: I need to be able to mark overlapping of events taking into consideration 3 or more variables (“company” & “short description” and “created on”. I am attaching my current Knime Example.knwf (45.4 KB) where I am able to mark overlapping using Lag Column node, but so far I can only make it work based on just “created on” and not taking into consideration just the same company and the same short description. (Incorrect exercise marked in yellow)
One way I have it working is manually applying row filter nodes to filter out just one company and one short description at a time, but those companies and short description will increase over time, and then it doesn’t seem to be a practical way to automate this.
Thank you in advance for any guidance.