Lag Based on Specified Order

Hi there,

Is there a way to Lag based on a specific order? Please see example. I am able to use a loop and lag based on the group. The problem is that I need the Lag to happen based on the event time and the group. In the screenshot Line 23, the lagged even time should be blank, but it seems to be pulling the time from a prior group. The desired output is the last column.
KNIME - Sample Data.xlsx (9.5 KB)

I am attaching sample data.

Not sure whether I got the challenge correct.
Would you like to see the previous time of GM or CL in the blank cells?
Then ist might help to change the order sequence: First group, then time.
But you will not get a first entry.

So,

The idea is to Lag the Event Time based on the groups, but should be in the order of the Event Time. I have color coded the groups that should go together based on the event time.

Hi etorres182,
I had the same requirement: lag a column separating data in groups.
I didn’t use a loop (same behaviour you noted), instead I used:

  • RANK NODE: it gives you the ability to treat data in groups
  • COLUMN EXPRESSION: to calculate lag column
  • JOINER: to (self-)join the table in order to have column & calculated lag column on the same row.

image

I know that it’s not so simple, but it works well with me.
Hope this helps.

2 Likes

@cristina_piussi

Thanks for the help! I am still having a bit of trouble getting it to work. I don’t know if you mind taking a look at what I am doing wrong. Would appreciate it if you can take a look at it. I am attaching a copy of the sample flow.

Thank you!
Sample KNIME FLOW.knar.knwf (29.0 KB)

@etorres182
I used the workflow to see the data but it’s using data from your personal. folder :frowning:
What exactly is the input you are processing?
What is the expected result that you would like to have?
What data shall be filled in the yellow cells?

KNIME - Sample Data.xlsx (10.0 KB)

Sorry. Here is the data.

In the example below, I provide a table with my current output [Lag (Current)] and my desired output [Lag (Desired)] columns. I am interested in creating a lag for the event time based on event time order, need to make sure that the Groups do not get out of order. The issue is that Group CL1 is currently not being lagged based on time, its based on the group, so that is where I am struggling.

Personal ID Event Time Groups Lag (Current) Lag (Desired)
99999999 4:18:58 PM GM1
99999999 4:21:03 PM GM1 4:18:58 PM 4:18:58 PM
99999999 4:21:42 PM CL1
99999999 4:25:04 PM CL1 4:21:42 PM 4:21:42 PM
99999999 4:25:40 PM GM2
99999999 4:25:45 PM GM2 4:25:40 PM 4:25:40 PM
99999999 4:26:14 PM CL1 4:25:04 PM
99999999 4:26:23 PM CL1 4:21:42 PM 4:21:42 PM

@knimediger, please see response. Thanks for your help!

Hi @etorres182

Before you can use the GroupLoop you have to identify the records where the value of Groups switches. See this wf lag_based_on_specific_order.knwf (71.0 KB)


gr. Hans

3 Likes

@etorres182
Thanks for the additional information.
That Data is at first glance slightly different from your previous data.
My solution would be:
Sort the data by datetime and apply a lag with group and Datetime.
Then check whether the group did change. In this case clear the lagged datetime.

You could also use Counter Generation to record the necessary processing order, then do a group loop (on Groups) + sort on counter generation + lag, then just sort the combined data by your Counter Generation order after your loop to insure you return to input row order. It probably won’t process as fast, but it will help you apply your logic and see results in a very simple way.

@HansS , that worked exactly as expected. Especially since I have millions of records that need to go through that. Worked perfectly and learned something new. Thank you!

Thank you for everyones support. Really appreciate the help.

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.