How can I apply a filter within a moving window loop?

Hello ,
I need some help for a possible solution.

I have a simple table with 6 columns of interest:

  1. subject
  2. stay_id
  3. date_time_event
  4. start_time
  5. stop_time
  6. value

Possible values can be:

1 1000 2020-09-10 23:43:00 2020-09-10 20:00:00 2020-09-18 20:00:00 0.7
1 1000 2020-09-11 03:26:00 2020-09-10 20:00:00 2020-09-18 20:00:00 1
1 1001 2020-09-14 02:01:00 2020-09-12 12:52:00 2020-09-21 19:36:00 2.1
3 1002 2020-10-11 15:45:00 2020-10-01 12:51:00 2020-11-21 20:00:00 0.9
3 1002 2020-10-28 10:34:00 2020-10-01 12:36:00 2020-11-21 20:00:00 0.8
4 1003 2020-11-11 15:41:00 2020-11-01 12:00:00 2020-11-20 20:00:00 2.7

The purpose is to calculate if there’s an increase of the value of at least 0.3 within intervals of 48h.
So basically I want to group by (subject and stay_id) and then I would like to apply a Window Loop Start time-based (window of 48 hours and step size of 1 hour) verifying that all values retrieved in that 48h satisfy the condition max(value_within_48h) - min(value_within_48h) >= 0.3
The loop should start when date_time_event >= start_time and stop when it is <= out_time.
What I have done til now is to Group By subject and stay_id and List(sort) all date_time_event per group. It would be great now to apply the Window Loop Start on this list, but this can’t be done because of the different type.
The moving window should verify if the increase has occurred, once this is TRUE, I put a new column FLAG with value 1.
What I should output is the following:
subject stay_id date_time_event flag
1 1000 2020-09-11 03:26:00 1
3 1002 2020-10-11 15:45:00 0
4 1003 2020-11-11 15:41:00 0

Thank you for any ideas.

Hi izbll,
first of all you should provide us a table with proper data like the uploaded one: we are here to help each other :wink:
table.xlsx (3.6 KB)

then, try to esplict more in detail your case: it’s not clear how to handle the dates, the checks to be made and so on.

Luca

2 Likes

Hi Luca, thank you so much. Here’s the updated table.
table.xlsx (8.9 KB)
What I want to do is:

  1. while date_time_event <= stop_time

    apply a moving window time-based with window length of 48hour and step size 1 hour (window loop start)
    at each interval of 48h considered, check all the values grouped by subject and stay_id. Whenever there is an increase of at least 0.3 of the values, put a flag of value 1 in a new column.
    I am not interested in unique subjects, that have only one row, one data_time_event, i.e subject 4.
    I am looking for subjects that have more than 1 data_time_event, and per each subject-stay_id group, I want to retrieve the min and max value within each 48 hours window and see if the max(value)-min(value)>= 0.3. At least, this was my approach so far.

What I want to show then is the output grouped by stay_id and subject displaying also the flag column.

Thank you again, I hope it’s more clear now.

48 hour windows based on what field? date_time_event?

Yes, sorry.
I tried the following workflow:

  1. group loop start >
  2. window loop start (time based, 48 hour window with step size 1 hour) >
  3. group by (subject and stay_id, with aggregation min(value) max(value) per group) >
  4. math formula (checking for max(value)-min(value) >=0.3) >
  5. loop end
  6. loop end

the collected results are very redundant and incorrect.

i’ve done exactly your workflow…
windows.knwf (467.0 KB)

so, what is the problem?

1 Like

I had to deal with and process hundred of thousands entry rows and it took me a lot of time to implement this loop example. I also got a lot of redundancies and I was wondering if the results were correct, but I think I fixed it.

Thank you so much.

1 Like

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.