Detect change over time - a question

Hi all!

Being relatively new to KNIME and being fairly successful for the past weeks in working myself into it, I am finally struggling and also search in Google and the forum did not point me in the right direction. So I kindly ask you as the more advanced users for help.

I do have a table that looks like this (sorry, cannot share the table/workflow here, as it contains company information)

Key date | Person
2020.01 | A
2020.02 | A
2020.03 | A
2020.04 | A
2020.05 | A
2020.01 | B
2020.02 | B
2020.01 | C
2020.02 | C
2020.04 | C
2020.05 | C
2020.02 | D

Basically I do have a data snapshot every month (“key date”) and a person’s ID (“Person”).
A person does not necessarily exist in every snapshot.

–> I want a new column indicating whether that person in that snapshot is either “new” or “left last month”.

A person is

  • “New”, when she/he appears in a snapshot but not in the previous snapshot. First snapshot is ignored
  • “left last month”, when she/he has been present in the last snapshot but not in the current one

Result table shall look like this:
Key date | Person | Change indicator
2020.01 | A | -
2020.02 | A | -
2020.03 | A | -
2020.04 | A | -
2020.05 | A | -
2020.01 | B | -
2020.02 | B | -
2020.03 | B | left last month
2020.01 | C | -
2020.02 | C | -
2020.03 | C | left last month
2020.04 | C | new
2020.05 | C | -
2020.02 | D | new
2020.03 | D | left last month

I get that I generate a list of all available snapshots with a “Group By” on “Key Date”. I could use that list as variable to loop through every snapshot. But how do I formulate a rule comparing current key date with previous key date, I am thinking of a matrix Key date over Person and somehow accessing items in that matrix via row/column index? Or is there anyways a smarter way to approach this?
Especially tricky seems to be that there is no data set for a person in the snapshot AFTER that person has left. So it needs to be “created”. Simple in my matrix thinking that later only needs to be “un-pivoted”…

Thanks a lot for your support and ideas. I am lost right now :blush:

Hi @jsbitz, welcome to KNIME Forum

I created this wf detect_change_over_time.knwf (60.0 KB)


It uses a table with reference dates to create new records in your data. My workflow doesn’t match exactly your expected output, because every Person has an indication of “new” even if they start at 2020.01. But anyway I hope it gives you some inspiration and direction for a final solution.

gr. Hans

4 Likes

Thanks, @HansS, for your swift response. I guess I get the idea. Will provide more comprehensive feedback later or tomorrow but I am positive that this leads the way to solve my challenge. THANKS A LOT !!!

1 Like

Thanks, @HansS.

Your approach works fine, I just adapted it to my needs, joined all the persons’ data also to the data sets when they have already left etc. Brilliant! :smiley:

THANKS A LOT for your time and brain!

4 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.