Categorize clients who purchase 3 or >3 orders within 7 days

Dear All,

I have a table shows the returning days among orders for each client. And I would like to use KNIME to create a column to identify which clients purchase 3 or more than 3 orders within 7 days.

For instance, Client purchased 3 orders, the returning days between 1st and 2nd order is 1 day, and 2nd order and 3rd order is 2 days. Thus this client purchased 3 orders within 7 days, and should be categorized as “yes” for the new column.

I was thinking about using the node to check the returning days column which contains 1,6 or 2,5 or 3,4 or 4,3 and so on, but I am unable to do so.

Please see below the table. Look forward to hearing from you, thanks.

Client Number of orders Returning Days (interval) between orders To be created by KNIME - Purchase 3 or more than 3 orders within 7 days
A 3 1,2 Yes
B 4 12,9,7,2 No
C 3 15,8,2 No
D 2 2,10 No
E 1 3 No
F 7 2,5,10,12,6,3 Yes
G 4 22,3,2 Yes
H 2 1,20 No

As a fast solution you can use
string manipulation node with expression
count(replace(\$column2\$, “,”," " ), “1”, “w”)
+count(replace(\$column2\$, “,”," " ), “2”, “w”)

• …+count(replace(\$column2\$, “,”," " ), “6”, “w”)
It will add new column with number of occurrence of numbers lover than 7.
Then use Rule-based Row Filter to get rows with New column value > 1 and Number of orders > 2.

Thank you for your suggestion.

I tried the solution, but it seems it will also count those clients who purchased less than 7 days for the 1st to 2nd order, more than 7 days for the 2nd to 3rd order, and less than 7 days for the 3rd to 4th order. But this client I would like to remove as this client didn’t purchase 3 orders within 7 days in his journey, basically this client didn’t has two intervals together (sum of both < 7 days).

These are the results I used your solution. For instance, Client A should be removed, even though he has one order for 1 day and 1 order for 2 days, but these orders are not followed by each other. Client C is fine, as he had a 1,5 interval pattern in his journey (1+5 is less than 7). Client D should also be removed, as 1+7 is more than 7 days.

Please help, thanks!

Client Returning days among orders Number of orders New column
A 1, 35, 91, 15, 2, 98 6 2
C 15, 60, 154, 32, 1, 8, 95, 26, 221, 69, 2, 54, 241, 58, 56, 47, 25, 34, 1, 5, 207, 116, 26, 312, 183, 3, 134 27 5
D 1, 7, 98, 272, 65, 15, 1500 7 2

I have a hard time understanding you logic since the examples seem not to match the rules you describe. I tried a few things but did not came up with a solution for the examples.

1, 35, 91 what would that mean

1 = 1 day between purchase 1 and 2 then 35 days between purchase 2 and 3 and so on? And now you just want to know how many purchases within 7 days were there? If so I set up a workflow that is maybe not very elegant but you can manipulate it further. It takes every single line and splits the values into separate columns and then loop over the lines to apply some rules.

Maybe you could adapt the example to your needs.