I'm quite new to KNIME and I'm trying to calculate the order frequency in days at the user level, but I haven't found out how to do it in KNIME. Could you please help me? This is what my data looks like:
User ID    Order Date
111222     11/01/2015
111222     25/04/2015
111222     05/04/2016
222222     25/01/2014
222222     02/06/2015
222222     15/08/2015
222222     02/06/2016
333333     01/02/2016
333333     18/03/2016
333333     05/05/2016
I added a 'lag column' with a lag of '1' for the 'Order Date', but I need to do it at the User ID level. How should I continue? Or should I use a different node instead of the lag one?
Hi, welcome to KNIME! I am sure you will love it as you learn more.
Please see the attached workflow which should solve your task.
It uses a Group Loop node to iterate over the order dates of each user, calculate the difference in days between consecutive dates, and finally calculate the mean of those differences. The End Loop node collects the results for each user and produces the final table.
Note that in order to use the Time Difference node to calculate the difference in days between orders, the dates first need to be converted to a Date Time data type.
There are of course other ways to achieve the same result, for example using a Java Snippet node, but this one should be pretty straightforward for you.
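For anyone who wants to check the numbers outside KNIME, here is a minimal sketch of the same logic in plain Python. It is not the attached workflow, only an illustration of the steps described above: group by user, parse the dates, take the differences in days between consecutive orders, and average them. The function name `mean_days_between_orders` and the day/month/year date format are my assumptions, based on the sample data in the question.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the question: (User ID, Order Date as dd/mm/yyyy).
orders = [
    ("111222", "11/01/2015"), ("111222", "25/04/2015"), ("111222", "05/04/2016"),
    ("222222", "25/01/2014"), ("222222", "02/06/2015"), ("222222", "15/08/2015"),
    ("222222", "02/06/2016"),
    ("333333", "01/02/2016"), ("333333", "18/03/2016"), ("333333", "05/05/2016"),
]


def mean_days_between_orders(rows):
    """Return {user_id: mean gap in days between consecutive orders}.

    Hypothetical helper for illustration; users with a single order get None.
    """
    # Step 1: group the parsed dates by user (the "Group Loop" idea).
    dates_by_user = defaultdict(list)
    for user_id, date_text in rows:
        # Assumes day/month/year, as in the sample data.
        parsed = datetime.strptime(date_text, "%d/%m/%Y").date()
        dates_by_user[user_id].append(parsed)

    result = {}
    for user_id, dates in dates_by_user.items():
        # Step 2: sort, then pair each order with the next one (the "lag" idea).
        dates.sort()
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        # Step 3: average the gaps for this user.
        result[user_id] = sum(gaps) / len(gaps) if gaps else None
    return result


mean_gap = mean_days_between_orders(orders)
for user_id, value in mean_gap.items():
    print(user_id, value)
```

With the sample data, user 333333 has gaps of 46 and 48 days, so the mean is 47 days; user 111222 has gaps of 104 and 346 days, so the mean is 225 days.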
If I understand the task correctly, you expect a table with one row per user containing the number of different days on which that customer placed an order?
111222 : 3
222222 : 4
The most straightforward way is to use the "GroupBy" node: group at the User_ID level and aggregate the date column with either the Count or the Unique count method (depending on whether two orders on the same date should count as two or as one).
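To make the difference between the two aggregation methods concrete, here is a small plain-Python sketch of what the grouping computes. It is only an illustration under my own naming (`order_counts` is a hypothetical helper), not something the GroupBy node exposes.

```python
from collections import defaultdict

# Sample rows copied from the question: (User ID, Order Date).
orders = [
    ("111222", "11/01/2015"), ("111222", "25/04/2015"), ("111222", "05/04/2016"),
    ("222222", "25/01/2014"), ("222222", "02/06/2015"), ("222222", "15/08/2015"),
    ("222222", "02/06/2016"),
    ("333333", "01/02/2016"), ("333333", "18/03/2016"), ("333333", "05/05/2016"),
]


def order_counts(rows):
    """Return {user_id: (count, distinct_count)} of order dates per user."""
    dates_by_user = defaultdict(list)
    for user_id, date_text in rows:
        dates_by_user[user_id].append(date_text)
    # count keeps repeated dates; distinct_count collapses them via a set.
    return {
        user_id: (len(dates), len(set(dates)))
        for user_id, dates in dates_by_user.items()
    }


counts = order_counts(orders)
for user_id, (count, distinct_count) in counts.items():
    print(user_id, count, distinct_count)
```

In the sample data no user ordered twice on the same date, so both numbers are identical for every user; they only differ once a date repeats within a user.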