Identify consecutive 4 weeks no sale and group them

Dear Knimer,

Please need help to identify where there is no sale happen 4 consigutive weeks and group them into two groups

Customer 2020-12-11 2020-12-18 2020-12-25 2021-01-01 2021-01-08 2021-01-15 2021-01-22 2021-01-29 2021-02-05 2021-02-12 2021-02-19 2021-02-26 2021-03-05 2021-03-12 2021-03-19 2021-03-26 2021-04-02 2021-04-09 Expected result in Knime
A 1 1 1 1 1 1 1 1 1 1 1 1 Customer Row A and C should be grouped
B 1 1 1 1 1 1 1 1 1 1 Customer Row B and D should be grouped
c 1 1 1 1 1 1 1 1 1 1 1 1 Customer Row A and C should be grouped
D 1 1 1 1 1 1 1 1 1 1 Customer Row B and D should be grouped

  1. Customer Continued
  2. Customer Discontinued

Expected result in Knime

Customer Continued Customer Discontinued
A B
C D
Customer Continued Customer Discontinued
A B
C D

image

Thank you

Hi @Prag_Mor

I’ve attached a possible solution to this.

There were a couple of parts to it. Firstly identifying 4 consecutive non-sales weeks. To do this, I read the file in and replaced the missing values with zeroes, and then used Column Combiner to combine all the sales columns into a single cell.
image

This then gives you a string of ones and zeroes for each customer. As it’s a string, it’s possible to inspect each for a sequence of 4 zeroes. You could use a variety of nodes to achieve that, in a way that then informed you if it matched or not. I chose to use String Replacer and just used the wildcard *0000* as that was simple. You could use Regex. Where a match was found, it replaced the cell with DISCONTINUED.
image

A row splitter than split the two into groups based on the presence of DISCONTINUED.

The second part was how to bring the two groups back into the table you required. I chose to add a new row number (using Counter Generation) so each group’s rows were numbered consecutively. The “Customer” column name then got renamed for each group. These could then be joined back together using a Full Outer Join on “counter”. After a tidy up to remove
superfluous columns, we get the result you requested:
image

KNIME_sales_consecutive_weeks.knwf (72.2 KB)

2 Likes

Thanks a lot @takbb

Getting below warning

WARN MISSING Counter Generation 7:10 Node can’t be executed - Node “Counter Generation” not available from extension “KNIME Data Generation” (provided by “KNIME AG, Zurich, Switzerland”; plugin “org.knime.datageneration” is not installed)

and unable to execute

Ah OK, I guess that counter generation is an extension. I’m not at my pc at the moment but you can either find counter generation on the hub and install it, or you could probably replace each of the counter generation nodes with a rowid node (telling it, I think, to replace rowid in both cases and then change the join node to join on rowid) that should have the same result.

Sure @takbb will search for ‘Counter Generation’ node

If values are scattered around different column and name is same, how can we concatenate

Customer 2020-12-11 2020-12-18 2020-12-25 2021-01-01 2021-01-08
A 1
A 1 1
A 1

I used group by node but its not helping

Expected result

Customer 2020-12-11 2020-12-18 2020-12-25 2021-01-01
A 1 1 1 1

image

1 Like

If each customer’s data is spread over several rows, then after the Missing Value node, you can group all your rows by customer using the groupby node, and get the maximum value for each column. Since for any customer, a column will have a mix of 1’s and 0’s, the maximum for that column (by customer) will tell you if there it needs to be a 1 or a 0 in the column.

Attached is an updated flow. I’ve replaced the Counter Generation with RowID as previously suggested.
KNIME_sales_consecutive_weeks 2.knwf (84.3 KB)

@takbb thanks a lot i am able to complete with your help

Used constant value column and Concatenate node instead of RowID,Column Rename, Joined and Column Filter

2 Likes

If it is only about the customer identification you could do it using unpivoting the data, use a lag of 4, aggregate the lag columns and count the missings and group by the max missing. If it is 4 the customer is discontinue

data2

br

Note: I “borrowed” @takbb 's workflow because there was no data attached.
Please knime team make it compulsory that a question needs to have data attached!

2 Likes

Hi @Daniel_Weikert, good idea on the “missing counts”. Are you able to upload that workflow. I’m trying to work it through but I’m feeling lazy :wink:

@Daniel Thank you I will try your logic

KNIME_sales_consecutive_weeks.knwf (128.5 KB)
You did the heavy lifting so all credits to you Brian.
edit: You are feeling lazy? :joy:
br

3 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.