There were a couple of parts to it. Firstly identifying 4 consecutive non-sales weeks. To do this, I read the file in and replaced the missing values with zeroes, and then used Column Combiner to combine all the sales columns into a single cell.
This then gives you a string of ones and zeroes for each customer. As itās a string, itās possible to inspect each for a sequence of 4 zeroes. You could use a variety of nodes to achieve that, in a way that then informed you if it matched or not. I chose to use String Replacer and just used the wildcard *0000* as that was simple. You could use Regex. Where a match was found, it replaced the cell with DISCONTINUED.
A row splitter than split the two into groups based on the presence of DISCONTINUED.
The second part was how to bring the two groups back into the table you required. I chose to add a new row number (using Counter Generation) so each groupās rows were numbered consecutively. The āCustomerā column name then got renamed for each group. These could then be joined back together using a Full Outer Join on ācounterā. After a tidy up to remove
superfluous columns, we get the result you requested:
WARN MISSING Counter Generation 7:10 Node canāt be executed - Node āCounter Generationā not available from extension āKNIME Data Generationā (provided by āKNIME AG, Zurich, Switzerlandā; plugin āorg.knime.datagenerationā is not installed)
Ah OK, I guess that counter generation is an extension. Iām not at my pc at the moment but you can either find counter generation on the hub and install it, or you could probably replace each of the counter generation nodes with a rowid node (telling it, I think, to replace rowid in both cases and then change the join node to join on rowid) that should have the same result.
If each customerās data is spread over several rows, then after the Missing Value node, you can group all your rows by customer using the groupby node, and get the maximum value for each column. Since for any customer, a column will have a mix of 1ās and 0ās, the maximum for that column (by customer) will tell you if there it needs to be a 1 or a 0 in the column.
Attached is an updated flow. Iāve replaced the Counter Generation with RowID as previously suggested. KNIME_sales_consecutive_weeks 2.knwf (84.3 KB)
If it is only about the customer identification you could do it using unpivoting the data, use a lag of 4, aggregate the lag columns and count the missings and group by the max missing. If it is 4 the customer is discontinue
Note: I āborrowedā @takbb 's workflow because there was no data attached.
Please knime team make it compulsory that a question needs to have data attached!
Hi @Daniel_Weikert, good idea on the āmissing countsā. Are you able to upload that workflow. Iām trying to work it through but Iām feeling lazy