Filter rows that have a count of 100 or more



I have a CSV file with 2.1 million records. There are 25 columns, one of which contains a unique identifier (MSISDN) that I want to use as a filter variable. I want to filter all the records where the MSISDN occurs more than x times (in this case, x = 100).


In the attached image I have a "Value Counter" node that counts the occurrences of each MSISDN in the column. I need to use this count as the criterion for a row filter (or splitter) to get the reduced dataset.


Does anyone have a workflow or some idea of how this can be achieved?





You need a Reference Row Filter node.

First filter the Value Counter output down to the values you want to remove, then apply the Reference Row Filter to the original table, using this column as the reference.

Cheers, Iris
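
For anyone who wants to check the logic outside the workflow, the same three steps (count, filter the counts, reference-filter the original rows) can be sketched in plain Python. This is only an illustration of the idea, not part of the KNIME workflow: the function names, the default column name "MSISDN", and the file paths are assumptions, and the threshold is treated as "more than x".

```python
import csv
from collections import Counter


def filter_by_count(rows, key="MSISDN", threshold=100):
    """Keep only the rows whose `key` value occurs more than `threshold` times."""
    rows = list(rows)
    # Step 1 (Value Counter): count how often each key value occurs.
    counts = Counter(row[key] for row in rows)
    # Step 2 (filter on the counts): build the reference set of frequent values.
    reference = {value for value, n in counts.items() if n > threshold}
    # Step 3 (Reference Row Filter): keep rows whose key is in the reference set.
    return [row for row in rows if row[key] in reference]


def filter_csv(in_path, out_path, key="MSISDN", threshold=100):
    """Two-pass variant for large files: count first, then stream the matches."""
    # Pass 1: count the key values without holding the rows in memory.
    with open(in_path, newline="") as src:
        counts = Counter(row[key] for row in csv.DictReader(src))
    reference = {value for value, n in counts.items() if n > threshold}
    # Pass 2: copy only the rows whose key is in the reference set.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(row for row in reader if row[key] in reference)
```

To drop the frequent MSISDNs instead of keeping them, change the last test to `not in reference`; to match "100 or more", change `n > threshold` to `n >= threshold`.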