Filter rows that have a count of 100 or more

TigerCole · October 16, 2017, 10:48am

Hi,

I have a CSV file with 2.1 million records. There are 25 columns, one of which contains an identifier (MSISDN) that I want to use as a filter variable. I want to filter all the records where the MSISDN occurs more than x times; in this case x = 100.

In the attached image I have a "Value Counter" node that counts the occurrences of each MSISDN in the column. I need to use this count as the criterion for a row filter (or splitter) to get the reduced dataset.

Does anyone have a workflow or some idea of how this can be achieved?

Thanks.

tC/.

Knime_MSISDN.jpg

Iris · October 16, 2017, 11:37am

You need a Reference Row Filter.

To do that, first filter the counted values down to the ones you want to remove, and then apply the Reference Row Filter on this column of the original table.
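
For anyone who wants to see the logic outside of KNIME, here is a rough Python sketch of the same count, filter, then reference-filter pattern. It is only an illustration, not a KNIME workflow: the function name `filter_by_frequency`, the toy rows, and the sample MSISDN values are made up, and the threshold is lowered to 2 so the example stays small. It keeps the rows whose value is frequent; to remove them instead, flip the final membership test (the equivalent of switching the Reference Row Filter between include and exclude).

```python
from collections import Counter


def filter_by_frequency(rows, key, threshold):
    """Keep rows whose `key` value occurs more than `threshold` times."""
    # Step 1 (like Value Counter): count occurrences of each key value.
    counts = Counter(row[key] for row in rows)
    # Step 2 (like a Row Filter on the counts): keep values above the threshold.
    frequent = {value for value, n in counts.items() if n > threshold}
    # Step 3 (like Reference Row Filter): keep original rows whose key
    # appears in the reference set built in step 2.
    return [row for row in rows if row[key] in frequent]


# Hypothetical toy data; the real table has 25 columns and x = 100.
rows = [
    {"MSISDN": "27820000001", "bytes": 10},
    {"MSISDN": "27820000002", "bytes": 20},
    {"MSISDN": "27820000001", "bytes": 30},
    {"MSISDN": "27820000001", "bytes": 40},
    {"MSISDN": "27820000003", "bytes": 50},
]
kept = filter_by_frequency(rows, key="MSISDN", threshold=2)
```

With the real data you would call it with `threshold=100`. The point is that the counting happens on a separate, small table, and the original 2.1 million rows are only touched once in the last step.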

Cheers, Iris