Retrieve data that has been repeated three times or more in a specific column

ihisawi · June 20, 2024, 10:27pm

Hi guys,

Let’s say I have a table with >1000 rows, and there is a specific column X where multiple repeated data points can be present.

What I need is that if there is any data in column X that has been repeated three times or more, I would like to obtain the data for the entire row in a new table.

Could you help me with any suggested workflows or nodes, or even a Python script that I can integrate into KNIME?

ArjenEX · June 20, 2024, 10:47pm

Hi @ihisawi

Which “entire row” are you referring to? Is that the first, the third, all of them, etc.?

It makes quite a difference for a potential solution. A workable example with (anonymized) input and expected output would help a lot.

rfeigel · June 21, 2024, 1:57am

Assuming you want to keep all rows whose value in the selected column occurs >= 3 times, try this:

You can filter out the columns in the join output that aren’t from the original table. I left them in this example for QA.
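
For anyone who prefers the Python route mentioned in the question, here is a minimal pandas sketch of the same idea: count how often each value occurs in the column, then keep every row whose value occurs at least three times. The function name `rows_with_repeated_values`, the column name `X`, and the sample data are made up for illustration. It uses a value count mapped back onto the rows rather than an explicit join, and the KNIME wiring in the comments is an assumption based on the `knime.scripting.io` module of recent KNIME versions, so adjust it to your setup.

```python
import pandas as pd


def rows_with_repeated_values(df, column, min_count=3):
    """Return all rows whose value in `column` occurs at least `min_count` times."""
    # Occurrences per distinct value (missing values are not counted).
    counts = df[column].value_counts()
    # Attach each row's own count, then keep the rows that reach the threshold.
    per_row_count = df[column].map(counts)
    return df[per_row_count >= min_count]


# Hypothetical sample data: "a" occurs three times, "b" twice, "c" once.
df = pd.DataFrame(
    {
        "X": ["a", "b", "a", "c", "a", "b"],
        "Y": [1, 2, 3, 4, 5, 6],
    }
)

result = rows_with_repeated_values(df, "X")
print(result)  # keeps the three rows where X == "a" (Y = 1, 3, 5)

# Inside a KNIME Python Script node the wiring would look roughly like this
# (assumption, not tested against a specific KNIME version):
#   import knime.scripting.io as knio
#   df = knio.input_tables[0].to_pandas()
#   knio.output_tables[0] = knio.Table.from_pandas(rows_with_repeated_values(df, "X"))
```

Rows keep their original order and all original columns, so there are no extra count columns to filter out afterwards.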

ihisawi · June 21, 2024, 12:15pm

This is the perfect solution for what I wanted.
Thank you so much!

rfeigel · June 21, 2024, 2:14pm

You're welcome. Glad I could help.