Show duplicates within a column exclude blanks

Darius · November 16, 2020, 7:16am

Hi,

i have a table with 10 columns, want to see the duplicates within first 3 columns, say column 1, 2 and 3.

So show duplicates within : column 1 and column 2 and column 3 (dont include blank rows as duplicates)
I could use duplicate filter (keep duplicates) 3 times
But the issue is the node includes blanks which i dont want to be included.

Any idea how to do this?

Daniel_Weikert · November 16, 2020, 5:56pm

maybe not the best option but you can first split the table into one with and one without nulls then do your dublicate check and concatenate them back

Darius · November 17, 2020, 8:48am

wouldnt work, because if I split blanks it will remove the whole rows, in one column it may be blank but in another column its not blank and I will need that cell.

Darius · November 18, 2020, 9:04am

any idea on this?

tommy · November 18, 2020, 9:33am

hi @Darius,

I would loop over these columns and perform the filter (blanks) and the duplicate filter (keep duplicates) within the loop.
Please have a look at the example workflow from @christian.dietz…

That should serve as a template for you. Greetz, Tommy

mlauber71 · January 24, 2021, 2:16pm

@Darius you could first mark the three columns in question if they have a MISSING value with the Rule Engine. Then exclude these rows that have all columns missing. You could then replace the missings with another value with the missing values node.

Then you might concatenate the three columns into one (new) column. So if one row is unique in all three columns the new column would be identical. And then deal with the remaining duplicates.

You might want to make sure you have a unique RowID or something in oder to be able to bring back your results to the original data (and deal with the remaining columns you want to keep).

system · July 26, 2021, 2:17am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.