Duplicate row filter that also checks substrings

Hi All,

I have a column of single words that I would like to filter. It is similar to a duplicate search, but instead of removing only exact matches I would also like to filter out substring matches.

For example, given the words “accomplish”, “accomplished”, and “accomplishment”, I need to keep only the “root” word, “accomplish”; all the others must be removed from my list.

Could you please help me with how to do this?

Thank you for your help in advance!

Hi @Cameleao , have you tried adapting this workflow: Stanford Lemmatizer Example – KNIME Hub ?

The workflow will need some adjustment, since its goal is to standardize words rather than to remove them, but the core of what you need is there.
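
If the plain substring rule from the question is enough and lemmatization is not needed, the idea can be sketched in a few lines of Python, for example inside a Python scripting node. This is only a minimal sketch and not part of the linked workflow: the function name `keep_root_words` and the sample list are made up for illustration, matching is assumed to be case-sensitive, and empty strings are assumed to be unwanted.

```python
def keep_root_words(words):
    """Keep only words that do not contain a shorter kept word as a substring."""
    # Drop empty strings and exact duplicates, then visit shorter words first
    # so that a root such as "accomplish" is seen before its longer variants.
    unique = sorted({w for w in words if w}, key=lambda w: (len(w), w))

    roots = []
    for word in unique:
        # A word is dropped as soon as any already-kept root occurs inside it.
        if not any(root in word for root in roots):
            roots.append(word)
    return roots


# Hypothetical sample data based on the example in the question.
words = ["accomplished", "accomplish", "accomplishment", "banana", "accomplish"]
print(keep_root_words(words))  # ['banana', 'accomplish']
```

Note that the result is ordered by word length, not by the original row order. Pure substring matching is also quite aggressive: a short word such as "he" would remove "the" and "other". Replacing `root in word` with `word.startswith(root)` restricts the rule to prefixes, which is closer to the "root word" idea.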

Thank you @badger101 , this workflow helped me a lot! :)
