row filter

I have two tables, each with one column. I want to filter out the rows of the first table whose column value contains any value from the column of the second table.
Example:
Table 1
Rowid Column
1 this is the first row
2 this is the second row
3 this is the third row
…
n this is the nth row

Table 2
Rowid column
1 second
2 fifth

My desired result is Table 1 without rows 2 and 5.
Any suggestions on how to handle this? Thanks in advance!

Hi @Hawk326040

Thanks for your question!

Did you try the joiner node?

Here are also some videos explaining the node configuration:
https://www.youtube.com/results?search_query=KNIME+joiner+node

Best,
Martyna


Hi there @Hawk326040,

I think your example with rows can be a bit confusing. Nevertheless, here is a workflow that should do the trick, if I understood you correctly:

https://kni.me/w/DhVHz4LKxY-KgkK8

A similar issue (all in one column) was discussed here, if you want to try a different approach: Remove rows that are contained (as sub-strings) in other rows

Br,
Ivan


@Hawk326040,

I had a similar requirement a few months back. I accomplished it using KNIME nodes, but it was extremely slow when working with a large dataset. So I implemented it in R instead and used that in KNIME. Here is how:
https://sites.psu.edu/saqib/2019/08/02/calculating-the-longest-ngrams-from-set-of-ngrams/

Saqib
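
For readers who prefer a scripted route, the substring filter from the original question can be sketched in Python. This is only an illustration, not the R code or any of the workflows linked in this thread: the function name `filter_rows` and the sample lists are hypothetical, and it assumes case-sensitive matching with both columns already loaded as lists of strings.

```python
import re


def filter_rows(rows, terms):
    """Keep only the rows that contain none of the terms as a substring."""
    # Drop empty terms: an empty string would match every row.
    terms = [t for t in terms if t]
    if not terms:
        return list(rows)
    # Build one alternation pattern so each row is scanned only once.
    # re.escape treats every term as literal text, not as a regex.
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    return [row for row in rows if not pattern.search(row)]


# Sample data modeled on the example in the question (hypothetical subset).
table1 = [
    "this is the first row",
    "this is the second row",
    "this is the third row",
]
table2 = ["second", "fifth"]

kept = filter_rows(table1, table2)
print(kept)
```

Combining all terms into a single pattern avoids a nested loop over both tables, which may help with larger inputs. Inside KNIME, similar logic could presumably be placed in a scripting node, but that is not shown here.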

Hi Martyna, Ivan & Saqib,
Thanks for sharing your experience and the useful workflow. Based on it, I created a new workflow, and its execution time is acceptable when running on a large dataset.
Here is my workflow in KNIME Hub:
https://kni.me/w/I4KGrafgcXTgpb7E


Hi,

Good to hear you found a solution for your issue, and especially that you shared it on the Hub!
Very cool!

Best,
Martyna

Nice one @Hawk326040!
