Comparing all values of one column to all values of another


Is there a way to compare two columns from different datasets so that each row of one is compared against every row of the other, and not just one row against one?

Thank you for your help.



You can use a cross join to create a table with all n x m combinations:
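To illustrate what a cross join produces, here is a minimal sketch in plain Python rather than KNIME. The sample values and the `cross_join` helper are made up for illustration; they only loosely follow the kind of data discussed in this thread.

```python
from itertools import product

# Hypothetical sample values, not the poster's real data.
left = ["apple", "carrot", "peas"]
right = ["apples", "peas", "orange", "carrots"]


def cross_join(a, b):
    """Return every (a_value, b_value) pair, i.e. len(a) * len(b) rows."""
    return list(product(a, b))


pairs = cross_join(left, right)

# Once every combination exists as a row, the comparison is a simple filter.
matches = [(x, y) for x, y in pairs if x == y]
```

The result grows as n x m, so two tables of 10,000 rows each already yield 100 million pairs.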

– Philipp


Hi there @Shaller,

welcome to KNIME Community!

Sure, there are ways, but can you tell us a bit more: what are you comparing, and what is the goal of this comparison?


You are right. More information will make it easier to help solve my problem.

I have an input file with the following data structure and example rows:

Category  | specific | category  | specific
Fruit     | apple    | fruit     | orange
Vegetable | carrot   | fruit     | apple
vegetable | carrot   | vegetable | peas

The other table contains information such as:

Name  | Category   | specific | group
Peter | Fruit      | apples   | group1
Peter | vegetables | peas     | group1
Sarah | Fruit      | apples   | group1
Sarah | vegetables | peas     | group1
Tom   | fruit      | orange   | group2
Tom   | vegetables | carrots  | group3
Peter | vegetables | carrots  | group3

The children are grouped according to the food they eat. I want to take my first list and compare it to the second, e.g. check whether a group of kids that likes apples also likes oranges.

There is a small hiccup in my data: in the category group it says "fruits, apple", etc.
However, this could always be fixed with a split of some kind.
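A rough sketch of such a split, assuming the combined value looks like "fruits, apple" (category first, then the specific item, separated by a comma). The `split_category` helper is hypothetical and written in plain Python; in KNIME, a node such as Cell Splitter could play the same role without code.

```python
def split_category(value):
    """Split 'fruits, apple' into ('fruits', 'apple').

    Values without a comma are returned as (value, None).
    Both parts are trimmed and lower-cased so that 'Fruit' and 'fruit' compare equal.
    """
    head, sep, tail = value.partition(",")
    category = head.strip().lower()
    specific = tail.strip().lower() if sep else None
    return category, specific
```

For example, `split_category("Fruits, apple")` gives `("fruits", "apple")`.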

In my opinion there are two ways to go about it:

First, I could loop over all rows per group, potentially splitting the groups into separate tables. This process would have to be repeated many times.

Alternatively, I could convert every group into a row that contains lists of names and lists of fruit and vegetables. However, I haven’t figured out how to do this in KNIME.

I just struggle to find the correct nodes and, more importantly, the node sequence.

I want non-“programmers” to be able to use and edit my workflow, so I want to use as little code (snippets) as possible.

Thank you for your help.


Thank you for your suggestion.
This approach works and I used it; however, I fear that with a larger data set it may become very time-intensive. Therefore I am still open to alternative ideas.


Hi there @Shaller,

Cross joining will certainly take time on larger data sets. You can try the streaming functionality to speed it up:

To convert every group into a row, you can use the GroupBy node with an appropriate aggregation method.
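As a sketch of the idea only (plain Python, not KNIME), collapsing each group into one row of sets turns the "does a group that likes apples also like oranges?" question into a subset check, with no cross join needed. The rows are copied from the example table in this thread; the `collapse_groups` and `groups_liking_all` helpers are hypothetical names.

```python
from collections import defaultdict

# (name, category, specific, group) rows from the example table above.
rows = [
    ("Peter", "Fruit", "apples", "group1"),
    ("Peter", "vegetables", "peas", "group1"),
    ("Sarah", "Fruit", "apples", "group1"),
    ("Sarah", "vegetables", "peas", "group1"),
    ("Tom", "fruit", "orange", "group2"),
    ("Tom", "vegetables", "carrots", "group3"),
    ("Peter", "vegetables", "carrots", "group3"),
]


def collapse_groups(rows):
    """Aggregate rows into one entry per group, holding sets of names and items."""
    groups = defaultdict(lambda: {"names": set(), "items": set()})
    for name, _category, specific, group in rows:
        groups[group]["names"].add(name)
        groups[group]["items"].add(specific.lower())
    return dict(groups)


def groups_liking_all(groups, *items):
    """Return the sorted names of groups whose item set contains every given item."""
    wanted = set(items)
    return sorted(g for g, data in groups.items() if wanted <= data["items"])


groups = collapse_groups(rows)
```

With this data, `groups_liking_all(groups, "apples", "peas")` returns `["group1"]`, while `groups_liking_all(groups, "apples", "orange")` returns an empty list because no single group contains both.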

Would you mind sharing a workflow example with the approach that works? I can check it.



I am currently using the cross join, as it was the easiest solution.
However, you are right, Ivan, that it will become slow with large tables (which I do have).
Therefore I will look into your suggestion.

Thank you for your help everyone.



Unfortunately, I cannot share my workflow and data, as they are confidential.

Hi @Shaller,

To check the concept, dummy data in a workflow example is good enough :wink:

