Joining two tables based on partial similarity of names or comma-delimited names

Hello all,

I usually use the Joiner node to join two tables on shared columns. My problem is with tables whose shared columns do not match exactly (differences in capitalization, or extra characters such as commas, spaces, and so on). How can I handle this kind of table?

Example:

table1:

Genes ID
gene1 5462
gene1 4367

table2:

Genes            GO
Gene1, Enzyme2   go1

Desired table:

Genes ID GO
gene1 5462 go1

 

Any answer will be appreciated.

 

Hi Ahmadiut,

You can use the String Manipulation node and the Cell Splitter node before the Joiner node to transform the strings so that they match. Please find attached an example workflow.
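For readers who want to see the logic outside KNIME, here is a rough plain-Python sketch of the same idea: normalize the names (what String Manipulation would do), split the comma-delimited cell (what Cell Splitter would do), then join on the cleaned key. This is not the attached workflow; the function names `normalize` and `join_on_gene` are made up for illustration, and the rows are the sample data from the question.

```python
# Plain-Python sketch of the normalize -> split -> join idea (not KNIME code).
# normalize() and join_on_gene() are hypothetical helper names.

def normalize(name):
    """Trim whitespace and lowercase, so ' Gene1' and 'gene1' compare equal."""
    return name.strip().lower()


def join_on_gene(table1, table2):
    """Inner-join (gene, id) rows with (genes, go) rows.

    The 'genes' cell in table2 may hold several comma-delimited names.
    """
    # Map each normalized single name to the GO terms of its row.
    go_by_gene = {}
    for genes_cell, go in table2:
        for part in genes_cell.split(","):
            key = normalize(part)
            if key:  # skip empty pieces such as a trailing comma
                go_by_gene.setdefault(key, []).append(go)

    # Look up every table1 row by its normalized name.
    joined = []
    for gene, gene_id in table1:
        for go in go_by_gene.get(normalize(gene), []):
            joined.append((gene, gene_id, go))
    return joined


# Sample rows from the question.
table1 = [("gene1", 5462), ("gene1", 4367)]
table2 = [("Gene1, Enzyme2", "go1")]

for row in join_on_gene(table1, table2):
    print(row)
```

Note that with the sample rows as posted, both gene1 rows pick up go1, because both have the same name after normalization.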

Hope that helps,

Cheers,

Vincenzo

Ahmadiut,

You can try the String Distance, Similarity Search, and Joiner nodes if you are joining on only one field.
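As a rough illustration of similarity-based matching on a single field, here is a small Python sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, which is not necessarily the measure the KNIME nodes use. The helper names, the 0.8 threshold, and the sample rows are assumptions made up for this example.

```python
from difflib import SequenceMatcher

# Hypothetical helpers: similarity(), best_match(), fuzzy_join().
# SequenceMatcher is a stand-in, not the KNIME distance measure.


def similarity(a, b):
    """Return a score in [0, 1]; case and surrounding spaces are ignored."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(name, candidates, threshold=0.8):
    """Return the most similar candidate, or None if none reaches threshold."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(name, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def fuzzy_join(table1, table2, threshold=0.8):
    """Join (name, value) rows to (name, other) rows on the closest name."""
    lookup = dict(table2)  # assumes names in table2 are unique
    joined = []
    for name, value in table1:
        match = best_match(name, lookup.keys(), threshold)
        if match is not None:
            joined.append((name, value, lookup[match]))
    return joined


# Made-up rows for illustration only.
left = [("gene1", 5462)]
right = [("Gene-1", "go1"), ("Enzyme2", "go2")]
print(fuzzy_join(left, right))
```

The threshold controls how tolerant the match is: a lower value accepts more distant names but raises the risk of false matches, so it is worth checking the matched pairs by eye before trusting the joined table.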

See also the KNIME example on Unduplicate names.