I have one column containing compound names (which I want to use as a reference for the second table), and a table with 3 columns: compound name, SMILES, IC50. I want to create a new table with the same 3 columns that contains only the molecules matching those in the first table. I tried "Joiner" and "Reference Row Filter", but both of them give an empty table:
WARN Joiner Node created an empty data table.
WARN Reference Row Filter Node created an empty data table.
The names are the same, because the one-column file was taken from the original file (with 3 columns) after filtering with a Tanimoto similarity matrix calculation script that I wrote!
In Excel, I tried searching the original file for one molecule from the filtered file, and it doesn't match! But if I search only for the number in the name, it finds matches, so what's the problem here?
Why is your canonical SMILES column an SDF string? I couldn't read this CSV file correctly, but Excel could still display the name column.
As aborg said, you have a whitespace problem. In your filtered_molecules.csv, every value in the column has a trailing whitespace character. The strings "CHEMBL1707797" and "CHEMBL1707797 " are not the same, because the second ends in a space, so you can't join on these columns. You will need to remove the whitespace from filtered_molecules.csv to get it to work.
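If the filtering script happens to be in Python, something like the following might help to confirm and fix the mismatch. It is only a rough sketch: the helper name filter_by_reference is made up for illustration, the second compound ID is hypothetical, and the SMILES and IC50 values are placeholders, not your data.

```python
def filter_by_reference(rows, reference_names):
    """Keep rows whose name (first field) is in reference_names,
    ignoring leading/trailing whitespace on both sides."""
    wanted = {name.strip() for name in reference_names}
    return [row for row in rows if row[0].strip() in wanted]


# Placeholder rows standing in for the 3-column table (name, SMILES, IC50).
# The SMILES and IC50 values are invented for this example.
original_rows = [
    ("CHEMBL1707797", "CCO", 1.0),
    ("CHEMBL0000001", "CCN", 2.0),  # hypothetical ID
]

# Name with a trailing space, as described for filtered_molecules.csv.
filtered_names = ["CHEMBL1707797 "]

# repr() shows the quotes, so the trailing space becomes visible.
print(repr(filtered_names[0]))  # 'CHEMBL1707797 '

# Exact comparison finds nothing, which mirrors the empty Joiner output.
exact = [row for row in original_rows if row[0] in set(filtered_names)]

# Comparing after strip() finds the row.
matched = filter_by_reference(original_rows, filtered_names)

print(len(exact), len(matched))  # 0 1
```

Stripping the names once, before they are written to the CSV, would probably be the cleaner fix, since the Joiner and Reference Row Filter would then see identical strings.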