Extracting columns from different CSV files and appending them to an SDF file.

Dear Knime users,

I want to extract a few property/descriptor columns from about 10 different CSV files and append them to one SDF file that contains the corresponding 2D structures. The SDF and CSV files have the same number of rows. Please suggest useful nodes for building such a workflow.

CSV Reader / File Reader to read in the CSV columns that will become the SDF tags.

SD Reader to read in the SDF file

Joiner to combine the columns from the two tables (you will need a common ID between the CSV and SDF)

SD Writer to save your new SDF. 
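
For readers who want to see what the four nodes do together, here is a rough sketch in plain Python. It is not how KNIME implements the nodes, only an illustration of the join-by-ID idea. The function names are made up for this example, and it assumes the common ID is the title line (first line) of each SDF record, which may not be true for your files.

```python
import csv
import io


def split_records(sdf_text):
    """Yield each SDF record as a list of lines, without the "$$$$" terminator."""
    record = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            yield record
            record = []
        else:
            record.append(line)


def append_csv_tags(sdf_text, csv_text, id_column, columns):
    """Append the chosen CSV columns as data tags to the matching SDF records."""
    # Index the CSV rows by ID (the role the common ID plays in the Joiner).
    rows = {row[id_column]: row for row in csv.DictReader(io.StringIO(csv_text))}
    out = []
    for record in split_records(sdf_text):
        # Assumption: the molecule title (first line of the record) is the ID.
        row = rows.get(record[0].strip()) if record else None
        if row is not None:
            for col in columns:
                # An SDF data item is a "> <name>" header, the value, then a blank line.
                record += [f"> <{col}>", row[col], ""]
        out += record + ["$$$$"]
    return "\n".join(out) + "\n"
```

Records without a matching CSV row are written back unchanged, which is roughly what a left outer join in the Joiner would give you.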


Thanks swebb. The workflow works fine for one CSV file. How can we add a loop so that it also takes the other CSV files one by one and appends their required columns to the final SDF file?



If your CSV files are all in the same folder, you can use the List Files node and keep only the CSV files. You will then have a table with the filenames / URLs of the CSV files.

There are multiple ways to do it, but one would be to use the 'Table Row To Variable Loop Start' node and connect its output to the flow variable input of the CSV Reader. You would want to close the loop with the Loop End (Column Append) node.

You now have a table with all the columns you want to append to the SDF, which you can plug into your Joiner.
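
The loop described above can be pictured with a small Python sketch. It is only an analogy: listing the folder stands in for List Files, the `for` loop for the loop start/end pair, and opening `path` for the CSV Reader whose location is set by the flow variable. The function name and the idea of merging by an ID column are assumptions made for this example.

```python
import csv
from pathlib import Path


def collect_columns(folder, id_column, wanted):
    """Merge the wanted columns of every CSV file in a folder into one table keyed by ID."""
    table = {}
    # List the folder and keep only the CSV files.
    for path in sorted(Path(folder).glob("*.csv")):
        # The file location changes on every iteration; without this,
        # the same file would be read each time.
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                merged = table.setdefault(row[id_column], {})
                for col in wanted:
                    if col in row:
                        merged[col] = row[col]
    return table
```

If the reader inside the loop always opened one fixed file instead of `path`, the result would contain columns from that file only, however many iterations ran.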



Dear Sam,

I tried what you suggested, but the output contains the columns from only one CSV file. It did not append the columns from the other CSV files. It looks like I am missing a step or some setting needs to be changed.



Is it updating the file path from the variable on each iteration of the loop?

Sorry Sam, I don't know exactly. It does not show any error, but it appends columns from only one CSV file. Whichever file the CSV Reader node is pointed at is the only one it considers, so only the columns from that specific file are appended.



In the node you need to control the location using the flow variable (otherwise it won't know to change to a new file each iteration of the loop). 

I've attached an image with an example of the node setup and how to control the location with a variable (click the button next to Browse and a new popup opens).

Thanks, Sam, for clarifying things. The workflow is working fine now.



Swebb, I am trying to join the CSV and SDF files from PubChem's assays, but I am having difficulty finding the common IDs between them. Can you help me with this?

Do the two data files have a common ID by which they could be joined? If not, maybe the Column Appender node is sufficient for your use case?
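
To make the difference concrete: a join matches rows by a shared ID, while appending columns by position just pairs row 1 with row 1, row 2 with row 2, and so on. A minimal Python sketch of the positional case follows. The function name is invented for illustration, and it only gives correct results if both files list the compounds in the same order.

```python
def append_by_position(left_rows, right_rows):
    """Combine two tables row by row, the way a positional column append would."""
    if len(left_rows) != len(right_rows):
        raise ValueError("tables must have the same number of rows")
    # Row i of the result holds the columns of row i from both tables.
    return [{**left, **right} for left, right in zip(left_rows, right_rows)]
```

Because nothing checks that the rows really belong together, it is worth spot-checking a few compounds after appending this way.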