Separating data into multiple files

Angus · November 5, 2012, 8:43pm

Hi all,

I have downloaded some data from ChEMBL for a particular biological target and would now like to group the data by publication and create a new file (CSV or PDF) for each publication, giving the structures, biological data, etc. I'm guessing I need to create some kind of loop, but I'm not really sure where to start.

Many thanks,

A

gabriel · November 6, 2012, 8:44am

Indeed, you will have to create a loop starting with a Group Loop Start node where you select the publication column. This node will return one data chunk (based on the publication) at a time, which you can write out or process with any other KNIME node. A loop always needs to be ended by a Loop End or any other loop end node. In this case, I would suggest simply connecting the Group Loop Start with the Loop End and having the Writer in a separate branch.
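
For anyone who finds it easier to think in code, the chunking idea can be sketched in plain Python. This is only an analogue of what a group loop does, not KNIME code, and the column names (`doc_id`, `smiles`, `ic50_nM`) and sample rows are made up for illustration.

```python
from collections import defaultdict


def group_rows(rows, key):
    """Collect rows into chunks that share the same value in column `key`."""
    chunks = defaultdict(list)
    for row in rows:
        chunks[row[key]].append(row)
    return dict(chunks)


# Hypothetical sample rows; the column names are assumptions, not ChEMBL's schema.
rows = [
    {"doc_id": "1001", "smiles": "CCO", "ic50_nM": "120"},
    {"doc_id": "1002", "smiles": "c1ccccc1", "ic50_nM": "45"},
    {"doc_id": "1001", "smiles": "CCN", "ic50_nM": "300"},
]

chunks = group_rows(rows, "doc_id")

# Each pass of this loop plays the role of one loop iteration:
# it sees only the rows belonging to a single publication.
for doc_id, chunk in chunks.items():
    print(doc_id, len(chunk))
```

Whatever sits between the loop start and the loop end runs once per publication, on that publication's rows only.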

Angus · November 6, 2012, 1:27pm

Hi Gabriel,

Many thanks for your quick reply. How would I go about naming each file with a unique name, e.g. the doc_id, so that I end up with a separate file for each chunk?

Sorry if these are very simple questions!

A

Aaron_Hart · November 6, 2012, 1:57pm

Hi Angus,

You will need to drive the file writer using a flow variable. The most likely progression would be a Java Edit Variable node attached to the Group Loop Start, followed by a CSV Writer and a Loop End node. Attached is an example of what this might look like.
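
In case the attachment is not available, here is a rough plain-Python sketch of the same idea: the output file name is built from the current group's identifier, so each chunk lands in its own file. It is not the attached workflow and not KNIME code. The helper `write_chunks`, the column names and the sample data are all hypothetical, and it assumes the identifier is safe to use as a file name.

```python
import csv
import tempfile
from pathlib import Path


def write_chunks(chunks, out_dir):
    """Write each chunk to <out_dir>/<doc_id>.csv and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for doc_id, rows in chunks.items():
        # The file name is derived from the group key, much like a
        # flow variable feeding the writer's output path.
        path = out_dir / f"{doc_id}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths


# Hypothetical chunks, already grouped by publication.
chunks = {
    "1001": [
        {"doc_id": "1001", "smiles": "CCO", "ic50_nM": "120"},
        {"doc_id": "1001", "smiles": "CCN", "ic50_nM": "300"},
    ],
    "1002": [
        {"doc_id": "1002", "smiles": "c1ccccc1", "ic50_nM": "45"},
    ],
}

# A temporary directory keeps the example from cluttering the working folder.
paths = write_chunks(chunks, tempfile.mkdtemp())
for path in paths:
    print(path.name)
```

The key point is that the name is computed inside the loop from the current group's value, so no two iterations overwrite each other's output.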

Regards,

Aaron

Angus · November 6, 2012, 5:12pm

Thanks Aaron,

Worked perfectly.

A