How to better split collection columns to use for machine learning?

Hi, I tried the “Split Collection Column” node, but it looks like it only splits one column at a time and does not retain the original name. Is there a way to split a selection of collection columns, keep the original name of each and maybe append a number to the original names after splitting?

I am trying to figure out how I can use the many collection columns that are output by the CDK “Molecular Properties” node as inputs for machine learning.

Hi @tnad,

you could use a KNIME loop for that, i.e. the Column List Loop Start node. Within the loop, split the column and use a column rename node (probably with a flow variable) to change the names of the split columns. Finally, collect the loop iterations with a Loop End node.
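For readers who think better in code, the same idea (iterate over the selected columns, split each one, name the pieces after the original column) can be sketched in plain pandas. This is only an illustration and not the KNIME workflow itself: the function name `split_collection_columns`, the example columns `descA`/`descB`, and the `_1`, `_2`, ... suffix are all made up here, and the sketch assumes every cell in a collection column holds a Python list.

```python
import pandas as pd


def split_collection_columns(df, columns):
    """Expand each list-valued column into numbered scalar columns.

    Hypothetical helper: assumes every cell in `columns` is a list.
    """
    parts = []
    for col in columns:
        # One output column per list position; shorter lists are padded with missing values
        expanded = pd.DataFrame(df[col].tolist(), index=df.index)
        # Keep the original column name and append a 1-based position
        expanded.columns = [f"{col}_{i + 1}" for i in range(expanded.shape[1])]
        parts.append(expanded)
    # Drop the original collection columns and append the split ones
    return pd.concat([df.drop(columns=columns)] + parts, axis=1)


# Made-up example data standing in for two collection columns
df = pd.DataFrame({
    "id": ["mol1", "mol2"],
    "descA": [[1.0, 2.0], [3.0, 4.0]],
    "descB": [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]],
})
result = split_collection_columns(df, ["descA", "descB"])
print(list(result.columns))
```

Each original collection column ends up as a group of plain numeric columns that share its name, which is the shape most learners expect as input.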

Cheers,
David

Hi there @tnad,

welcome to KNIME Community!

Unfortunately, to my knowledge there is no node that can split multiple collection columns at once. And I agree, the names produced by Split Collection Column are not really related to the original column name :smiley:. But as usual there is a nice workaround using a loop, a flow variable and a regex, as suggested by @DaveK.
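To give a rough idea of what the regex part does, here is a small Python sketch of the renaming step. It assumes the split columns come out with names of the form `Split Value 1`, `Split Value 2`, and so on, and that the current column name is available as a string (the role the flow variable plays in the loop). The function name `rename_split_columns` and the underscore separator are made up for illustration.

```python
import re

# Assumed naming scheme of the split columns: "Split Value <n>"
SPLIT_NAME = re.compile(r"^Split Value (\d+)$")


def rename_split_columns(names, original):
    """Replace 'Split Value <n>' with '<original>_<n>'; leave other names alone."""
    return [
        # A function replacement avoids surprises if `original` contains backslashes
        SPLIT_NAME.sub(lambda m: f"{original}_{m.group(1)}", name)
        for name in names
    ]


renamed = rename_split_columns(["id", "Split Value 1", "Split Value 2"], "descA")
print(renamed)
```

Columns that do not match the pattern pass through unchanged, so the renaming can be applied to the whole table in each loop iteration.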

Here is an example on KNIME Hub that you can download and try out:

If you have any questions or comments, feel free to ask.

Br,
Ivan

Thank you so much. This is exactly what I’m looking for. I wasn’t sure this was possible in KNIME. It also helps me understand how much more I can do with loops.

Hi there @tnad,

glad it helped :wink:

Br,
Ivan
