KNIME Collection Columns as features for a machine learning model?

I am wondering whether any machine learning model in KNIME can take one or more Collection Columns as input. The collections might each contain a different set of features, so simply splitting them back into columns is not a good option, though I am open to suggestions for an automatic conversion that would make sense to a model :slight_smile:

The model should still consider the right data type, right? So you can’t simply convert them to strings. Is the idea something similar to the VectorAssembler in PySpark? Is there a particular reason you want collections as input?

@Daniel_Weikert the scenario could be a series of events occurring (or not occurring) in a specific time frame (which is the column). Another option would be to use time series analysis, but I wanted to explore this specific setting. There are models that use Collection columns to find rules, but they accept only one such column.

But thank you for the hint about the VectorAssembler. I might explore that further; it might also have the benefit of running on a big data cluster.
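
To make the "automatic conversion" idea concrete, here is a minimal sketch of one possible approach: multi-hot encoding, where every distinct item seen in a collection column becomes its own 0/1 feature. It is plain Python (it could be run in a Python Script node, for example) and is not a built-in KNIME learner. The function name `multi_hot_encode`, the column names `week_1`/`week_2`, and the event names are all hypothetical and only serve as illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple


def multi_hot_encode(
    rows: Sequence[Dict[str, Set[str]]],
) -> Tuple[List[str], List[List[int]]]:
    """Turn rows of {column name: set of items} into a 0/1 feature matrix.

    Each (column, item) pair becomes one feature, so collections with
    different contents per row still map onto a fixed set of columns.
    """
    # All collection column names that appear in any row, in a stable order.
    columns = sorted({name for row in rows for name in row})

    # Per column: every distinct item observed across all rows.
    vocab = {
        col: sorted({item for row in rows for item in row.get(col, set())})
        for col in columns
    }

    feature_names = [f"{col}={item}" for col in columns for item in vocab[col]]

    # 1 if the item occurs in that row's collection, otherwise 0.
    matrix = [
        [
            1 if item in row.get(col, set()) else 0
            for col in columns
            for item in vocab[col]
        ]
        for row in rows
    ]
    return feature_names, matrix


# Hypothetical data: events observed per time frame for three rows.
rows = [
    {"week_1": {"login", "purchase"}, "week_2": {"login"}},
    {"week_1": {"login"}, "week_2": {"complaint", "login"}},
    {"week_1": set(), "week_2": {"purchase"}},
]

feature_names, matrix = multi_hot_encode(rows)
for name in feature_names:
    print(name)
for vector in matrix:
    print(vector)
```

The resulting matrix has one fixed-width numeric row per input row, which is the shape most learners expect. The trade-off is that the number of features grows with the number of distinct items, and items that only appear at prediction time would need separate handling.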

I would be interested in your progress in this regard. Sounds interesting.

Hi @mlauber71,

I’m not aware of a supervised learning algorithm / learner node that can handle collection cells as inputs. I’m trying to think of a use case where this would be helpful and how I would expect the algorithm to handle this situation.

Can you please tell me a little bit more about the use case you have in mind?
Is it a supervised classification problem you are trying to solve?
Do all cells in a collection column have the same number of features, e.g. Collection-1 always has 3 values, Collection-2 always has 3 values, and Collection-3 always has 5 values?
