Collection column without Null values

Hi,

I am trying to use the "Create Collection Column" node to prepare input for the "Item Set Finder". Some of the cells are empty (null values), so the collection includes "?" values. Is there any way to get sets without the "?" values?

Best regards,

Max

Hi Max,

Use the pivot node to rotate all of your "Set" columns into one column, keeping all the other columns.

Then use the Row Filter on the new pivoted column to filter out the missing values.

Now group by all the other columns and aggregate the set column.

But you're completely right, the "Create Collection Column" node should get a "Skip missing cells" option.

Hope that helps,

Iris
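The net effect of the three steps above (rotate the columns into rows, drop the missing values, aggregate back per row) can be sketched in plain Java. This is only an illustration of the logic, not KNIME API code: the class `SkipMissingSketch` and the helper `toSetWithoutMissing` are made-up names, and `null` stands in for a missing cell ("?").

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SkipMissingSketch {

    // Hypothetical helper: collects the cells of one row into a set,
    // skipping null, which stands in here for a KNIME missing cell ("?").
    static Set<String> toSetWithoutMissing(List<String> row) {
        Set<String> items = new LinkedHashSet<>();
        for (String cell : row) {
            if (cell != null) {
                items.add(cell);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // Two example rows with three "Set" columns each (made-up data).
        List<List<String>> table = Arrays.asList(
            Arrays.asList("A", "B", null),
            Arrays.asList("C", null, null));
        for (List<String> row : table) {
            System.out.println(toSetWithoutMissing(row));
        }
    }
}
```

With the example rows, the first row yields a set of A and B and the second a set containing only C, so no "?" entries reach the item set search.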

Thank you, Iris

Hello Iris and Max.

I have the same problem. However, I fail to apply the suggested solution for use with the Item Set Finder. Maybe I have not understood it or my situation is different. I would appreciate any help to get rid of the missing values to get a better result with the Item Set Finder.

As far as I have understood, the Item Set Finder node requires a collection column. This type of column seems to _only_ be created by the Create Collection Column node. The GroupBy node creates sets, lists and concatenated strings, but none of those seem to be recognized by the Item Set Finder as a collection type. The Create Collection Column node, in turn, requires multiple columns as input. So for item sets of variable length the problem remains: there are either missing values or empty cells in the collection.

I have a column with a list of items in every row: a concatenated string, separated by commas. Note that the number of items in the list is variable. I then use the Cell Splitter node to split this column into many columns. The number of added columns equals the maximum number of items. Then I use the Create Collection Column node to create the collection column, which is recognized by the Item Set Finder but contains a lot of missing values or empty cells. For example, if the list is "A,B,C" and the maximum length for lists in the whole table is 6, then the resulting collection would be [A,B,C,?,?,?] when the Cell Splitter creates missing values, or [A,B,C,,,] when it creates empty cells. I would expect the collection to be [A,B,C].

I don't see how pivoting or unpivoting could help here, as the GroupBy node does not seem to produce any output that is recognized by the Item Set Finder, and the Create Collection Column node always needs multiple columns. Please let me know if you have any more hints for me or if I missed something. I would also appreciate it if the Create Collection Column node had an option to skip missing values.

Thanks for your help and best regards, Stephan

You could use a Java Snippet node with an array return type, which will then be represented as a KNIME collection cell. The expression could be:

String[] split = $string to split$.split(",");
return split;

Bernd
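If the strings contain spaces after the commas or empty entries (such as "A, B,,C,"), a slightly more defensive variant of the split may help. The following is a standalone sketch, not the exact Java Snippet body: the class `SplitItemsSketch` and the method `splitItems` are hypothetical names, and inside the Java Snippet node the argument would be the `$column$` reference instead. How the node treats an empty array is not covered here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitItemsSketch {

    // Hypothetical helper: splits a comma-separated string into items,
    // trimming whitespace and dropping empty entries.
    static String[] splitItems(String value) {
        if (value == null) {
            return new String[0]; // assumption: treat a missing input as "no items"
        }
        List<String> items = new ArrayList<>();
        for (String token : value.split(",")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitItems("A, B,,C,")));
    }
}
```

The array length follows the number of items in each row, so no padding with "?" or empty strings is needed.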

Hi Stephan,

A collection is the umbrella term for sets and lists.

The Create Collection Column node typically creates lists, or (if you check the box) sets.

Best, Iris