How to create Empty Nominal values?

Hello everyone, I am trying to use the GeneralizedSequentialPatterns (GSP) Weka node for Knime (https://nodepit.com/node/org.knime.ext.weka37.associator.WekaAssociatorNodeFactory%23GeneralizedSequentialPatterns%20(3.7))

According to the paper for this node, the input should be lists of variable length. In Knime these lists need to be split into columns for processing by the GSP node, which means some of the cells end up as missing values, and the node won’t accept those. How can I turn a missing value into an empty nominal?
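To make the problem concrete, here is a minimal sketch in plain Python (not KNIME or Weka code) of what happens when variable-length sequences are split into fixed columns. The example sequences, the `split_to_columns` helper and the use of `None` for a missing cell are all illustrative assumptions.

```python
from typing import List, Optional


def split_to_columns(sequences: List[List[str]]) -> List[List[Optional[str]]]:
    """Pad every sequence to the longest length, using None for missing cells."""
    width = max((len(seq) for seq in sequences), default=0)
    return [seq + [None] * (width - len(seq)) for seq in sequences]


# Hypothetical example data: three sequences of different lengths.
sequences = [["a", "b", "c"], ["a"], ["b", "c"]]
table = split_to_columns(sequences)

for row in table:
    print(row)
```

Every row shorter than the longest sequence ends up with trailing missing cells, which is the situation described above.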

I know Knime doesn’t have Nominal as a primitive type, but there is the Domain Calculator node to convert string to nominal… is there a way of making it handle empty values?

What about using the Rule Engine node with the MISSING function?

Not sure what you mean? Putting this into the rule engine does not work:
MISSING $column_name$ => EMPTY

Sorry, it could be MISSING $column_name$ => ""
or any other nominal value you need.

If you want to artificially create missing values you could see if this trick could help.

Discussion also here:

Thanks everyone. The problem is not creating missing values, the problem is converting the existing missing values into a format that the GeneralizedSequentialPatterns (GSP) Weka node will interpret as “empty nominal”.

I have not managed to find a way to achieve this yet, which is frustrating, as the Node is supposed to accept sequences of varying lengths, which means some columns will have missing values.

For now I have just turned the missing values into the string “-”, but that means they get processed by the Node rather than ignored.

It would help if you could provide us with an example containing the data and the Weka Node, and a hint at exactly which point the problem with the missing values occurs. Then it may be easier to see what can be done about it.

Thanks, here’s an example workflow. The first column of the table has to be an ID for the GSP algorithm. This column and the sequence columns need to be nominal, so I used the Domain Calculator node to convert them from strings to nominals. However, it seems that empty nominal values are not possible, so I have used “-” here instead. But this means they get processed by the GSP node instead of being ignored.

GSP.txt (114 Bytes)
GSPTest.knar.knwf (17.3 KB)

Hi there!

Instead of entering “-” in the Missing Value node, just leave the replacement value empty. You will then get an empty string, which should be OK for your node.
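As a rough illustration of the effect of that setting, the following plain-Python sketch replaces every missing cell with an empty string. It is not KNIME code; `fill_missing` and the use of `None` for a missing cell are hypothetical, and whether the GSP node then ignores the empty string still has to be checked in the workflow.

```python
from typing import List, Optional


def fill_missing(rows: List[List[Optional[str]]], replacement: str = "") -> List[List[str]]:
    """Return a copy of the table with every missing (None) cell replaced."""
    return [[replacement if cell is None else cell for cell in row] for row in rows]


# Hypothetical table with an ID column followed by sequence columns.
rows = [["1", "a", "b"], ["2", "a", None], ["3", None, None]]
filled = fill_missing(rows)

for row in filled:
    print(row)
```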

The Domain Calculator node scans the data and updates the list of possible values and/or the min and max values of the selected columns. It does not convert a string column to a nominal data type.
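Conceptually, that kind of domain scan looks something like the sketch below. This is plain Python rather than the node's actual implementation; `compute_domain` is a hypothetical helper, and skipping missing cells is an assumption of the sketch.

```python
from typing import Dict, List, Optional, Union

Cell = Optional[Union[str, int, float]]


def compute_domain(column: List[Cell]) -> Dict[str, object]:
    """Collect domain metadata for one column without changing its cells."""
    present = [cell for cell in column if cell is not None]
    if present and all(isinstance(cell, (int, float)) for cell in present):
        # Numeric column: record the value range.
        return {"min": min(present), "max": max(present)}
    # Otherwise: record the distinct values in first-seen order.
    return {"possible_values": list(dict.fromkeys(str(cell) for cell in present))}


print(compute_domain(["a", "b", "a", None]))
print(compute_domain([3, 1, 2]))
```

The cells themselves are untouched: only metadata about the column is produced, so a string column stays a string column.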

Br,
Ivan