According to the paper for this Node, the input should be lists of variable length. In Knime these lists need to be split into columns for processing by the GSP node, which means some of the columns contain missing values, which the Node won’t accept. How can I turn a missing value into an empty nominal?
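To illustrate where the missing values come from, here is a minimal sketch in plain Python (not KNIME code). The function name `split_to_columns` and the sample sequences are made up for illustration, and `None` stands in for a KNIME missing cell.

```python
from typing import List, Optional


def split_to_columns(sequences: List[List[str]]) -> List[List[Optional[str]]]:
    """Pad every sequence to the length of the longest one, using None as filler."""
    width = max((len(seq) for seq in sequences), default=0)
    return [seq + [None] * (width - len(seq)) for seq in sequences]


# Three sequences of different lengths become three rows of equal width.
rows = split_to_columns([["a", "b", "c"], ["a"], ["b", "c"]])
for row in rows:
    print(row)
```

Every row shorter than the longest one ends up with trailing `None` cells, which is the situation described above.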
I know Knime doesn’t have Nominal as a primitive type, but there is the Domain Calculator node to convert string to nominal… is there a way of making it handle empty values?
Thanks everyone. The problem is not creating missing values, the problem is converting the existing missing values into a format that the GeneralizedSequentialPatterns (GSP) Weka node will interpret as “empty nominal”.
I have not managed to find a way to achieve this yet, which is frustrating, as the Node is supposed to accept sequences of varying lengths, which means some columns will have missing values.
For now I have just turned the missing values into the string “-”, but this means they get processed by the Node rather than ignored.
It would help if you could provide us with an example containing data and the Weka Node, plus a hint at exactly which point the problem with the missing values occurs. Then it may be easier to see what can be done about it.
Thanks, here’s an example workflow. The first column of the table has to be an ID for the GSP algorithm. This column and the sequence columns need to be nominal, so I used the Domain Calculator node to convert them from strings to nominals. However, it seems that empty nominal values are not possible (so I have used “-” here instead). But this means they get processed by the GSP node instead of being ignored.
Instead of entering “-” in the Missing Value node, just leave the replacement value empty. You will then get an empty string, which should be OK for your node.
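The difference between the two placeholders can be sketched in plain Python (again, not KNIME code). The helper `fill_missing` is hypothetical and `None` stands in for a missing cell. Whether the GSP node really skips empty strings is not something this sketch can confirm.

```python
from typing import List, Optional


def fill_missing(row: List[Optional[str]], placeholder: str = "") -> List[str]:
    """Replace None (standing in for a missing cell) with a fixed string."""
    return [placeholder if cell is None else cell for cell in row]


row = ["a", None, None]
empty_filled = fill_missing(row)       # empty-string placeholder
dash_filled = fill_missing(row, "-")   # the "-" placeholder used earlier in the thread
print(empty_filled)
print(dash_filled)
```

In both cases the cell is no longer missing. The only change is which string value ends up in the column.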
The Domain Calculator node scans the data and updates the list of possible values and/or the min and max values of the selected columns. It does not convert a string column to a nominal data type.
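As a rough analogy in plain Python (not the node's actual implementation), computing a column's domain just means collecting the distinct values it contains. The function name `possible_values`, the column names, and the sample table are assumptions for illustration.

```python
from typing import Dict, List, Set


def possible_values(table: List[Dict[str, str]], column: str) -> Set[str]:
    """Collect the distinct values found in one column of a table."""
    return {row[column] for row in table}


table = [
    {"id": "1", "item1": "a"},
    {"id": "2", "item1": "b"},
    {"id": "3", "item1": "a"},
]
domain = possible_values(table, "item1")
print(sorted(domain))
```

The values in the column are still plain strings afterwards. Only the metadata about which values occur has been computed.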