Clustering and anomaly detection with qualitative variables

Is it possible to do anomaly detection in KNIME? Let me explain my problem. I have only qualitative variables and I want to do anomaly detection (with the kNN algorithm) to find outliers in the dataset.
In RapidMiner it is possible to cluster qualitative variables using the X-Means operator with a nominal measure, and then use the kNN algorithm to find outliers in each group. How might I do this with KNIME?


Hello Lord_Enzo,

The short answer: Yes, definitely.
The long answer: It depends on how you want to do it.

Let’s consider the workflow you described:
For qualitative/categorical data, the important question is how you define distances.
I would suggest using the One to Many node combined with a Create Bit Vector node to get a bit vector for each row. Once your data is in this format, you can use the Distance Matrix Calculate node to get a distance matrix, which you can then use in the k-Medoids node. I guess that is the closest counterpart to X-Means in RapidMiner (note that k-Medoids needs the number of clusters up front, whereas X-Means estimates it).
Once you have your clusters, you can use the K Nearest Neighbor (Distance Function) node to find the outliers within each of them.
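To make the idea behind these nodes concrete, here is a rough plain-Python sketch of the same steps outside KNIME: one-hot encoding into bit sets, a Tanimoto distance matrix, and a simple alternating k-medoids. It is not the KNIME implementation; the function names, the toy rows, the choice of Tanimoto distance and the "first k rows" initialisation are all assumptions made for illustration.

```python
def one_hot(rows):
    """Represent each row as the set of its active (column index, value) bits."""
    return [frozenset(enumerate(row)) for row in rows]

def tanimoto_distance(a, b):
    """1 - |A and B| / |A or B| on two bit sets (assumed distance measure)."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union

def distance_matrix(bits):
    n = len(bits)
    return [[tanimoto_distance(bits[i], bits[j]) for j in range(n)] for i in range(n)]

def assign(dist, medoids):
    """Label every row with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]

def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (illustrative only)."""
    medoids = list(range(k))  # assumption: start from the first k rows
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster runs empty
                new_medoids.append(medoids[c])
                continue
            # the new medoid minimises the total distance to its cluster members
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(dist, medoids), medoids

# Made-up toy data: three qualitative columns per row.
rows = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "round"),
    ("blue", "medium", "square"),
]
dist = distance_matrix(one_hot(rows))
labels, medoids = k_medoids(dist, k=2)
print(labels, medoids)
```

On real data you would want a better initialisation (or several restarts), since the result of k-medoids depends on the starting medoids.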
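As a rough illustration of the outlier step, the following Python sketch scores each row by its mean distance to its k nearest neighbours inside the same cluster; a high score means the row is far from the rest of its group. This is only one common way to turn kNN into an outlier score and not necessarily what the KNIME node computes. The function name, the toy distance matrix and the cluster labels are made up for the example.

```python
def knn_outlier_scores(dist, labels, k=2):
    """Mean distance to the k nearest neighbours within the same cluster.

    Assumption: a row that is alone in its cluster gets an infinite score.
    """
    scores = []
    for i, label in enumerate(labels):
        # distances from row i to every other row of the same cluster, ascending
        same_cluster = sorted(
            dist[i][j] for j, other in enumerate(labels) if other == label and j != i
        )
        nearest = same_cluster[:k]
        scores.append(sum(nearest) / len(nearest) if nearest else float("inf"))
    return scores

# Made-up symmetric distance matrix for six rows in two clusters.
dist = [
    [0.0, 0.5, 0.5, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.5, 1.0, 1.0, 1.0],
    [0.5, 0.5, 0.0, 0.8, 1.0, 1.0],
    [1.0, 1.0, 0.8, 0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.5],
    [1.0, 1.0, 1.0, 1.0, 0.5, 0.0],
]
labels = [0, 0, 0, 0, 1, 1]

scores = knn_outlier_scores(dist, labels, k=2)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(scores, most_outlying)
```

You can then flag the rows whose score exceeds a threshold you choose, or simply inspect the top few rows per cluster.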

KNIME also offers other algorithms that could be useful for your use case (e.g. DBSCAN or MDS).

Please note that you will have to install the KNIME Distance Matrix extension to access all of these nodes (but don’t worry, it’s free).


