I used Fuzzy c-Means on a training dataset and selected “Induce Noise Cluster”. Nevertheless, the PMML model does not contain a description of the noise cluster. How can I use the noise cluster in the Cluster Assigner?
For the output of Fuzzy c-Means we use the Predictive Model Markup Language (PMML). Unfortunately, PMML does not seem to support noise clusters, as they cannot be represented by fixed coordinates. Would it help if I gave you a workflow that extracts the cluster centers from the produced model? You could then calculate the distances yourself using our various Distance nodes and assign clusters accordingly.
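For anyone who wants to do the extraction step outside of KNIME, here is a minimal Python sketch of the idea. It assumes the model is a center-based PMML `ClusteringModel` in which every `Cluster` element carries its coordinates in an `Array` child; the sample document, the cluster names, and the helper `extract_centers` are made up for illustration. A real export may store normalized coordinates, so please check the file you actually get.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML snippet standing in for a real export.
SAMPLE_PMML = """
<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <ClusteringModel modelName="demo" functionName="clustering"
                   modelClass="centerBased" numberOfClusters="2">
    <Cluster name="cluster_0"><Array n="2" type="real">1.0 2.0</Array></Cluster>
    <Cluster name="cluster_1"><Array n="2" type="real">5.0 6.0</Array></Cluster>
  </ClusteringModel>
</PMML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works for any PMML version."""
    return tag.rsplit("}", 1)[-1]


def extract_centers(pmml_text):
    """Return {cluster name: [coordinates]} for every Cluster element found."""
    root = ET.fromstring(pmml_text)
    centers = {}
    for elem in root.iter():
        if local_name(elem.tag) != "Cluster":
            continue
        for child in elem:
            if local_name(child.tag) == "Array" and child.text:
                centers[elem.get("name")] = [float(v) for v in child.text.split()]
    return centers


centers = extract_centers(SAMPLE_PMML)
for name, coords in centers.items():
    print(name, coords)
```

Note that no noise cluster appears in the result, which matches the point above: it has no fixed coordinates to store.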
For you, @Kalle_Knime, and anyone else looking for a way to extract clusters from PMML, I have uploaded a workflow to the KNIME Hub.
Thank you for your prompt reply. We are particularly interested in the noise cluster, as it is an indicator of outliers. The other clusters are not interesting to us. We would like to apply the learned outlier detection to a new test dataset. Is there no way to use the noise cluster membership in the Cluster Assigner node?
No, there isn’t, as it cannot be represented in PMML. However, as far as I understand, the data points assigned to the noise cluster are just those whose distance to every other cluster is above a certain threshold. So if you use the workflow I provided, you can find the outliers by checking the distance to the closest cluster (which is already calculated in the example). Unfortunately, I don’t know the specific criteria for assignment to the noise cluster, but maybe this already helps?
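To make the distance check concrete, here is a small Python sketch. It is only an approximation of the noise cluster under the assumption described above: a point counts as an outlier when its Euclidean distance to the nearest cluster center exceeds a cutoff. The centers, the test points, and the `threshold` value are all hypothetical. The cutoff is not the criterion the Fuzzy c-Means node uses internally, so it would have to be tuned, for example by comparing against the noise assignments on the training data.

```python
import math


def nearest_center(point, centers):
    """Return (name, distance) of the cluster center closest to the point."""
    best_name, best_dist = None, math.inf
    for name, center in centers.items():
        dist = math.dist(point, center)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


def flag_outliers(points, centers, threshold):
    """Label each point with its nearest cluster and an outlier flag.

    'threshold' is an assumed, user-chosen cutoff, not a value taken
    from the trained model.
    """
    results = []
    for point in points:
        name, dist = nearest_center(point, centers)
        results.append({"point": point, "nearest": name,
                        "distance": dist, "outlier": dist > threshold})
    return results


# Hypothetical cluster centers and new test data.
example_centers = {"cluster_0": [1.0, 2.0], "cluster_1": [5.0, 6.0]}
test_points = [(1.1, 2.0), (20.0, 20.0)]

flagged = flag_outliers(test_points, example_centers, threshold=3.0)
for row in flagged:
    print(row)
```

If the training data was normalized before clustering, the same normalization has to be applied to the test data before computing distances.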
This is a good hint. I will try it. Thank you for your help.