Buggy distance threshold limit in the Hierarchical Cluster Assigner?

Hi @gcincilla

You did well to insist. You are right, and I now see why this didn’t make sense to you. I was biased by the mathematical theory, but the way this has been implemented in KNIME goes beyond the maths logic lol :sweat_smile:

Renormalizing is not enough, because the distance matrix itself cannot be renormalized. This makes the -Numeric Distance- node incompatible with the -Hierarchical Cluster Assigner (local)- node.

Since renormalization cannot be applied to the distance matrix before using it, I looked for other solutions. Fortunately, KNIME has a second node called -Hierarchical Cluster Assigner- (not the local one), which does not impose a maximum threshold when using the Cosine distance matrix (bounded within the [0…2] range by the -Numeric Distance- node). So this -Hierarchical Cluster Assigner- node should solve the problem.
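To illustrate where the [0…2] range comes from, here is a minimal Python sketch. It assumes the cosine distance is defined as 1 minus the cosine similarity, which is the usual definition; I have not checked the node's source code, and the function name `cosine_distance` is just mine for this example.

```python
import math

def cosine_distance(a, b):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

# Same direction: similarity +1, so the distance is (numerically close to) 0.
same = cosine_distance([1.0, 2.0], [2.0, 4.0])
# Orthogonal vectors: similarity 0, so the distance is 1.
orth = cosine_distance([1.0, 0.0], [0.0, 1.0])
# Opposite direction: similarity -1, so the distance reaches the upper bound 2.
opp = cosine_distance([1.0, 2.0], [-1.0, -2.0])

print(same, orth, opp)
```

Since the similarity lies in [-1, 1], the distance lies in [0, 2], so a node that caps the threshold below 2 cannot reach the top of the tree.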

I have modified your workflow to integrate it. Please have a look at it and let me know if this is fine with you :wink:

The resulting optimal threshold is 0.36, and this is what is displayed in the output window of the new -Hierarchical Cluster Assigner- node, as shown below:

From this image, you can see that the hierarchical tree spans distances from 0 to 2 and that the optimal threshold is calculated within this [0…2] range.
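As a rough illustration of what a distance threshold does to the tree, here is a small Python sketch that cuts a clustering at a given distance. It is not the KNIME implementation: it assumes single linkage (where cutting at a threshold is the same as joining every pair of rows whose distance is at or below it), and the 4x4 distance matrix is made up for the example. Only the 0.36 threshold comes from the workflow.

```python
def cut_single_linkage(dist, threshold):
    """Flat cluster labels from cutting a single-linkage tree at `threshold`.

    Two rows end up in the same cluster if they are connected by a chain of
    pairs whose distance is <= threshold (union-find over those pairs).
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    root_to_label = {}
    return [root_to_label.setdefault(find(i), len(root_to_label)) for i in range(n)]

# Hypothetical cosine distances in the [0, 2] range (not taken from the workflow).
dist = [
    [0.0, 0.1, 1.5, 1.9],
    [0.1, 0.0, 1.4, 1.8],
    [1.5, 1.4, 0.0, 0.3],
    [1.9, 1.8, 0.3, 0.0],
]

labels_036 = cut_single_linkage(dist, 0.36)  # two tight pairs stay separate
labels_145 = cut_single_linkage(dist, 1.45)  # a cut above 1 merges everything
print(labels_036, labels_145)
```

In this toy matrix, the final merge only happens at a distance above 1, which is the kind of cut a node with a capped threshold could never produce.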

hier_clust_min_distance_problem_normalized_fixed.knwf (330.5 KB)

Please let me know what you think about this solution. Hope it helps.

Best
Ael
