I am trying to find the optimum value of the silhouette coefficient with random starting centroids in k-means clustering. The idea is to explore the stability of the number of clusters with different centroids. As soon as I add the outer Counting Loop, the k-means node throws an error. Can this not be done?
Frank
I downloaded your workflow and tried it out. It seems there is a conflict between flow variables with identical names being introduced by the two loop start nodes: they both want to create a flow variable called Loop (0).
After playing around a bit, I was able to get it to work by adding a flow variable connector between the Normalizer and Parameter Optimization Loop Start nodes, like this:
This forces the second loop start node to create a flow variable called Loop (1), since the Loop (0) variable already exists on its branch. And then there’s no conflict for it to complain about. A bit hacky, but it works.
This is maybe more motivation for us to have a “Variable Filter” node or similar, to make it easier to resolve these kinds of conflicts in more complicated workflow arrangements.
Scott,
Thanks so much for your quick response. The analysis now runs perfectly.
The purpose of this workflow is to show the effects of different random starts when using k-means. The number of clusters with the best silhouette coefficient varied between 5 and 11 across the runs.
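For anyone who wants to try the same idea outside KNIME, here is a rough sketch in plain Python of the two nested loops: an outer loop over random starts and an inner loop over the number of clusters, keeping the k with the highest mean silhouette for each start. This is not the workflow itself. The function names (`make_blobs`, `kmeans`, `mean_silhouette`, `best_k_per_seed`), the synthetic data, and the range of k are all made up for illustration, and the k-means is a bare-bones Lloyd's algorithm, not the KNIME node.

```python
import math
import random


def make_blobs(rng, centers, n_per_center, spread=0.5):
    """Hypothetical test data: Gaussian blobs around the given 2-D centers."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_center)]


def kmeans(points, k, rng, iters=50):
    """Bare-bones Lloyd's k-means; the start centroids are k random data points."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treated as worst here
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_k_per_seed(points, k_values, seeds):
    """Outer loop over random starts, inner loop over k; best k for each start."""
    result = {}
    for seed in seeds:
        rng = random.Random(seed)
        scores = {k: mean_silhouette(points, kmeans(points, k, rng))
                  for k in k_values}
        result[seed] = max(scores, key=scores.get)
    return result


# Small demo on synthetic data (three blobs, five random starts).
data = make_blobs(random.Random(42), [(0, 0), (5, 5), (0, 5)], 20)
for seed, k in best_k_per_seed(data, range(2, 7), range(5)).items():
    print(f"seed {seed}: best k = {k}")
```

If the best k changes from seed to seed, that is the same instability the workflow is meant to show. On cleanly separated data it may well stay constant.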
Frank