DBSCAN multi-threaded (?)

Hi All,

I’m trying to do outlier detection with DBSCAN, but I notice that it is very slow and not optimised for multi-threading. After some googling around, I’m not sure whether it is really easy to make it multi-threaded…

Do you have this on the roadmap somewhere, or is there an interesting workaround for this?

The thing I like about DBSCAN is that it more or less determines the number of clusters by itself. I’m not very fond of the elbow method for hyper-parameter optimisation, as I don’t have direct UI feedback and need to run things in batch…

Would really appreciate your input on this!



Hi @hermyknime,

Have you benchmarked DBSCAN (3.7) from Weka on your dataset? You shouldn’t get your hopes up, though, since the documentation states:

Basic implementation of DBSCAN clustering algorithm that should not be used as a reference for runtime benchmarks: more sophisticated implementations exist! Clustering of new instances is not supported

We currently don’t have an optimization of our DBSCAN implementation on the agenda…
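
As for a possible workaround, here is a minimal sketch in plain Python (not the KNIME or Weka implementation) of why DBSCAN can be parallelised at least in part: the neighbourhood query for each point is independent of all the others, so that step can be handed to a worker pool, while the cluster expansion stays sequential. The names `neighbors_of`, `dbscan`, `eps`, `min_pts` and `workers` are made up for this illustration. Note that in standard CPython the GIL keeps threads from speeding up a pure-Python distance loop, so this only shows the structure; a process pool or a JVM thread pool would be needed for a real gain. Precomputing all neighbour lists also costs quadratic time and possibly a lot of memory on large tables.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist

NOISE = -1  # label used for outliers


def neighbors_of(i, points, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    p = points[i]
    return [j for j, q in enumerate(points) if dist(p, q) <= eps]


def dbscan(points, eps, min_pts, workers=4):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    n = len(points)

    # Step 1: region queries are independent, so they can be farmed out.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        neighbors = list(pool.map(lambda i: neighbors_of(i, points, eps), range(n)))

    # Step 2: sequential cluster expansion over the precomputed neighbour lists.
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand further
                queue.extend(neighbors[j])
        cluster += 1
    return labels


# Toy data: two tight squares and one far-away point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
outliers = [p for p, label in zip(points, labels) if label == NOISE]
print(outliers)  # [(50, 50)]
```

The number of clusters falls out of `eps` and `min_pts` rather than being set up front, which is the property mentioned above, but those two parameters still have to be chosen.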


