Epsilon value for geo coordinates

JWebb · July 11, 2023, 1:32pm

I have a workflow that tries to identify groups of people living close to each other. I have their lat/long coordinates, and I use the Palladian lat/long-to-coordinate and geo distances nodes before feeding the result into DBSCAN. The trouble is, I do not know exactly how this is clustering in terms of distance! Epsilon is currently set at 2.0 because I plucked that number out of the air and it worked for what I was looking at at the time, but I would really like to be able to identify people who live within, say, 1 km of each other. Can anyone suggest how I can get it to cluster them by distance? Thanks!

qqilihq · July 11, 2023, 3:06pm

For this, I’d rather suggest using a different clustering mechanism that lets you define a real distance threshold instead of DBSCAN’s epsilon. I guess you could use the following nodes for that:

Caveat: this is probably not as memory-efficient as DBSCAN, which is, as far as I recall, quite well suited to huge datasets.

Anyways … Good luck!

JWebb · July 11, 2023, 3:28pm

Do you know how I can assign the 1 km distance with that? I tried a distance threshold of “1” (whatever that means!) and they were all in cluster 0. Changing that to 0.5 clustered together people who live much further apart than 1 km. So I am not really sure what this distance threshold means.

qqilihq · July 11, 2023, 3:42pm

Haven’t done it myself. The documentation says that the threshold is normalized to the maximum distance, so you can probably work backwards from that to determine the desired threshold.
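
To make the "work backwards" step concrete, here is a minimal Python sketch. It assumes that "normalized" means every distance is divided by the largest distance in the matrix, which I have not verified against the node's implementation. The function name and the 40 km figure are made up for illustration.

```python
def normalised_threshold(target_km: float, max_distance_km: float) -> float:
    """Express a real-world distance as a fraction of the largest distance.

    Assumption: the clustering node divides all distances by the maximum
    distance, so a threshold of 1.0 corresponds to that maximum.
    """
    if max_distance_km <= 0:
        raise ValueError("max_distance_km must be positive")
    # Clamp to 1.0: a target beyond the maximum would merge everything anyway.
    return min(target_km / max_distance_km, 1.0)


# Hypothetical dataset whose two most distant people live 40 km apart.
print(normalised_threshold(1.0, 40.0))   # fraction of the maximum that equals 1 km
print(normalised_threshold(20.0, 40.0))  # a threshold of 0.5 would mean 20 km here
```

Under that assumption, a threshold of 1 covers the full maximum distance, which would explain why everything ended up in cluster 0.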

JWebb · July 11, 2023, 4:27pm

Aha, normalisation. That explains it! That could complicate things as the maximum distances could well change. I will have to have a think about how best to do it. Thanks

qqilihq · July 12, 2023, 6:10am

I would assume that you can “automate” that using flow variables? I.e. determine the maximum distance for your current dataset, calculate the desired threshold from it, and feed that value as a variable input into the Hierarchical Cluster Assigner.
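
Outside KNIME, the calculation could look roughly like the Python sketch below. It is only an illustration under assumptions: it uses the haversine formula with a mean Earth radius of 6371 km, which may differ slightly from what the geo distances node computes, and the helper names and sample coordinates are invented. In the workflow itself the maximum would come from the actual distance matrix.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def threshold_for(points, target_km):
    """Normalised threshold for target_km, given the current dataset.

    Assumption: the threshold is expressed as a fraction of the largest
    pairwise distance, so it has to be recomputed whenever the data change.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    max_km = max(haversine_km(p, q) for p, q in combinations(points, 2))
    if max_km == 0:
        raise ValueError("all points are identical")
    return target_km / max_km


# Invented sample coordinates (lat, lon) in degrees.
people = [(51.50, -0.12), (51.51, -0.10), (51.75, -1.26)]
print(threshold_for(people, target_km=1.0))
```

Note that this compares all pairs, so it scales quadratically with the number of people, just like the distance matrix itself.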

Hope that helps!

-Philipp
