GIS Clustering and Shapefile Creation

For this example I am using just three of many datasets. The latitude/longitude points in the three files represent three different price points. I want to combine the three datasets and cluster them geographically, within some physical distance tolerance (DBSCAN? K Nearest Neighbor? Whatever you suggest). I have attached the latitude and longitude points and a base ID for each of the three groups here.

I would love for the GIS distance algorithm to automatically create and present a shapefile for each cluster and assign that shapefile ID to each property, both as data and as a map image.

From there I get to do all kinds of cool analyses.

As always, any assistance, suggestions, or workflow you can offer is more appreciated than you know.

group1.xlsx (8.6 KB)
group2.xlsx (10.9 KB)
group3.xlsx (11.6 KB)
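
One way to read "cluster within some physical distance tolerance" is DBSCAN run on great-circle (haversine) distances. The sketch below is a minimal brute-force version in plain Python, not the KNIME DBSCAN node. The base IDs, coordinates, `eps_m` and `min_pts` are made up for illustration and are not taken from the attached files.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_pts):
    """Brute-force O(n^2) DBSCAN. Returns one label per point; -1 means noise."""
    n = len(points)
    # Each neighbourhood includes the point itself (distance 0).
    neighbours = [[j for j in range(n)
                   if haversine_m(points[i], points[j]) <= eps_m]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Hypothetical rows of (base_id, lat, lon), one list per price group.
group1 = [("A1", 40.0000, -105.0000), ("A2", 40.0005, -105.0004)]
group2 = [("B1", 40.0003, -105.0007), ("B2", 40.1000, -105.2000)]
group3 = [("C1", 40.1004, -105.2003), ("C2", 40.1002, -105.1996),
          ("C3", 41.0000, -106.0000)]

rows = group1 + group2 + group3  # combine the three datasets
labels = dbscan([(lat, lon) for _, lat, lon in rows], eps_m=250, min_pts=3)
assignments = {base_id: label for (base_id, _, _), label in zip(rows, labels)}
```

Here `assignments` maps each base ID to a cluster ID, so points from different price groups can land in the same geographic cluster. The pairwise distance pass is quadratic, so for large datasets a spatially indexed implementation would be the better choice.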



Hi @smithcreed -

Sorry for the delayed reply here. I’m a little confused about what kind of clustering you expect to do if all you have is lat/lon and a group ID, since these points all overlap in the same region.

If you cluster just based on what fields you have, I don’t think a clustering algorithm is going to be able to tell you much. Do you have other fields to include in the analysis?

(Maybe this would work OK if you have lots of little clusters?)

Any additional detail about what you want to do here would be helpful. Have you already tried concatenating the datasets and doing a simple k-means, just to see what comes out?

Hi Scott, I have a grotesque amount of data for each property associated with each lat-long point. I have used, and will keep using, several data clustering algorithms, including K-Means and likely Fuzzy C-Means. But for this part of the project I am focusing on actual physical location clustering and analyses.

I have already used K-Means to cluster the properties (actual houses) by criteria such as price, size, etc. I created sub-groups from these clusters (similar properties based on monitored criteria) and have now clustered each sub-group geographically using DBSCAN.

For the DBSCAN step, I tweaked the settings until the clusters looked appropriate/optimized geographically.
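
As a possible complement to tuning by eye, a common heuristic for the DBSCAN distance setting is the sorted k-distance list: compute each point's distance to its k-th nearest neighbour, sort the values, and look for the jump where dense points give way to outliers. The sketch below assumes (lat, lon) input in degrees; the sample coordinates and `k=2` are hypothetical and not taken from the project data.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def k_distances(points, k):
    """Sorted list of each point's distance (metres) to its k-th nearest other point."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(haversine_m(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


# Hypothetical points: two tight groups plus one isolated outlier.
points = [
    (40.0000, -105.0000), (40.0005, -105.0004), (40.0003, -105.0007),
    (40.1000, -105.2000), (40.1004, -105.2003), (40.1002, -105.1996),
    (41.0000, -106.0000),
]

kd = k_distances(points, k=2)
# Largest jump between consecutive sorted values marks the candidate "elbow".
jump_at = max(range(1, len(kd)), key=lambda i: kd[i] - kd[i - 1])
suggested_eps_m = kd[jump_at - 1]  # largest k-distance before the jump
```

With these sample points the jump separates the six clustered points from the outlier, and `suggested_eps_m` is only a starting value to refine visually, as described above. If the minimum-points setting counts the point itself, `k = min_pts - 1` is a natural choice.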

Using the lat-long points in each DBSCAN-derived geographic sub-group cluster, I would love to understand how to create a fully enclosed shape for each cluster, or at least its centroid. I don’t know if KNIME can do this or if I need outside GIS software.

Yes, I could now hand-draw the shapes inferred through DBSCAN using Google tools, import the WKT, and derive the centroid, but hand-drawing a thousand shapefiles sounds kind of sucky. I was hoping for an automated/magic solution. :laughing:
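
One way to automate the hand-drawing step is to take the convex hull of each cluster's points and emit it as WKT, together with the hull's centroid. The sketch below is plain Python under stated assumptions: input rows are hypothetical `(cluster_id, lat, lon)` tuples, `-1` marks noise, and lon/lat are treated as planar coordinates, which is only reasonable for small clusters. A convex hull is also just one choice of "enclosing shape"; it will overstate the footprint of L-shaped or curved clusters.

```python
from collections import defaultdict


def convex_hull(points):
    """Andrew's monotone chain on (x, y) tuples; returns the hull counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as an open ring."""
    ox, oy = ring[0]  # shift to the first vertex to keep the arithmetic well conditioned
    shifted = [(x - ox, y - oy) for x, y in ring]
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(shifted, shifted[1:] + shifted[:1]):
        w = x0 * y1 - x1 * y0  # shoelace term; area2 sums to twice the area
        area2 += w
        cx += (x0 + x1) * w
        cy += (y0 + y1) * w
    return ox + cx / (3 * area2), oy + cy / (3 * area2)


def cluster_shapes(rows):
    """rows: (cluster_id, lat, lon). Returns {cluster_id: {"polygon", "centroid"}} as WKT."""
    by_cluster = defaultdict(list)
    for cluster_id, lat, lon in rows:
        if cluster_id != -1:  # skip noise points
            by_cluster[cluster_id].append((lon, lat))  # WKT order is x y, i.e. lon lat
    shapes = {}
    for cluster_id, pts in by_cluster.items():
        hull = convex_hull(pts)
        if len(hull) >= 3:
            ring = hull + hull[:1]  # WKT rings repeat the first vertex at the end
            polygon = "POLYGON ((" + ", ".join(f"{x:.6f} {y:.6f}" for x, y in ring) + "))"
            cx, cy = polygon_centroid(hull)
        else:
            polygon = None  # fewer than three non-collinear points: no area
            cx = sum(x for x, _ in pts) / len(pts)
            cy = sum(y for _, y in pts) / len(pts)
        shapes[cluster_id] = {"polygon": polygon, "centroid": f"POINT ({cx:.6f} {cy:.6f})"}
    return shapes


# Hypothetical clustered rows: (cluster_id, lat, lon)
rows = [
    (0, 40.0000, -105.0000), (0, 40.0000, -104.9980), (0, 40.0020, -104.9980),
    (0, 40.0020, -105.0000), (0, 40.0010, -104.9990),
    (1, 40.1000, -105.2000), (1, 40.1002, -105.2004),
    (-1, 41.0000, -106.0000),
]
shapes = cluster_shapes(rows)
```

The resulting WKT strings could then be imported the same way the hand-drawn ones would be, one polygon and one centroid per cluster ID.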

Thanks


Hi @smithcreed -

This one fell off my radar completely, sorry! Maybe this forum thread would be useful to you - it contains a nice workflow from @LukasS as well that makes use of the KNIME Spatial Processing Nodes extension.

