Trade area - Zip Clustering

I pulled a quick zip list for California for this example/test which contains zip and lat/long. The goal is create ‘trade areas’ or clusters of contiguous zips within a 20 mile radius. I want to maximize zip densities within the 20 mile radius. (example, Zip A is 15 miles from Zip B, Zip B is 15 miles from Zip C, Zip A is 30 miles from Zip C. Therefore, Zip A and C shouldn’t be in same cluster/trade A since they > 20 miles. And Zip B could go into Zip A or Zip C. That decision should be based on which of those zips are in a higher density grouping, and if no other zips exist, then it goes to the closest one (if exactly the same), then it can randomly be assigned to either.

I did a cross join and used the distance calculation used for zips distance to get distances across every single zip code, used haversine, used DBScan (eps 20), but I can’t seem to get it right, clusters are exceeding 20 miles (example: notice the blue in northern california. All contiguous zips seem to be within 20 miles, but the non-contiguous exceed 20 and should be in different clusters/trade areas). Does that make sense? The below is a visual I pushed into Tableau. Using the orange as an example, that is assigned to one cluster, and it exceeds 20 miles, and there’s a cluster in between the bottom and middle (near SF in central valley and up the northern coast.

I’ve attached the excel file with the zips and lat/long. I’m struggling with figuring out how to create this clustering, gravity model, trade area (I’m not 100% sure the right naming convention).

Any insight would be greatly appreciated!


[Zip Cluster Test Extract_Extract.xlsx|attachment]
Zip Cluster Test Extract_Extract.xlsx (81.2 KB)

The ultimate goal will be to have a list of potential customers and identify the highest concetration of potential customers to open a location or add service areas, so I would want to focus on areas that have the highest opportunity for success (concentration of potential customers), and avoide areas that have the lowest conctration. This data set is just to test (only 1 per zip).