Geospatial Analytics using KNIME


I have 3 questions as part of a market penetration project:

1. Building Duplicate Removal
I have a collection of files containing the latitudes/longitudes of buildings. Unfortunately, many buildings are repeated (with slightly different lats and longs). I am looking for a way to:

  1. identify the duplicate buildings - assume buildings within 20m of each other are duplicates, and
  2. keep one and filter out the rest - based on the order of the files - some files are more important/accurate than others
    Any ideas of how to do this efficiently?

2. Market penetration by polygon
The second part deals with understanding market penetrations. Once the duplicates are removed (as above), I’d need to count a certain set of buildings - let’s say my retail stores, and divide it by the total retail stores within a spatial polygon (.shp file). I can then get the market share/penetration of my stores, out of all the stores in that area which sell the same thing. Are there nodes which do this type of analysis?

3. Choropleth mapping
Lastly, I’d like to map the above using a choropleth map. Does OpenStreetMap have this functionality, or could I export to Tableau to make it interactive?




For the first issue, I guess you can use this topic to find the distance between buildings. I think you could use a recursive loop: calculate the distances from one building to the others, exclude the ones closer than 20 m, and repeat the process until there are no duplicates left.
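
To make the idea concrete, here is a rough sketch in plain Python (not a KNIME recursive loop; it could be adapted for a Python scripting node). It assumes each building is a `(priority, lat, lon)` tuple, where a lower `priority` number means a more trusted file. The function names, the tuple layout and the equirectangular distance approximation are my own assumptions, chosen because the distances involved are only tens of metres.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumed constant)


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation of the distance in metres.

    Only intended for very short distances such as a 20 m threshold.
    """
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def remove_duplicates(buildings, threshold_m=20.0):
    """Keep one building per cluster, preferring lower priority numbers.

    buildings: iterable of (priority, lat, lon) tuples (hypothetical layout).
    A building is dropped if an already-kept building lies within threshold_m.
    """
    kept = []
    # Visit the most trusted files first so that their points win.
    for candidate in sorted(buildings, key=lambda b: b[0]):
        _, lat, lon = candidate
        if all(approx_distance_m(lat, lon, k[1], k[2]) > threshold_m for k in kept):
            kept.append(candidate)
    return kept


# Made-up sample: the first two points are a few metres apart.
sample = [
    (2, 52.52000, 13.40500),  # less trusted file
    (1, 52.52005, 13.40505),  # more trusted file, same building
    (1, 52.53000, 13.40500),  # different building, roughly 1 km away
]
unique_buildings = remove_duplicates(sample)
```

This compares every candidate against every kept building, so it is quadratic in the worst case; for a large number of buildings some form of spatial index or grid bucketing would probably be needed. Note also that the result depends on visiting order: if A is near B and B is near C, but A is not near C, both A and C are kept.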

I can help more if I have a sample of your data.



For 1, see this Stack Overflow question.

Basically, the most commonly used formula, the haversine formula, assumes the Earth is a perfect sphere, which it is not. For better accuracy, Vincenty’s formulae should be used. I guess it depends on how accurate you need to be (does it matter if a building is falsely included or excluded?). I don’t have personal experience with this, so I suggest googling which is better or waiting for more answers.
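
For reference, the haversine formula itself is short. A minimal Python sketch, assuming a mean Earth radius of 6,371,008.8 m (the function name `haversine_m` and the default radius parameter are only illustrative):

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371008.8):
    """Great-circle distance in metres between two lat/lon points in degrees.

    Treats the Earth as a sphere of radius radius_m (an assumed mean radius).
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Two made-up points a few metres apart.
d = haversine_m(52.52000, 13.40500, 52.52005, 13.40505)
is_duplicate = d <= 20.0
```

With a threshold as small as 20 m, the spherical simplification should matter very little, since the error is a small fraction of the distance being measured. It becomes more relevant over long distances.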

For 3:

I know you can do that with R or Python, for example using Plotly, and you can use Python/R in KNIME. However, if you already have Tableau Professional and it can do that, it is probably the easier way to go. KNIME does have nodes to send data to Tableau.



  1. I, too, would be very interested in that.
  2. Did that a couple of times using Bokeh: