Let me preface this post by admitting that my Python knowledge is limited: I know that Python exists, and there are many things I would like to do with it in KNIME (especially now that most of the projects I am working on include a big geospatial component), but my expertise ends there.
I have a Python script that works in Jupyter Notebook to do a spatial join between two shapefiles. I think I have most of it working in the KNIME Python Script node, but I keep getting this error:
ERROR Python Script 3:2 Execute failed: No serializer extension having the id or processing python type “shapely.geometry.point.Point” could be found. Unsupported column type in column: “geometry”, column type: “<class ‘shapely.geometry.point.Point’>”.
When I execute the script in the node's configuration dialog I get no errors, I think because it does not need to create the output table there, but as soon as I run it as part of the workflow I get the error. Could it happen when the node converts the data frame into the output table? It works in Jupyter Notebook!
I would really appreciate any help getting this to work. It would save hours of going back and forth between KNIME and QGIS doing spatial joins.
@TigerCole I think KNIME does not have a matching data type for this. If you do not need the geo data *within* KNIME, would it be an option to just store the file (maybe save the path/name) and then reuse it later in a Python node?
I will try the suggestion to write the data frame out to a file from the node and then read it back into the workflow.
The geodata is WKT like “Point (12.1233,34.1234)”, which I often use in KNIME and which can be written out as text. So it seems strange that the data cannot be put into a Python data frame and used as the node output (if that is indeed the problem).
Maybe KNIME 4.6 which is due out tomorrow will handle this better.
It seems Knime minds think alike… that is exactly what I just did.
The problem was the column with dtype=geometry, so I converted it to a string and then could output the data frame.
Here is the updated Python script:
import pandas as pd # Need this for df
import geopandas as gpd # Need this for the spatial join
# Point shapefile filepath
traders_fp = "D:/Users/Russell/Projects/Trader Lists/2022-06/Test/Traders_2022-06_PyTest.shp"
# Read point shapefile and set CRS
traders = gpd.read_file(traders_fp)
traders = traders.set_crs('epsg:4326', allow_override=True)
# Areas shapefile filepath
areas_fp = "D:/Users/Russell/Projects/Areas_20220608/Areas_20220608.shp"
# Read areas shapefile and set CRS
areas = gpd.read_file(areas_fp)
areas = areas.set_crs('epsg:4326', allow_override=True)
# Spatial join
# how="inner", ...
join = gpd.sjoin(traders, areas, how="inner", predicate="intersects")
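# Convert the GeoDataFrame to a plain pandas DataFrame
dfjoin = pd.DataFrame(join)
# Convert the geometry column to text so KNIME can serialize it
dfjoin['geometry'] = dfjoin['geometry'].astype(str)
# Assign the DataFrame to the node's output table
output_table_1 = dfjoin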
As always, thank you for your comments and suggestions. They are really appreciated.
Thanks for trying the Python Scripting nodes with geospatial data! And thanks @mlauber71 for the good suggestions!
You are right, we currently do not support the shapely data type (or WKT in general) in KNIME. So right now the workaround is to convert this data to string, so that when we parse the Pandas DataFrame and convert it to a KNIME table we can use a type known to KNIME.
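As a rough illustration of that workaround, here is a minimal sketch that turns shapely geometries into WKT strings before the table leaves a Python node, and parses them back in a later Python node. The helper names `geometry_to_wkt` and `wkt_to_geometry` and the sample data are made up for this example; it assumes pandas and shapely are installed in the Python environment and that the geometry column is called `geometry`.

```python
import pandas as pd
from shapely import wkt
from shapely.geometry import Point


def geometry_to_wkt(df, column="geometry"):
    """Return a plain DataFrame copy with shapely geometries replaced by WKT strings."""
    out = pd.DataFrame(df).copy()
    out[column] = [geom.wkt if geom is not None else None for geom in out[column]]
    return out


def wkt_to_geometry(df, column="geometry"):
    """Return a copy with WKT strings parsed back into shapely geometries."""
    out = df.copy()
    out[column] = [wkt.loads(text) if text is not None else None for text in out[column]]
    return out


# Hypothetical sample data standing in for the result of the spatial join
sample = pd.DataFrame({
    "trader": ["A", "B"],
    "geometry": pd.Series([Point(12.5, 34.25), Point(13.0, 35.0)], dtype=object),
})

as_text = geometry_to_wkt(sample)     # string column, safe to hand to KNIME
restored = wkt_to_geometry(as_text)   # geometries again, e.g. in a later Python node
```

In a Python Script node, the string version (`as_text` here) would be what gets assigned to `output_table_1`; a downstream Python node could then call something like `wkt_to_geometry` on its input table if it needs real geometries again.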
That being said, just a brief teaser here: there is a geospatial extension in the making, so support will be coming to KNIME and Python in KNIME soon.