Google vs Windows Path Conversion

So far I haven’t found a way to prevent the String to Path node from converting a URI string into a windows path (converting forward slashes to backslashes). The Excel Reader node can only accept a path variable so how exactly does a Google Drive URI get passed to it without being converted back into a windows path, which it won’t recognize as a flow variable? I’ve tested this using a java node, a python node, and a few other ways and so far there is some kind of underlying problem that is preventing path conversion from taking place. This seems like an issue that’s been around a while. Is there a solution to it because there’s no way to pass a dynamic path to it otherwise?

Hi Kevinnay, I had the same issue but it is solvable using the file/folder variables node.

In this way you will be able to download file from google drive or sharpoint online uri.
Unfortunately there is not so much learning material on that, so you need to dig a bit in how to use it but basically the node creates the path variable that you need.

Stefano

The problem is bridging the gap from a Google drive connection to a path type to excel reader. I’ve tried java and python to convert a string path to a path flow variable type, but nothing seems to work and it all comes back to the String to Path node that only converts to a windows path. I’ve looked at the URI conversion nodes, but they don’t seem to work either. The only other solution may be to not read directly from Google drive and localize the files, but that presents a different problem if the server disallows localizing files. It’s not well-conceived at this point. Even converting to gs:// protocol still can’t be assigned as a path to a flow variable.

I’m just surprised that Knime hasn’t addressed this given how many companies are embracing Google tech. I don’t see an obvious approach with the node you suggested, but perhaps I’m missing something. Since the Google drive connector is connected to the Excel Reader you would think it should pass the path directly to it, but it doesn’t. The variable settings cannot see an incoming flow variable from the connector node, which seems like something that would logically be available.

I did figure out how that node can accept an incoming path from a Google drive connector, but that value has to be changed manually, which doesn’t solve the problem of it being dynamic.

I tested the Google drive connector with backslashes and it seemed to accept it so then I went ahead and added an outbound flow variable for the workingDirectory, which is a string, and passed it to the String to Path (variable) node which creates a path variable. But when it’s passed to the Excel Reader it’s null. So I can’t make sense out of why Google drive connector is ok with using backslashes, but Excel Reader throws an exception even though it’s reading the folder as an inbound flow variable of type path. It’s a labyrinth of confused endpoints.

image

This is perhaps one of the more convoluted things I’ve had to unravel with Knime, but I’m going to post it in case someone else runs into the same scenario.

This workflow essentially generates a dynamic folder name and concatenates that to the root path generated by the Variable Creator node. They are merged and converted to a data frame so as columns they can be passed to the String Manipulation node and concatenated into a URI formatted string.

Where this gets interesting is the path conversion. Google uses a URI format (“/”) and the String to Path variable, by default, results in a windows format (“\”). So the way to get around the default behavior is to change the file system to Mountpoint, which doesn’t change the URI format and outputs as type path. String to Path doesn’t output a path flow variable either, so it’s not possible to simply connect it directly to the Excel Reader node.

That little gem would have saved me a lot of time if it was documented somewhere. A nice feature would be the option to choose between URI and Windows output format to make the node more user-friendly.

Finally, the path has to be passed to a Table Row to Variable node so it can be passed as a incoming flow variable to the Excel Reader node, which only accepts a flow variable of type path. Then the dynamic path works like a charm.

Python (PurePath), Java (ParsedPath, PathFactory), and numerous other nodes could not solve the path conversion problem between URI and Windows, but I’d love to hear other approaches if they exist.

1 Like