R Source (Table) - failure when looping through accented filenames in Windows 10

Hello everyone,

I ran some tests using the nodes List Files/Folders, Path to String, Table Row To Variable Loop Start and R Source (Table) (KNIME version 4.3.4). The following snippet doesn’t work if the path contains any accented characters, whether the path is on a local drive or on a network drive (a mapped share from a Linux Samba server, to be more specific about my use case).

# The foreign library provides access to many 3rd party data formats.
# Just a few examples are listed below, many others exist. 
# More details cran.r-project.org/web/packages/foreign/foreign.pdf 
library(foreign)

# map filepath from a flow variable here.
path = knime.flow.in[["Path_String"]]

# Read
data = read.dbf(path)

knime.out <- data

The KNIME Console outputs “Execute failed: Error in R code: Error: unable to open DBF file”, which suggests a somewhat invalid path.

I say “somewhat invalid” because if I manually declare the path inside the node, it just works, even with the accents, so I’m assuming that my GNU R (version 4.1.0) setup works fine. Strangely enough, if I compare a manually declared path with an automatically assigned path from a workflow variable (using something like path == pathmanual), the output indicates that both are identical.

If both paths were truly identical (they appear to be, but maybe an encoding problem is happening in the background that I can’t see), the node should have run just fine. I’m puzzled, and any help would be much appreciated.
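One way to look for a hidden encoding difference is to inspect the declared encoding and the raw bytes of each string, since `==` only returns TRUE or FALSE and says nothing about how a string is marked. The sketch below uses hypothetical stand-in values (`path_manual`, `path_flow`, and the example path are not from the workflow); inside the R Source (Table) node, `path_flow` would instead come from `knime.flow.in`.

```r
# Hypothetical stand-ins for the two strings compared in the post.
# "\u00e7" is the escape for c-cedilla; R marks such a string as UTF-8.
path_manual <- "C:/dados/cami\u00e7a.dbf"

# Same bytes, but with the encoding mark removed, as might happen when a
# string arrives from outside R without a declared encoding (assumption).
path_flow <- path_manual
Encoding(path_flow) <- "unknown"

# Declared encoding of each string: "unknown", "UTF-8", "latin1" or "bytes".
enc_manual <- Encoding(path_manual)
enc_flow <- Encoding(path_flow)

# Raw bytes of each string. An accented Latin character takes two bytes
# in UTF-8 and one byte in latin1.
bytes_manual <- charToRaw(path_manual)
bytes_flow <- charToRaw(path_flow)

print(enc_manual)
print(enc_flow)
print(identical(bytes_manual, bytes_flow))
```

If the two strings report different encodings while holding the same bytes, that would point to a missing or wrong encoding mark rather than a wrong path.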

Hey @caiocco,

I haven’t written anything in R in quite some time (admittedly a Python fan - personal opinion), but when I run into frustrating path issues what I typically do is write a little script to list the contents of the directory I want to read from.

The issue often surfaces that way because either a) my script is reading from the wrong directory or b) the paths I need to provide are slightly different than what I expect, and the output from my test script will likely list off the files with the correct path format.
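In R, a directory-listing check of this kind could look roughly like the sketch below. The directory and file name are placeholders created for illustration; in the workflow, the directory would more likely be `dirname(path)`.

```r
# Hypothetical directory; a temporary folder is used so the sketch runs anywhere.
dir_to_check <- tempdir()

# Create a sample file so the listing is not empty (illustration only).
sample_file <- file.path(dir_to_check, "exemplo.dbf")
file.create(sample_file)

# List the .dbf files exactly as R sees them, with full paths.
found <- list.files(dir_to_check, pattern = "\\.dbf$",
                    full.names = TRUE, ignore.case = TRUE)
print(found)

# Check whether R can see a specific path before trying to read it.
print(file.exists(sample_file))
```

Comparing the printed paths with the value of the flow variable can show whether the two differ in separators, drive letters, or how accented characters are rendered.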

Cheers,

@sjporter


Hi, @sjporter!

I came up with the following workaround:

# The foreign library provides access to many 3rd party data formats.
# Just a few examples are listed below, many others exist. 
# More details cran.r-project.org/web/packages/foreign/foreign.pdf 
library(foreign)

# map filepath from a flow variable here.
path = knime.flow.in[["CaminhoArquivo"]]

# workaround hack (needed for Windows 10)
Encoding(path) <- "UTF-8"
path_latin1 <- iconv(path, "UTF-8", "latin1")

# read table
data = read.dbf(path_latin1)

knime.out <- data

So, as I suspected, Windows 10 doesn’t appear to accept UTF-8 encoded paths here. I converted them to Latin-1 (ISO 8859-1, to be compatible with the native Windows-1252 encoding). I think it’s an ugly hack.
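A possibly less hard-coded variant is to let R pick the target encoding with `enc2native()` instead of naming latin1 explicitly. This is only a sketch and has not been tested inside KNIME: `to_native_path` is a hypothetical helper, the example path is made up, and it assumes the incoming string really holds UTF-8 bytes, as the workaround above implies.

```r
# Hypothetical helper (not part of KNIME or the foreign package).
# Assumption: the incoming string holds UTF-8 bytes.
to_native_path <- function(path) {
  Encoding(path) <- "UTF-8"  # declare what the bytes actually are
  enc2native(path)           # convert to the session's native encoding
}

# Illustrative path; "\u00e3" is a-tilde.
path <- "C:/dados/s\u00e3o_paulo.dbf"
path_native <- to_native_path(path)

# Show the result's encoding mark and whether the session is UTF-8.
print(Encoding(path_native))
print(l10n_info()[["UTF-8"]])
```

On a session whose native encoding is Windows-1252 this should behave much like the explicit `iconv()` call, while on a UTF-8 session it leaves the string unchanged. Characters that the native encoding cannot represent are not preserved, so the explicit latin1 conversion remains the variant confirmed in this thread.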

