Tika Parser - KNIME 4.5.0 problem

Hi,

After the update to KNIME 4.5.0, the Tika Parser stopped working.

Now when I try to read something, it says: “Node created empty data tables on all out-ports”

It doesn’t read any formats.

In the older version 4.4.0, everything works fine.

I even tried installing KNIME in a different folder and downloading the Textprocessing extensions again.

Does anyone know what’s going on?

Could you share some sample data that you feed to the parser, along with your workflow, so we can reproduce the error? I’m also using the Tika Parser in 4.5, but I’m not running into problems, so this may be a data- or workflow-specific problem.

@victor_palacious

After your message I tried a few things and noticed that when I select a directory on my computer (C:\) it works, but when I leave it as it was before, on the shared drive (\Win-), it doesn’t.

The funny thing is that the same setup in KNIME 4.4.0 works regardless of whether it is a local or a shared drive.

Does this information help you, or indicate what the problem might be?

Hi 89trunks,

Thanks for the details. Whether you are reading from a local or a shared drive may actually make a difference here.

Off the top of my head, one thing you may try for a temporary workaround is to enter two leading backslashes instead, i.e. \\Win.

In any case, we are always aiming to ensure backwards compatibility, so workflows from previous versions will work without any manual changes.

I am currently looking into this issue and I’ll follow up on this thread when I know more.

It would be helpful if you could provide us with the problematic workflow.

Best,
Ben

I’ll need more information from you in order to be able to reproduce this.

Here’s what I observed when trying to reproduce your issue:

  • For “Directory Settings > Selected Directory”, a UNC path (shared drive) with leading double backslash works, for example \\HOST\path\to\data.

  • In both 4.4 and 4.5, a single leading backslash, i.e. \HOST\path\to\data, does not work and produces an error message both in the dialog and upon node execution.

So, there is no apparent difference between 4.4 and 4.5 there. This means we need to further pinpoint what is causing your issue.
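To make the distinction between the two spellings above concrete, here is a minimal sketch in Python (not KNIME code). It assumes the standard library's Windows path parsing is a fair stand-in for how a UNC path is recognised; the helper name `is_unc_path` and the sample paths are illustrative only.

```python
from pathlib import PureWindowsPath


def is_unc_path(raw: str) -> bool:
    """Return True if raw parses as a UNC (network share) path."""
    # For a UNC path, pathlib reports the host and share as the "drive",
    # and that drive begins with two backslashes.
    drive = PureWindowsPath(raw).drive
    return drive.startswith("\\\\")


double = r"\\HOST\path\to\data"  # two leading backslashes: UNC share
single = r"\HOST\path\to\data"   # one leading backslash: not a UNC share
local = r"C:\path\to\data"       # local drive

results = {p: is_unc_path(p) for p in (double, single, local)}
for path, is_unc in results.items():
    print(f"{path!r}: UNC={is_unc}")
```

With a single leading backslash the path is parsed as rooted on the current drive rather than on a network host, which is consistent with it being rejected in both versions.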

Can you provide the problematic workflow, or a screenshot of the node’s configuration?

@BenjaminMoser

Providing the workflow will be difficult because of company policy.

Hi 89trunks,

thank you for providing the screenshots.

I was now able to reproduce your issue. It is connected to encoding of “special” characters in file paths. Concretely, the issue is caused by the space characters in the file paths.
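As a rough illustration of how a space in a path can get lost in translation, the following Python sketch percent-encodes a path and decodes it again. This is only an assumption about the general class of problem (an encoded path being used where a plain one is expected), not the actual KNIME code; the path is hypothetical.

```python
from urllib.parse import quote, unquote

# Hypothetical share path containing a space.
raw_path = "//HOST/share/my data/report.pdf"

# Percent-encoding turns the space into %20 (slashes are kept by default).
encoded = quote(raw_path)

# Decoding restores the original path.
decoded = unquote(encoded)

print(encoded)
print(decoded)

# If the encoded form is handed to the file system without decoding,
# it names a directory "my%20data" that does not exist.
print(encoded == raw_path)
```

A path without spaces is identical in both forms, which fits the observation that renaming the directory avoids the problem.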

As a temporary workaround, you may rename the source directory so that the file path does not contain spaces. If you cannot rename it, you could try creating a symbolic link whose path does not contain spaces; however, I have not investigated this yet and thus cannot guarantee it will work with UNC shares.

In any case, I will be looking into this and will let you know when I have further information.
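For reference, a symbolic link of this kind could be created along the following lines. This is a minimal Python sketch under stated assumptions: the folder and file names are made up, the helper `link_without_spaces` is hypothetical, and on Windows creating symlinks typically requires elevated privileges or Developer Mode. Whether the Tika Parser then accepts the link is untested.

```python
import os
import tempfile
from pathlib import Path


def link_without_spaces(source_dir: Path, link_path: Path) -> Path:
    """Create a directory symlink at link_path pointing to source_dir."""
    if " " in str(link_path):
        raise ValueError("link path must not contain spaces")
    os.symlink(source_dir, link_path, target_is_directory=True)
    return link_path


# Demo in a temporary directory (hypothetical folder and file names).
base = Path(tempfile.mkdtemp())
source = base / "my documents"  # original directory, name contains a space
source.mkdir()
(source / "report.txt").write_text("hello")

# The link lives next to the original but has no space in its name.
link = link_without_spaces(source, base / "my_documents")
linked_text = (link / "report.txt").read_text()
print(link, "->", os.readlink(link))
```

The node would then be pointed at the link instead of the original directory. Note that the demo raises a `ValueError` if the temporary directory itself contains a space.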

Best,
Ben

Hi,

a bugfix is currently awaiting review.

As a temporary workaround, you may try creating a symbolic link that does not contain spaces, as described in this post. However, I can’t guarantee that this will work in all cases.

Best,
Ben
