Unable to convert strings to document using Strings to Document Node.

Hi all,

I am stuck at converting strings into documents so that I can preprocess my data further. I am parsing PDF files with the Tika Parser node, and after successfully parsing a file I tried to convert the ‘content’ column (String) into a document using the Strings to Document node.

I get the following serialization error:


WARN String to Document 4:5 Serialization error : Document could not be serialized !


I tried various suggestions from related topics, such as:

Unfortunately, none of the solutions seem to work for me.
I would greatly appreciate your help.

Thanks in advance! :slight_smile:

Edit: Here are some screenshots of the workflow and the error:

Any help is very much appreciated. @stelfrich

@mlauber71 I saw your responses to similar questions and thought you might have a solution for this.

The error message could point to an empty cell. One idea would be to filter out rows that have no data.

Still, it is difficult to tell from a screenshot. The best thing would be to upload an example workflow that reproduces this exact error so that someone can investigate (without revealing anything confidential, of course).
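The row-filtering idea can be sketched outside of KNIME as follows. This is only an illustration of the logic, not the KNIME implementation: the column names `Filepath` and `Content`, the sample rows, and the helper functions `has_content` and `drop_empty_rows` are all assumptions made for this example. Inside a workflow, a Row Filter node placed before the Strings to Document node would play the same role.

```python
from typing import Dict, List, Optional


def has_content(value: Optional[str]) -> bool:
    """Return True if the cell holds at least one non-whitespace character."""
    return value is not None and value.strip() != ""


def drop_empty_rows(rows: List[Dict[str, Optional[str]]],
                    column: str = "Content") -> List[Dict[str, Optional[str]]]:
    """Keep only rows whose `column` cell is present and not blank."""
    return [row for row in rows if has_content(row.get(column))]


# Hypothetical sample rows standing in for parsed PDF output.
rows = [
    {"Filepath": "a.pdf", "Content": "Annual report 2019 ..."},
    {"Filepath": "b.pdf", "Content": None},   # missing cell
    {"Filepath": "c.pdf", "Content": "   "},  # whitespace only
]

kept = drop_empty_rows(rows)
print([row["Filepath"] for row in kept])
```

Treating whitespace-only cells as empty is a design choice here; a scanned PDF without a text layer can produce exactly that kind of cell.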

Thank you very much for your reply.
I built a new workflow that reproduces the same error.
Here is the workflow: Example.knwf (11.2 KB)

This is the PDF file: https://www.talenom.fi/wp-content/uploads/2020/03/annual_report_2019.pdf

Thanks for your time and help.

OK, in this particular case the problem is that you are trying to extract a title from the Content column, and that fails. If you do not have a title, you can leave the title empty.


Yes, thanks for the solution. :slight_smile:
