Problems with StringsToDocument node

keyopen · April 11, 2018, 3:31pm

Hello,
I’m new to KNIME and I hope this is the right place to ask a question about text analysis.

I have an Excel file with 2 columns: a number column and a string column.
I need to convert the string column of the Excel file into documents so that I can run a sentiment analysis, using the other column as the class or category. The first nodes I use in my workflow are:

Read Excel → Number To String → StringsToDocument

In the StringsToNode I want to use only the full text and category options, because I don’t have a title, authors, etc.
When I run my workflow, the new “Document” column is full of “” (empty strings) and does not show the text of the string column. I use the “OpenNLP SimpleTokenizer”.

Where is my mistake? The string column is in Italian; could that be a problem?

Best regards

ipazin · April 12, 2018, 7:51am

Hi!

In the StringsToDocument node, under Configure → Options, there is a Title section (the first one) where you can choose Column, RowID or Empty string for the title. You have probably chosen Empty string, so an empty string is what you get. Try changing it and see what you get.

You wrote “StringsToNode”, but I guess you meant the StringsToDocument node.

BR,
Ivan

kilian.thiel · April 12, 2018, 8:04am

Hi keyopen,

It is also important to know that the document visualization in the node’s output view shows only the title. If the title is empty, only an empty string is shown. Don’t let that confuse you. The document text is still available even if the title is empty. You can see exactly what is stored in a document by using the Document Viewer node.
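To illustrate the point, here is a minimal Python sketch of the idea. It is not the KNIME API: the `Document` class, its fields and the sample sentence are hypothetical stand-ins, chosen only to show how a cell can display an empty title while the full text is still stored.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a document cell (not a KNIME class)."""
    title: str
    text: str
    category: str

    def __str__(self) -> str:
        # Mimic a table view that renders only the title in quotes.
        return f'"{self.title}"'


# Assumed example row: empty title, Italian full text, numeric class label.
doc = Document(title="", text="Il servizio era ottimo.", category="1")

cell_display = str(doc)  # what the table cell would display
full_text = doc.text     # what a dedicated viewer would reveal

print(cell_display)
print(full_text)
```

The displayed value depends only on the title, so choosing a column or the RowID as title changes what the table shows without changing the stored text.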

Cheers, Kilian

keyopen · April 13, 2018, 8:18am

Thank you very much for the explanations, they were very useful.

Thank you ipazin, it should indeed be StringsToDocument.

keyopen

system · June 2, 2023, 9:45pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.