Set category in "Strings to Document" node

I'm using KNIME for my graduation thesis and I have to cluster/classify a number of documents. These documents have already been classified by a human operator, and their class (or category) is stored in one column, while other columns hold the title, content and author.

I would like to fill the document's "category" field from the classification column (just as I do for title, content and author). Is there a way to do this?

I went through some example workflows, but there the category is assigned as a fixed string in (two) different branches of the workflow. That solution is not applicable in my case because I have a huge number of classes, and they are likely to change over time.

I tried to figure out how I could use a Java Script module or a flow variable to achieve the result, but could not find a way.

Any suggestion is welcome.

The text processing nodes in KNIME 2.6 (due end of July) will have the desired functionality. You will be able to choose the category based on a column from the input table.

I see your point. Unfortunately, in the current version (2.5.x) the "Strings to Document" node cannot use a column as the category column. Fortunately this has been fixed and will be possible in the next version (2.6).

As a workaround for 2.5.x you can do the following:

GroupBy (over the category column) -> Chunk Loop Start (1 row per chunk, to loop over each category value) -> Reference Row Filter (to keep all rows of the original table with the current category value) -> Row to Variable (inject the current category as a flow variable) -> Strings to Document (convert the strings to documents and use the injected variable as the category value) -> Loop End
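
For anyone who finds the node chain hard to picture, here is a rough sketch of the same data flow in plain Python. It is not KNIME code and does not use any KNIME API: the sample rows, the column names and the `strings_to_document` helper are all made up for illustration, and each comment names the node that the step is meant to mimic.

```python
# Hypothetical input table: one row per text, with a human-assigned category.
rows = [
    {"title": "A", "content": "text a", "author": "X", "category": "sports"},
    {"title": "B", "content": "text b", "author": "Y", "category": "politics"},
    {"title": "C", "content": "text c", "author": "X", "category": "sports"},
]


def strings_to_document(row, category):
    """Stand-in for Strings to Document: the category is one fixed value per call."""
    return {
        "title": row["title"],
        "text": row["content"],
        "author": row["author"],
        "category": category,
    }


def build_documents(rows):
    # GroupBy: distinct category values, in order of first appearance.
    categories = list(dict.fromkeys(r["category"] for r in rows))

    documents = []
    # Chunk Loop Start: one category value per iteration.
    for current in categories:
        # Reference Row Filter: rows of the original table with this category.
        matching = [r for r in rows if r["category"] == current]
        # Row to Variable: `current` plays the role of the injected flow variable.
        documents.extend(strings_to_document(r, current) for r in matching)
    # Loop End: results of all iterations concatenated.
    return documents


documents = build_documents(rows)
```

The point of the loop is that the category only needs to be constant within one iteration, so a node that accepts just a single fixed category value can still handle any number of categories, including ones that appear later.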

Cheers, Kilian

Update: I just created a small example workflow showing how the described workaround works.

Thank you, Thor, for the good news about the functionality in the next version. I'm not sure I'll be able to use it for my project, though, because I hope to have finished it by then.

So double thanks, Kilian, for the workaround. I'll go through the example workflow and try to adapt it to my project.

Let me also thank you for your great work here. This is the first time I have posted a question, but your replies to other people have helped me several times since I began this project.

