STTS Tagger - Multiple Tags per Term - Normalize


I want to analyze German texts and use the Stanford Tagger.

If I use the "BoW", I get the following tagging for one document, e.g. for the term "Visual":

I would like to find out which tag is most likely the correct one (e.g. the one assigned most often).

What I tried:

"Group by" term and tag, then take the highest occurrence count.

I could not find out how often the term is tagged with one particular tag within the document. The problem was that "TF" gives me the total absolute frequency of the term, not the frequency of the term tagged with one specific tag.

Additionally, for occurrences with multiple tags, I could only get the last assigned one with "Tags to String".

I hope you have some ideas you can share.



Hi Bernd,

Unfortunately, it is not possible to count how often a tag was assigned to a term in a document if more than one distinct tag has been assigned to the corresponding words. The TF node works only on the words and does not take the assigned tags into account, so grouping does not work. Sorry, I don't see any solution for that, except to implement a new node.
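
For illustration only, the counting that such a node would have to do can be sketched outside KNIME in plain Python. This is not an existing KNIME feature, just a minimal sketch. It assumes the tagger output is already available as a list of (term, tag) pairs for one document; the function name `most_frequent_tag` and the sample data are made up for this example.

```python
from collections import Counter, defaultdict


def most_frequent_tag(tagged_tokens):
    """Return {term: (tag, count)} with the most frequently assigned tag per term.

    tagged_tokens is an iterable of (term, tag) pairs for one document.
    On a tie, the tag that was seen first wins.
    """
    counts = defaultdict(Counter)
    for term, tag in tagged_tokens:
        counts[term][tag] += 1  # count each (term, tag) combination separately
    return {term: tag_counts.most_common(1)[0]
            for term, tag_counts in counts.items()}


# Hypothetical sample data: one document, STTS tags
tagged = [
    ("Visual", "NN"),
    ("Visual", "ADJD"),
    ("Visual", "NN"),
    ("Basic", "NE"),
]

result = most_frequent_tag(tagged)
for term, (tag, count) in result.items():
    print(term, tag, count)
```

The key point is that the count is keyed on the combination of term and tag, not on the term alone, which is what the TF value cannot provide.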

Cheers, Kilian

Hi Kilian,

Thanks for your answer. At least I now know that I tried something that is not yet implemented!