I want to analyze German texts using the Stanford Tagger.
If I use "BoW", I get, e.g. for the term "Visual", the following tagging for one document:
I would like to find out which tag is most likely the correct one (e.g. the one assigned most often).
What I tried:
"Group by" term and tag, then take the highest count of occurrences.
I could not find out how often the term is tagged with a given tag within the document. The problem was that "TF" gives me the total absolute frequency of the term, not the frequency of the term tagged with one specific tag.
Additionally, for occurrences with multiple tags, "Tags to String" only gave me the last assigned tag.
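
To make clearer what I am after, here is a minimal sketch in plain Python of the counting I have in mind. It assumes I could export one (term, tag) pair per tagged occurrence in a document. The sample pairs, the tag names, and the function name most_frequent_tags are made up for illustration and are not real tagger output.

```python
from collections import Counter, defaultdict

# Hypothetical (term, tag) pairs for one document, one pair per occurrence.
# The tags are illustrative only, not actual Stanford Tagger output.
tagged_tokens = [
    ("Visual", "NE"),
    ("Visual", "ADJA"),
    ("Visual", "NE"),
    ("Studio", "NE"),
    ("Visual", "FM"),
    ("Visual", "NE"),
]


def most_frequent_tags(pairs):
    """Return, per term, the most frequent tag, its count, and its share."""
    # Count how often each tag was assigned to each term.
    counts = defaultdict(Counter)
    for term, tag in pairs:
        counts[term][tag] += 1

    result = {}
    for term, tag_counts in counts.items():
        # most_common(1) yields the tag with the highest count for this term.
        tag, n = tag_counts.most_common(1)[0]
        total = sum(tag_counts.values())
        result[term] = (tag, n, n / total)
    return result


for term, (tag, n, share) in most_frequent_tags(tagged_tokens).items():
    print(f"{term}: {tag} ({n} occurrences, share {share:.2f})")
```

The share is only a relative frequency within the document, so it is a rough stand-in for "probability", not a confidence value from the tagger itself.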
I hope you have some ideas you can share.