Tagging Type abbreviations

Hi!
Being new to text processing, I have a very basic yet important question.
After applying the POS Tagger and Stanford Tagger nodes, the data is enriched with tags. However, nodes such as the Wildcard Tagger or Tag Filter require a ‘tag type’ to be selected. I cannot figure out what the abbreviations mean. Can anyone help me with that?

Moreover, I am currently developing a workflow focused on English text, but I also want to analyse Swedish text, both on its own and in combination with English. As far as I’ve seen, I need to download a Swedish lexicon and use it for tagging (with the Dictionary Tagger). However, I am not sure whether that’s all there is to it or whether many more steps are needed. Any advice on this topic is welcome.

Thank you!

Laurien

Hi @Lau3 and welcome to the forum.

Here’s a glossary of POS tags from the Penn Treebank set that will be useful: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

As for Swedish text - that’s a good question. Since KNIME doesn’t natively support Swedish, I think that manual tagging would have to be the way to go.
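
To give a rough idea of what dictionary-based tagging amounts to, here is a minimal Python sketch. It is not how the KNIME Dictionary Tagger is implemented. The tiny lexicon, the `tag_tokens` function, and the `UNKNOWN` fallback are all made up for illustration.

```python
# Hypothetical mini-lexicon: Swedish word -> Universal POS tag.
# A real lexicon would be loaded from a file with many more entries.
SWEDISH_LEXICON = {
    "hund": "NOUN",
    "springer": "VERB",
    "snabbt": "ADV",
    "röd": "ADJ",
}


def tag_tokens(tokens, lexicon, unknown="UNKNOWN"):
    """Return (token, tag) pairs using a case-insensitive dictionary lookup."""
    return [(tok, lexicon.get(tok.lower(), unknown)) for tok in tokens]


tagged = tag_tokens(["Hund", "springer", "snabbt", "idag"], SWEDISH_LEXICON)
print(tagged)
```

Any word that is missing from the lexicon (here "idag") stays untagged, so the coverage of the lexicon largely determines how useful this kind of tagging is.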

Hi Scott,
Thank you for your quick reply!

I am familiar with the glossary - grateful they made one:) Though this is not exactly what I meant. I’ve attached a screenshot of what I mean specifically.


The abbreviations in the ‘Tag Type’ drop-down menu are unclear to me. Would you be able to help me with that?

Thanks for your thoughts on Swedish text, I’ll play around with manual tagging!

Regards,

Laurien

Ah, sure, let me see if I can give a quick rundown. Some of them overlap a bit:

NE / ENTITY: named entities (places, persons, orgs, etc)
FTB / FTBCC+: French Treebank tags
UDPOS: Universal POS tags
POS: Penn Treebank POS tags
ANCORA: AnCora Spanish tags
SENTIMENT: Sentiment tags
ABNER/PHARMA: biomedical named entities
ZEMNLP: Turkish POS tags
ARABPOS: Arabic POS tags
STTS: German POS tags
MWT: Multi-word terms
OSCAR: Chemical named entities

Does that help?

Thanks Scott!
One more question: if I use the manual Swedish dictionary as a tagger, which tag type would you recommend?

I think either POS or UDPOS would be fine in that case - it depends on how granular you need your Swedish POS tagging to be. UDPOS is simpler so that might be a good place to start.
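
To illustrate the difference in granularity, here is a small Python sketch that collapses a handful of Penn Treebank tags into Universal POS tags. The mapping is deliberately partial and simplified (for example, Universal Dependencies gives auxiliary verbs a separate AUX tag), and the names `PTB_TO_UPOS` and `to_upos` are just for illustration.

```python
# Partial mapping from Penn Treebank tags to Universal POS tags.
# Only a few well-known tag families are listed; this is not the full tagset.
PTB_TO_UPOS = {
    "NN": "NOUN", "NNS": "NOUN",
    "NNP": "PROPN", "NNPS": "PROPN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "RBR": "ADV", "RBS": "ADV",
}


def to_upos(ptb_tag):
    """Collapse a Penn Treebank tag to its coarser universal tag, or None if unlisted."""
    return PTB_TO_UPOS.get(ptb_tag)


fine = sorted(PTB_TO_UPOS)
coarse = sorted(set(PTB_TO_UPOS.values()))
print(len(fine), "Penn Treebank tags collapse to", len(coarse), "universal tags:", coarse)
```

With the finer set you can distinguish, say, past tense from present participle, while the universal set only tells you that a word is a verb.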
