Custom Tag Type and Tag Value options for custom taggers

With version 3.3, KNIME offers a rich collection of text enrichment taggers. Some taggers (e.g., Abner, OpenNLP NE, Oscar, POS and Stanford NE) let the user select Tag Type and Tag Value from a drop-down list of choices built into the corresponding tagger model. By contrast, the tagger options in the Dictionary and Wildcard taggers are "hard wired" and only allow selection from a fixed set of tagger models. This restriction is most bothersome for the powerful StanfordNLP NE Learner/StanfordNLP NE tagger combination, which allows construction of new tagging models via Conditional Random Fields.

Am I missing something?  If not, is there a workaround? If not, would it make sense to add a type/value customization feature?

--Paul 

Hey Paul!

Abner, Oscar, POS and Stanford POS taggers are hard-coded, since they have specific tags that should not be changed. 

The Dictionary and Wildcard taggers do have the option to select the Tag Type and Tag Value. Maybe you missed the "Tagger options" tab, where you can specify the tag and some other options.

The StanfordNLP NE Learner has the option to specify the tag that should be used in combination with the generated model ("Learner Options" tab). The StanfordNLP NE Tagger reads the tag from the model file and uses it for tagging. We cannot provide a tag selection for the tagger, because the Stanford built-in models are multi-tag models that tag at least three different classes (e.g. person, location, organization).

I hope this answers your question. If not, feel free to ask.

Cheers, 

Julian

PS: Don't get confused by the tokenizer models in the "general options" tab. They are "just" for splitting sentences into terms.

Hi Julian,

Thank you for the prompt response to my inquiry.  

I fear that I have not successfully communicated the point I was trying to make: there is no way to assign a custom Tag Type and Tag Value to custom taggers built in KNIME, including the Dictionary tagger, the Wildcard tagger and the StanfordNLP NE Learner/tagger. To illustrate my point I have created sample workflows for 4 different taggers (see attached): (1) Abner tagger, (2) Dictionary tagger, (3) Wildcard tagger and (4) StanfordNLP NE Learner. Let's look at each case in turn.

(1) The Abner tagger is prebuilt with Tag Type = ABNER and Tag Value = multiple values. When Tags to String is executed, a new string column with the heading ABNER is created, with entries corresponding to the built-in choices. Everything works as expected.

(2) Here I want to run an instance of the Dictionary tagger node labeled with Tag Type = PainPathways and Tag Value = cytokines. Unfortunately, there is no way to make that assignment. The only choice I have on the Dictionary tagger > Tagger options dialog is a drop-down box of pre-built Tag Type values and their corresponding pre-built Tag Values. This forces me to select inappropriate values like SENTIMENT and POSITIVE for Tag Type and Tag Value (see attached), which makes selecting the Tag Type value in the Tags to String node and the Group setting in the GroupBy node confusing.

(3) The Wildcard tagger exhibits the same limitation as in (2).

(4) Here the problem is introduced in the StanfordNLP NE Learner > Learner Options dialog, which follows the same pattern as above.

I would like to make the following suggestion. I am assuming that the Tag Type and Tag Value lists are maintained in some XML file. At a minimum, provide a way for the user to extend the XML list via Preferences, or to extend/edit the list directly in the dialogs.

Thanks for your consideration.

--Paul

 

Hey Paul,

that cleared things up for me! 

Yes, you are right. Defining completely custom tags is currently not possible, but I will suggest it.

Thanks for your feedback and Merry Christmas!

Cheers,

Julian

Hi Paul,

you are right that it should be possible to define custom tag sets and tags, and in fact it already is. The functionality of KNIME can be extended (e.g. by integrating custom nodes) via various extension points, and there is also an extension point for tag sets. You can find out how to integrate a custom tag set here:

https://tech.knime.org/for-developers-integration-of-custom-tag-sets

Cheers, Kilian
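
To make the idea of a "tag set" from this thread concrete, here is a minimal conceptual sketch in Python. It is not the KNIME API: the real extension point is implemented as a Java plug-in and is described on the page linked above. The names `Tag`, `TagSet`, `build_tag` and `PAIN_PATHWAYS` are hypothetical, and the second tag value is only an illustrative placeholder. The sketch just models what Paul asked for: a Tag Type (PainPathways) that owns a fixed list of Tag Values (e.g. cytokines).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A tag as discussed in the thread: a Tag Type plus a Tag Value."""
    tag_type: str
    tag_value: str


class TagSet:
    """Hypothetical stand-in for a custom tag set (not a KNIME class)."""

    def __init__(self, tag_type, values):
        self.tag_type = tag_type
        self.values = tuple(values)

    def build_tag(self, value):
        # Only values that belong to this tag set may be turned into tags.
        if value not in self.values:
            raise ValueError(f"{value!r} is not a {self.tag_type} tag value")
        return Tag(self.tag_type, value)


# Tag Type and first Tag Value taken from Paul's example;
# "chemokines" is an assumed extra value, added only for illustration.
PAIN_PATHWAYS = TagSet("PainPathways", ["cytokines", "chemokines"])

tag = PAIN_PATHWAYS.build_tag("cytokines")
print(tag.tag_type, tag.tag_value)
```

With a tag set like this registered, a tagger dialog could offer PainPathways and cytokines in its drop-down lists instead of forcing a choice such as SENTIMENT and POSITIVE.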