Using an additional dictionary in OpenNLP NE tagger

I'm pretty new to all this, so sorry if I've missed something obvious.

I have successfully put together a workflow that includes the OpenNLP NE (named entity) tagger to identify the names of locations. But the model doesn't pick up all of the location names in my document, so I want to add a list of additional names that should be tagged as locations.

The node description for the OpenNLP NE tagger suggests that I can do this by selecting the 'Use additional dictionary file' option, and providing a dictionary file formatted with one entity per line, 'Firstname Lastname'.

However, the option that is actually available is 'Use external OpenNLP model file', and it requires a .bin file rather than the plain text file the description suggests. I can't see any other option to add another dictionary to the node.

Is there something missing from my installation of the node/plugin? Or do I have to prepare a .bin model file to add my own terms? And if that is the case, how on Earth do I do it?

I downloaded and installed the software only a few days ago, so I'm pretty sure my version is up to date.


Hi Sugna,

Sorry for the confusion caused by the node description. The option in the OpenNLP node dialog is indeed for specifying an external model file, not a dictionary file. The best way to use a dictionary is the Dictionary Tagger. You can also use wildcards or regex in your dictionary with the Wildcard Tagger. So, first use the OpenNLP tagger with the built-in models, and afterwards use the Dictionary Tagger with your custom dictionary.
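
To illustrate the idea of running a model-based tagger first and a dictionary pass second, here is a minimal sketch in plain Python. It is not how the KNIME nodes are implemented; the function name `dictionary_tag`, the span format `(start, end, text, tag)`, and the treatment of `*` as a wildcard are all assumptions made for this example. Spans that the model already found are kept, and dictionary matches are only added where they don't overlap an existing span.

```python
import re

def dictionary_tag(text, dictionary, tag="LOCATION", existing=None):
    """Add dictionary matches to spans already produced by a model.

    Spans are (start, end, surface_text, tag) tuples. Hypothetical helper,
    not part of KNIME or OpenNLP.
    """
    spans = list(existing or [])
    # Try longer entries first so that a longer name wins over a shorter one.
    for entry in sorted(dictionary, key=len, reverse=True):
        # Assumption of this sketch: '*' in an entry matches any run of word characters.
        pattern = r"\b" + re.escape(entry).replace(r"\*", r"\w*") + r"\b"
        for m in re.finditer(pattern, text):
            # Skip matches that overlap a span that is already tagged.
            overlaps = any(m.start() < e and s < m.end() for s, e, _, _ in spans)
            if not overlaps:
                spans.append((m.start(), m.end(), m.group(), tag))
    return sorted(spans)

text = "We flew from Paris to Ouagadougou, then drove to Bobo-Dioulasso."
# Pretend the model-based tagger only recognised "Paris".
model_spans = [(13, 18, "Paris", "LOCATION")]
# One entity per line in a real dictionary file; a plain list here.
custom_dictionary = ["Paris", "Ouagadougou", "Bobo-*"]

for span in dictionary_tag(text, custom_dictionary, existing=model_spans):
    print(span)
```

In the workflow itself, the same effect comes from simply chaining the two nodes; no .bin model file has to be built for this.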

Cheers, Kilian