I just started working with KNIME and I like it very much.
I have a workflow whose goal is to filter countries from documents.
I have parsed the documents and then created tags with a dictionary tagger (using a list of countries).
However, terms like Iran and Iranian are counted separately, each with its own frequency. I would like to count such occurrences as instances of a single country. I am attaching a picture of the workflow.
I tried almost all of the “replacer” nodes, but they did not work; I realize that this is because of the tagger nodes.
Thanks for your attention.
That is a typical problem in text mining. You need to normalize the words before you count them. Stemming is an option (or lemmatization, but we don't have a lemmatizer node). In your case, however, the words Iran and Iranian would not be reduced to the same stem, since the first is a noun and the second an adjective. To replace these words using a dictionary, you can use the Dictionary Replacer (2 inports) node. As its second input, the node takes a dictionary of search terms and their replacements. Use the replacer node before the stemmer. Have you tried that?
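To illustrate the idea outside KNIME, here is a minimal Python sketch of dictionary-based replacement before counting. The dictionary entries and the names `replacements`, `normalize`, and `count_terms` are made up for this example; it only mirrors the concept and is not how the KNIME node is implemented.

```python
import re
from collections import Counter

# Hypothetical search -> replacement pairs, playing the role of the
# dictionary table fed into the second input of the replacer node.
replacements = {"Iranian": "Iran", "Iranians": "Iran", "French": "France"}

def normalize(tokens, mapping):
    """Replace every token found in the mapping; leave the rest untouched."""
    return [mapping.get(token, token) for token in tokens]

def count_terms(text, mapping):
    """Tokenize on letters only, normalize, then count term frequencies."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return Counter(normalize(tokens, mapping))

counts = count_terms(
    "Iran and the Iranian government met French officials in France.",
    replacements,
)
print(counts["Iran"], counts["France"])
```

Because the replacement happens before counting, "Iran" and "Iranian" end up in the same bucket, which is what the replacer node should achieve in the workflow.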
Thank you for the answer.
I did what you suggested and it worked, but there was one small yet important detail: I had to use the case converter node before the replacement, because the dictionary is case sensitive.
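For anyone who runs into the same thing, here is a small Python sketch of why the case conversion matters. It is only an illustration under my own assumptions: the lowercase dictionary and the function `replace_terms` are hypothetical and have nothing to do with KNIME's internals.

```python
# Hypothetical dictionary with lowercase search terms only.
replacements = {"iranian": "iran"}

def replace_terms(tokens, mapping, lowercase_first):
    """Optionally lowercase the tokens, then do an exact (case-sensitive) lookup."""
    if lowercase_first:
        tokens = [token.lower() for token in tokens]
    return [mapping.get(token, token) for token in tokens]

tokens = ["Iran", "Iranian", "IRANIAN"]

# Without case conversion, no token matches the lowercase key, so nothing changes.
without_case = replace_terms(tokens, replacements, lowercase_first=False)

# With case conversion first, all three tokens collapse to the same term.
with_case = replace_terms(tokens, replacements, lowercase_first=True)

print(without_case)
print(with_case)
```

The lookup itself stays case sensitive in both runs; only converting the case beforehand makes the dictionary entries match.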