Keyword Research - extract a keyword consisting of 3 words to be considered as one term

Hello,
I’m doing a long-tail keyword analysis on a list of 10,000 queries.
Before building the bag of words and calculating frequencies, I need to treat certain 3-word phrases as a single term, e.g.
go for joe = one keyword/term
joe = one keyword/term if it is not preceded by “go for”

I also have to consider that “go for joe” is written in different ways:
gfj
g4j
go 4 joe
go for joe

  1. How can I extract my 3-word keyword as one term?
  2. How can I distinguish the cases in which only “joe” appears?

Can someone help me?
Thanks in advance

Hi @Tilux,

I built a little example workflow for you, which you can find on the KNIME Hub: Extracting a keyword consisting of multiple words and written in different ways – KNIME Hub

The idea is to create a table that lists all the different spellings and defines one representative for all of them. Next, you can tag your documents based on this table, so that your three words are converted into one term (Dictionary Tagger with “set named entities unmodifiable” unchecked). Afterwards, you can replace every variant with the representative (Dictionary Replacer). A standalone “joe” is not in the table, so it stays a separate term.
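If it helps to see the logic outside of KNIME, here is a minimal Python sketch of the same idea. It is only an illustration and not what the KNIME nodes do internally; the names VARIANTS, normalize and term_frequencies, as well as the representative "go_for_joe", are made up for this example.

```python
import re
from collections import Counter

# Hypothetical dictionary table: each spelling variant -> one representative term.
VARIANTS = {
    "go for joe": "go_for_joe",
    "go 4 joe": "go_for_joe",
    "g4j": "go_for_joe",
    "gfj": "go_for_joe",
}

# One regex over all variants, longest first, matched on word boundaries
# so that a standalone "joe" is never touched.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(v) for v in sorted(VARIANTS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def normalize(query: str) -> str:
    """Replace every known variant in the query with its representative."""
    return _PATTERN.sub(lambda m: VARIANTS[m.group(0).lower()], query)


def term_frequencies(queries):
    """Bag-of-words counts after normalization (whitespace tokenization)."""
    counts = Counter()
    for query in queries:
        counts.update(normalize(query).lower().split())
    return counts
```

For example, normalize("G4J near me") gives "go_for_joe near me", while "joe coffee" is left unchanged, so the two cases end up as different terms in the frequency count.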

Could this be a solution for your use case?

Have a nice weekend
Kathrin
