tagging of neighbour words

Hi,

I would like to tag terms according to keywords in a dictionary, plus the words to the left and right of the keywords. I added regular expressions of the form "([a-zA-Z]+ )*" + <keyword> + "( [a-zA-Z]+)*", but the results are inconsistent. Sometimes only the keyword is tagged, sometimes several other words to the left or right, and sometimes both. Is there a way to control this?

If the sentence starts with the keyword, then no words to the left of it should be tagged. So the tagging should cover the keyword plus up to 3 words to the left and right, staying within the sentence.

Any help is appreciated, thanks.

Hi mfsn,

In your regex you are using * as the quantifier, which means 0 or more. If you want to tag only one word to the left and right (or no words), you should use ? as the quantifier. Use {n,m} for at least n times but not more than m times; for up to three words on each side that would be {0,3}. See http://docs.oracle.com/javase/tutorial/essential/regex/quant.html for documentation of the Java regex quantifiers.
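To see what the bounded quantifier does, here is a minimal standalone sketch using plain java.util.regex, not the tagger itself. The class name NeighbourTagger, the method tag and the example sentences are made up for illustration. It assumes words are separated by single spaces and consist only of the letters a-z and A-Z, as in your original pattern. Because "." and "," are not part of [a-zA-Z]+, the match cannot run across a sentence boundary, and it also stops at commas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration: finds a keyword plus up to
// three neighbouring words on each side.
class NeighbourTagger {

    static List<String> tag(String text, String keyword) {
        // \b            : start at a word boundary, never in the middle of a word
        // (?:word ){0,3}: at most three words to the left, each followed by one space
        // Pattern.quote : treat the keyword literally, even if it contains regex characters
        // \b            : the keyword must end at a word boundary
        // (?: word){0,3}: at most three words to the right, each preceded by one space
        Pattern p = Pattern.compile(
                "\\b(?:[a-zA-Z]+ ){0,3}" + Pattern.quote(keyword) + "\\b(?: [a-zA-Z]+){0,3}");
        Matcher m = p.matcher(text);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Keyword in the middle of a sentence: three words on each side.
        System.out.println(tag(
                "It works. Protein binds to the receptor site on the cell membrane.",
                "receptor"));
        // prints [binds to the receptor site on the]

        // Keyword at the start of a sentence: nothing to the left is matched.
        System.out.println(tag("receptor cells respond quickly to light.", "receptor"));
        // prints [receptor cells respond quickly]
    }
}
```

The quantifiers are greedy, so the match takes as many neighbouring words as are available, up to the limit of three. Replace {0,3} with ? to allow at most one word on each side.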

Cheers, Kilian
