Negation handling

f.villarroelordenes · August 13, 2013, 12:36pm

Hello,

I am a new KNIME user working on a sentiment analysis project. I am doing text categorization using a sentiment (positive/negative) dictionary, BoW, and then TF-IDF. I would like to know how to handle negation (not good, didn't like, etc.). Which nodes would be useful to account for negation when counting the frequencies of terms (n-grams)? I have seen that parse trees can solve this problem, but I am not sure how to do it in KNIME. Any guidance would be very useful.

Thanks!

Francisco

kilian.thiel · August 13, 2013, 1:54pm

Hi Francisco,

So far, no deep parsing is provided by the Textprocessing plugin. With the Wildcard Tagger node it is possible to tag terms based on regular expressions. Negations that match these expressions can be found and assigned a certain tag, which can then be filtered or counted later on.
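To illustrate the idea outside of KNIME, here is a minimal Python sketch of regex-based negation tagging. It is not how the Wildcard Tagger works internally; the cue list, the tag names (`NEG_CUE`, `NEGATED`, `PLAIN`) and the fixed scope of three tokens are all assumptions chosen for the example.

```python
import re

# Hypothetical negation cues; extend this list for your own corpus.
NEGATION_CUE = re.compile(
    r"not|no|never|\w+n't|didnt|dont|doesnt|isnt|wasnt|cant|wont",
    re.IGNORECASE,
)
TOKEN = re.compile(r"[A-Za-z']+|[.,;:!?]")
PUNCT = re.compile(r"[.,;:!?]")


def tag_negations(text, scope=3):
    """Tag each word as NEG_CUE, NEGATED, or PLAIN.

    Words are marked NEGATED if they follow a negation cue within
    `scope` tokens and no punctuation mark has ended the scope.
    """
    tagged = []
    remaining = 0  # how many upcoming tokens are still in negation scope
    for tok in TOKEN.findall(text.lower()):
        if PUNCT.fullmatch(tok):
            remaining = 0  # punctuation closes the negation scope
            continue
        if NEGATION_CUE.fullmatch(tok):
            tagged.append((tok, "NEG_CUE"))
            remaining = scope
        elif remaining > 0:
            tagged.append((tok, "NEGATED"))
            remaining -= 1
        else:
            tagged.append((tok, "PLAIN"))
    return tagged


if __name__ == "__main__":
    for token, tag in tag_negations("The food was not good, but I liked it."):
        print(token, tag)
```

Terms tagged `NEGATED` could then be filtered out or counted separately from the untagged sentiment terms, which is the role the tag plays in the workflow described above.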

To count term frequencies, the TF node is the right choice. In its dialog you can specify whether to count relative or absolute frequencies. The N-Gram node can be used to find n-grams and count their frequencies.
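As a rough illustration of what these counts mean, here is a small Python sketch. It assumes that "relative" frequency means the term count divided by the number of tokens in the document; the function names are made up for this example and are not part of KNIME.

```python
from collections import Counter


def term_frequencies(tokens, relative=False):
    """Return absolute counts, or counts divided by the document length."""
    counts = Counter(tokens)
    if not relative:
        return dict(counts)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def ngram_counts(tokens, n=2):
    """Count all contiguous n-grams, joined with a single space."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


if __name__ == "__main__":
    doc = ["not", "good", "not", "bad"]
    print(term_frequencies(doc))
    print(term_frequencies(doc, relative=True))
    print(ngram_counts(doc, n=2))
```

Counting bigrams such as "not good" as single units is one simple way to keep a negation attached to the word it modifies.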

Cheers, Kilian

f.villarroelordenes · August 13, 2013, 2:40pm

Thanks, Kilian. I will try the Wildcard Tagger then. Is there any way to integrate something like the Stanford Parser (you already have the Stanford Tagger node)? This would help me detect word dependencies associated with negation.

Best!

Fco

kilian.thiel · August 13, 2013, 7:23pm

Integrating deep parsing is not planned for the near future (next few months). But of course it is possible to write your own node and use the Stanford library internally. If you are interested in writing a tagger node, I can give you some tips.

f.villarroelordenes · August 14, 2013, 10:51am

Hi Kilian, yes, that would be great; I am very interested. If you can give us some tips, I can start trying and see how far I get.

I also wanted to ask you a simpler question. We are using the Dictionary Tagger to tag positive and negative words; however, within the Dictionary Tagger node there is no tag type or value for positive or negative. So until now I have just been using pre-existing tag values. Is there any way to create a tag type and values such as sentiment: positive and negative?

Thanks for your help!

kilian.thiel · August 14, 2013, 12:02pm

There is a TagSet extension point that allows you to implement and integrate your own tag sets. You have to write some Java code to do that, but it is possible to create your own tag sets.

For positive/negative tagging, a quick and easy way would be to use two arbitrary tag values and filter or count them later on with the corresponding filter node.
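The workaround can be pictured with a short Python sketch. The word lists and the two stand-in tag values (`POS`, `NEG`) are hypothetical; in KNIME you would pick any two existing tag values and remember which one stands for which polarity.

```python
from collections import Counter

# Hypothetical sentiment dictionaries; replace with your own word lists.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "poor", "hate"}


def tag_sentiment(tokens):
    """Attach a stand-in tag value to every dictionary word, else None."""
    tagged = []
    for tok in tokens:
        if tok in POSITIVE_WORDS:
            tag = "POS"
        elif tok in NEGATIVE_WORDS:
            tag = "NEG"
        else:
            tag = None
        tagged.append((tok, tag))
    return tagged


def count_tags(tagged):
    """Count how often each tag value occurs, ignoring untagged terms."""
    return Counter(tag for _, tag in tagged if tag is not None)


if __name__ == "__main__":
    tokens = ["great", "food", "bad", "service", "good"]
    print(count_tags(tag_sentiment(tokens)))
```

The tag values carry no meaning of their own here; they only need to be distinct so that the two groups can be filtered or counted separately afterwards.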

I will write a small tutorial about how to integrate your own tag set and your own tagger node in the next few days and publish it on the Textprocessing site.

f.villarroelordenes · August 14, 2013, 12:12pm

Thanks a lot, Kilian, this will be really helpful for our project. Looking forward to it.

Best,

Francisco

kilian.thiel · August 19, 2013, 9:45am

Hi Francisco,

Here is a tutorial about how to create and integrate a custom tag set:

http://tech.knime.org/for-developers-integration-of-custom-tag-sets

The next tutorial, about how to integrate a custom tagger node, will follow.

Cheers, Kilian

f.villarroelordenes · August 20, 2013, 10:22pm

Thanks a lot, Kilian. I will see if I can manage. Looking forward to the next tutorial :).

Best,

Fco

kilian.thiel · September 23, 2013, 11:48am

Hi Francisco,

The tagger tutorial is online now: http://tech.knime.org/for-developers-integration-of-custom-tagger

It took a little bit longer to write it, sorry.

Cheers, Kilian
