Classifying positive and negative tweets in KNIME

Hi,

I am trying to use KNIME to classify tweets into positive and negative sentiments, and then use the pos/neg result as a class label to train a classifier. Is there any way to do this, or any material that can help? I have been able to read in the tweets, add dictionary taggers to mark positive and negative words, and calculate a pos/neg score per document. How do I then turn that score into a class so that I can train the classifier against the document vector?

Thanks for any help

Hi,

I think you should have a look at the example below:

http://www.knime.org/blog/sentiment-analysis

best regards,

Fabien

If you already have a dictionary of positive and negative words and have already counted these words in the documents, it is easy to create a class label. You can use e.g. the Pivot node with the Document column as group column, the pos/neg word column as pivot column, and the sum of the absolute frequencies as the aggregated value. You will end up with a table with one row per document and two columns, one with the number of positive words and one with the number of negative words. Next you have to decide on a score, e.g. subtract the number of negative words from the number of positive words. You can also normalize the numbers, of course. To compute the score, use the Math Formula node. Once you have the score you have to make a decision, e.g. a positive score means positive sentiment and a zero or negative score means negative sentiment. For that, use e.g. the Rule Engine node.
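
In case it helps to see the logic outside of the workflow, here is a minimal Python sketch of the same three steps (pivot, score, rule). It is not KNIME code, only an illustration. The function name `label_documents`, the tuple layout `(document, polarity, frequency)` and the sample tweets are all made up for the example.

```python
from collections import defaultdict


def label_documents(rows):
    """Turn (document, polarity, frequency) rows into one class label per document.

    The row layout is an assumption for this sketch: polarity is "pos" or "neg",
    frequency is the absolute count of tagged words of that polarity.
    """
    # Step 1 (Pivot): one entry per document with summed pos and neg counts.
    counts = defaultdict(lambda: {"pos": 0, "neg": 0})
    for document, polarity, frequency in rows:
        counts[document][polarity] += frequency

    labels = {}
    for document, c in counts.items():
        # Step 2 (Math Formula): score = positive count minus negative count.
        score = c["pos"] - c["neg"]
        # Step 3 (Rule Engine): positive score -> positive, zero or below -> negative.
        labels[document] = "positive" if score > 0 else "negative"
    return labels


# Hypothetical sample data, three tweets.
rows = [
    ("tweet1", "pos", 2),
    ("tweet1", "neg", 1),
    ("tweet2", "neg", 3),
    ("tweet3", "pos", 1),
    ("tweet3", "neg", 1),
]
labels = label_documents(rows)
print(labels)
```

Note that a tie (tweet3 here) ends up as negative, because the rule only treats a strictly positive score as positive. If that is not what you want, you could add a third "neutral" class or drop tied documents before training.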

Cheers, Kilian

Thanks Kilian.thiel, this is a great help :)