Positive, Negative Classification

Hi all,

I've been experimenting a bit with the KNIME Textprocessing features. What I want to do is classify data (sentences) as positive or negative.

For example:

The game was horrible and Manchester lost --> negative

It was a great day for Arsenal, they won the game clearly --> positive

I've only found example workflows where the text data is split into words, which are then tagged as positive, negative or neutral.

I hope you know what I mean: I want to classify whole sentences as positive or negative, not just individual words.

Thanks for your help

 

Andy

Hi Andy,

 

You could extract the sentences from your documents using the "Sentence Extractor" node, then convert these sentences to documents using the "Strings to Documents" node. Next, assign positive and negative tags to the words of the documents based on a dictionary, using the "Dictionary Tagger" node. Convert the list of documents to a bag of words ("Bow" node) and count the positive and negative words for each document (the "Pivoting" node is a convenient option here). You end up with one row per document (in this case, per sentence) and two additional columns: one containing the number of positive words and one containing the number of negative words. Finally, decide whether the document is positive or negative, e.g. by comparing the two counts with the "Math Formula" node.
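
To make the logic of these steps concrete, here is a minimal Python sketch of the same idea outside of KNIME. It is not what the nodes do internally; the tiny word lists, the naive sentence splitter and the function names are all assumptions made up for illustration, and a real dictionary would be much larger.

```python
import re

# Hypothetical mini dictionaries, for illustration only
POSITIVE = {"great", "won", "good"}
NEGATIVE = {"horrible", "lost", "bad"}


def split_sentences(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    """Count dictionary hits in one sentence and compare the two counts."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        label = "positive"
    elif neg > pos:
        label = "negative"
    else:
        label = "neutral"  # tie, or no dictionary words found at all
    return pos, neg, label


if __name__ == "__main__":
    text = ("The game was horrible and Manchester lost. "
            "It was a great day for Arsenal, they won the game clearly.")
    for sentence in split_sentences(text):
        print(sentence, "-->", classify(sentence))
```

The final comparison of the two counts is the part that the "Math formula" node would take over in the workflow. Note that a sentence without any dictionary words comes out as neutral here, so the quality of the result depends heavily on the dictionary.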

 

Cheers, Kilian

Thanks for this useful info. Do you have an example workflow for it?

Hi @abdkhirfan -

This workflow uses a process very similar to what Kilian describes above.

The main difference is that it doesn’t include a Sentence Extractor node, as it treats each movie review as a single document. If you really want to evaluate each sentence individually, you could easily add a Sentence Extractor node after the initial Read Data metanode.

Have you had a chance to try this workflow out yet?

Yes, I have tried it, but the results were a bit ambiguous. Do the dictionaries have to be relevant to the topic at hand? I am analysing reviews of marketing automation software, and I used the same dictionaries as in this workflow. Moreover, I don’t know where I can visualize the total number of positive and negative sentiments.