This workflow shows how to apply a lexicon-based approach to sentiment analysis of the IMDb movie reviews dataset. The dataset contains movie reviews that were previously labeled as positive or negative. The lexicon-based approach assigns sentiment tags to the words in a text based on dictionaries of positive and negative words. A sentiment score is then calculated for each document as: (number of positive words - number of negative words) / total number of words.

Dataset Reference: Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).
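For readers who want to see the scoring formula outside of the workflow, here is a minimal Python sketch of it. It is not how the workflow itself is implemented: the tiny `POSITIVE` and `NEGATIVE` word sets, the regex tokenizer, and the function name `sentiment_score` are all assumptions made for illustration, and real lexicons would be much larger.

```python
import re

# Hypothetical toy lexicons (assumption); real dictionaries contain thousands of words.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "boring", "terrible", "awful"}


def sentiment_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive words - negative words) / total words for one document."""
    # Simple lowercase tokenizer (assumption); the workflow's preprocessing may differ.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        # Avoid division by zero for empty documents.
        return 0.0
    n_pos = sum(w in positive for w in words)
    n_neg = sum(w in negative for w in words)
    return (n_pos - n_neg) / len(words)


print(sentiment_score("A great and enjoyable movie"))  # 2 positive, 0 negative, 5 words -> 0.4
```

A score above zero means the document contains more positive than negative lexicon words, and a score below zero means the opposite. Dividing by the total word count keeps scores comparable between short and long reviews.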
This is a companion discussion topic for the original entry at https://kni.me/w/DjLYUYhlRDhzXt15