Twitter Sentiment Analysis with Supervised Learning Models

Hi,
I’m working on sentiment analysis of tweets. I have a little more than 17,000 tweets, each categorized into one of three classes (pos, neg, neutral). My goal is to train several different supervised models, evaluate the accuracy of each one, and add data visualization views. My current workflow is attached below (the dataset, with a label for each individual tweet, is included in the CSV file):
Project Sentiment Analysis SML_export.knwf (434.7 KB)

Is there anything I can do to improve my workflow, whether it is to improve the model accuracies, add visualizations or add data preparation/preprocessing steps?

Thanks in advance,
Huseyin

Excellent work!
I think you can try out the BERT language model to improve accuracy on this task. The link to an example workflow is given below.

Example workflow : Sentiment Analysis with BERT – KNIME Hub

Thank you! Do you know why KNIME stops responding when I run the Random Forest trainer? Also, should I change my ML parameters or just use the default configuration? Since the majority of the tweets are negative, I tried the Equal Size Sampling node for the SVM trainer, but I got worse accuracy. Is it worth trying it with the other algorithms? I'm also wondering whether there are plots for evaluating multiclass predictions. I know that the ROC curve can be used for binary classification problems, but are there similar alternatives for multiclass classification problems (e.g. based on the F1 score)? Finally, are there any other visualisation nodes that could come in handy for such a workflow? I will try to integrate Tableau.

BR,
Huseyin

For evaluating multiclass classification, you can check the thread below.

As for changing the parameters of the training algorithms: yes, please try out different settings to find the ones that work best for your training task.
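
As a general illustration of multiclass evaluation (not taken from the thread above or from the attached workflow), here is a minimal Python sketch that computes a per-class F1 score and the macro-averaged F1 for the three sentiment classes. The toy labels and the function names `per_class_f1` and `macro_f1` are assumptions made up for this example. With a dataset dominated by negative tweets, macro F1 is often more informative than plain accuracy, because every class counts equally.

```python
from collections import Counter

# Class names follow the thread; the toy labels below are invented for illustration.
LABELS = ["pos", "neg", "neutral"]


def per_class_f1(y_true, y_pred, labels=LABELS):
    """Return a dict mapping each class label to its F1 score."""
    # Count every (actual, predicted) pair, i.e. the cells of the confusion matrix.
    counts = Counter(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        others = [other for other in labels if other != label]
        tp = counts[(label, label)]
        fp = sum(counts[(other, label)] for other in others)  # predicted as label, actually other
        fn = sum(counts[(label, other)] for other in others)  # actually label, predicted as other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(scores)


# Toy example with a majority of negative tweets.
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "neutral", "neutral"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "neg", "neutral", "neg"]

for label, score in per_class_f1(y_true, y_pred).items():
    print(f"{label}: F1 = {score:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Plotting the per-class scores side by side for each model (for example as a grouped bar chart) is one simple way to compare classifiers when a single ROC curve does not apply.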

Thank you for your valuable input!
I have a question about adding color to my bar chart. I want to use colors to differentiate my three classes, but even when I use the Color Manager node, the bars all appear in one color. I tried several methods suggested to other users in this forum who asked the same question, but I haven’t succeeded in applying them to my workflow. The Bar Chart in the Data Visualization node is connected to my CSV dataset. The sentiment column contains the class associated with each row. Is there a simple solution to this problem?

We have created a simple workflow here that shows how to use the Bar Chart with different colors based on the Color Manager node.

Please try this out, and if it does not work, kindly share the workflow file (without data) so that I can understand your problem. Hope this helps.

I tried the example workflow, but couldn’t get it to work with my workflow. This is the result I got. As you can see, all three classes are in the same color.

My workflow:
Project Sentiment Analysis SML.knwf (448.5 KB)

I tried coloring the bars using sample data. My suggestion is:

  • prepare the data according to the schema shown in the attached workflow

Bar Chart with Colors.knwf (11.6 KB)
