Re-tagging customer opinions and identifying improvement areas

Hello KNIMErs,

I need your suggestions on a subject I have never worked with before.

I have access to raw data with clients’ opinions and to the platform that turns those opinions into sentiment analysis, tags, a tag cloud, and the dashboard. The sentiment analysis looks OK (ratings are in line with the sentiment); however, I have some doubts regarding the tagging.

The key raw data are:

  • OPINION – descriptive client feedback, written in several languages.

  • RATING – 1, 2, 3, 4, or 5, where 1 is the lowest rating and 5 the highest.

  • SENTIMENT – a positive, neutral, or negative value assigned to each opinion.

  • TAG # – multiple columns dedicated to tags. The catch here is that some non-blank opinions have no tags assigned.

Other columns cover products, distribution channels, client details, etc.

Unfortunately, I can’t share a sample dataset and therefore can’t use public AI platforms while building the workflow.

I would need your suggestions on what logic/sequence of nodes I should use to achieve two main goals:

  1. Re-assign tags based on OPINION.

  2. Identify areas requiring improvements.
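To make goal 1 concrete, here is the kind of input/output I have in mind, sketched as a trivial keyword-rule baseline (the tag rules below are made up just for illustration, not our real categories; in KNIME this could live in a Python Script node):

```python
# Hypothetical keyword-rule tagger: maps an OPINION string to a list of tags.
# TAG_RULES is an invented example mapping, not real business rules.
TAG_RULES = {
    "delivery": ["delivery", "shipping", "courier"],
    "price": ["price", "expensive", "cheap"],
    "quality": ["quality", "broken", "defect"],
}

def assign_tags(opinion: str) -> list[str]:
    """Return every tag whose keyword list matches the opinion text."""
    text = opinion.lower()
    return [tag for tag, keywords in TAG_RULES.items()
            if any(kw in text for kw in keywords)]
```

Obviously a real solution would need something smarter than substring matching, but this shows the expected shape of the result.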

Then, I could start building my tool based on your suggestions.

Any support is more than welcome.

PS. I have already played with the workflow available here – Importing, preprocessing and visualizing textual data as Tag Cloud – KNIME Community Hub – using my own dataset.

Some update:

  1. I have added a metanode to translate the OPINION column to English (depending on the source language, of course).

  2. I have run the workflow specified above to generate the tag cloud.
    The challenge: the tag cloud contains one-word tags only, while phrases would be expected. The tag ‘product’ is neutral, while there is a huge difference between ‘good product’ and ‘bad product’.

  3. I have tried other tagging options with no positive results (see the yellow annotation below).

Any suggestions on how to identify the phrases present in OPINIONs and use them as tags?
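For illustration, the kind of phrase I’m after could be approximated by counting adjacent word pairs (bigrams) – a naive stand-in for proper collocation extraction, written in plain Python with no NLP library:

```python
from collections import Counter
import re

def top_bigrams(opinions, n=5):
    """Count adjacent word pairs across all opinions and return the n
    most frequent ones as phrases.

    Naive sketch: a real workflow would also filter stop words and
    part-of-speech patterns (e.g. adjective + noun) before counting.
    """
    counts = Counter()
    for text in opinions:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(zip(words, words[1:]))
    return [" ".join(pair) for pair, _ in counts.most_common(n)]
```

On a toy list like `["Good product, fast delivery", "Bad product, slow delivery"]`, this would surface pairs such as ‘good product’ and ‘bad product’ as separate candidate tags, which is exactly the distinction the one-word cloud loses.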

Hello @Kazimierz
Your task reminds me of the challenge proposed in JKI S02-CH06.
You may check some of these solutions for insights.

You can even find mine among the published ones.
BR


Thank you @gonhaddock for redirecting me.
I will come back with feedback after processing the solutions to the JKI challenge.

OK, I’ve gone through the proposed solutions and discovered that:

  • They don’t refer to tagging.

  • They use a sentiment confidence index that is available in their input dataset. There is no such index in my dataset, so I would need to calculate it – and this is beyond my capabilities today.
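For future reference, I imagine a crude substitute could be derived from how well RATING agrees with SENTIMENT – a naive sketch based on my own assumption that ratings 1–2 map to negative, 3 to neutral, and 4–5 to positive (not the index the challenge actually uses):

```python
def sentiment_confidence(rating: int, sentiment: str) -> float:
    """Naive confidence proxy: agreement between the 1-5 rating and the
    assigned sentiment label. 1.0 = full agreement, 0.0 = contradiction.

    Assumption (mine, not from the challenge): ratings 1-2 are negative,
    3 is neutral, 4-5 are positive.
    """
    expected = {1: "negative", 2: "negative", 3: "neutral",
                4: "positive", 5: "positive"}[rating]
    if sentiment == expected:
        return 1.0
    if "neutral" in (sentiment, expected):
        return 0.5  # off by one category
    return 0.0      # positive vs. negative contradiction
```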


@Kazimierz to classify or categorise texts you can try using a local LLM, so the data will not get leaked to some large model on the web.

You can tell the model to come up with structured JSON that you can then use in a table. You can also ask it to do some classification.
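As a rough sketch of that idea: the actual call to a local Ollama server is left commented out (endpoint and model name are assumptions), and only the prompt construction and the JSON parsing of the reply are shown:

```python
import json
# import urllib.request  # for the actual request to a local Ollama server

def build_prompt(opinion: str) -> str:
    """Ask the model for structured JSON only (hypothetical prompt)."""
    return ('Classify this customer opinion. Reply ONLY with JSON like '
            '{"sentiment": "positive|neutral|negative", "tags": ["..."]}.'
            "\n\nOpinion: " + opinion)

def parse_llm_reply(reply: str) -> dict:
    """Extract the JSON object from the reply, tolerating extra text
    around it (a common failure mode with LLM output)."""
    start, end = reply.find("{"), reply.rfind("}") + 1
    return json.loads(reply[start:end])

# Hypothetical call, assuming a default local Ollama setup:
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps({"model": "llama3",
#                      "prompt": build_prompt(text),
#                      "stream": False}).encode(),
#     headers={"Content-Type": "application/json"})
# reply = json.loads(urllib.request.urlopen(req).read())["response"]
```

The parsed dict can then be turned back into table columns inside a KNIME Python Script node.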

Blog: “Multimodal Prompting with local LLMs using KNIME and Ollama”


In addition to the above, you could give plain BERT a try – specifically, the zero-shot classification models, in case you don’t have a labelled set to fine-tune on. For simple classification problems, LLMs might be overkill, but both are worth a shot (e.g. you could compare results on a small sample).

Redfield BERT Nodes – KNIME Community Hub


Thank you @gonhaddock @mlauber71 @Add94 for your suggestions.

Unfortunately, I must admit that I won’t follow up on this topic. This is a kind of side task, not a priority at all, and I would need to invest a looooot of effort to grab the necessary knowledge. Not now.

We could treat this topic as closed with no solution.
