Natural Language Processing with BERT

DataUnicorn · October 25, 2023, 1:58am

Hi everyone. I am working on an assignment (sentiment analysis on a movie dataset) and I keep getting stuck with Python, which has been extremely buggy for me: I get bogged down because the code and packages constantly need to be tweaked or updated. I found a KNIME workflow that uses BERT, and compared to Python, KNIME appears to be a piece of cake. Has anyone used KNIME for BERT sentiment analysis? I am looking for any help I can get. Thank you.

ScottF · October 25, 2023, 2:20pm

Hi @DataUnicorn and welcome to the forum.

You are in luck. There are a few separate example workflows on the KNIME Community Hub that should help you get started.

DataUnicorn · October 26, 2023, 4:07pm

Hi @ScottF, thank you for the templates. They are going to come in handy. It appears that I finally completed the groundwork (i.e., created the Python environments) to start using the BERT nodes.

steffen_KNIME · October 26, 2023, 7:40pm

Hi @DataUnicorn,

What do you mean by creating Python environments to use the BERT nodes? They come with a bundled environment out of the box and should work right away. Did you experience any issues with this? If you do want your own environment, you can create a fresh one under Preferences -> KNIME -> Redfield BERT via New environment... and then manually extend it with the libraries of your choice.

Best regards
Steffen

DataUnicorn · November 4, 2023, 12:28pm

Hi Steffen (@steffen_KNIME), thank you for your response. Things hardly ever go smoothly for me, and I have bad luck running Python-related algorithms (a Python environment is required for this setup). I did manage to use a template (it might have been the one provided by Scott, @ScottF).

I ran it on a movies dataset (reduced in size). I do not know what I am doing incorrectly, because I am getting an accuracy of 48%. I was able to run the algorithm a few times and the accuracy is pretty much the same each time.
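One way to read a number like 48% is to compare it against a trivial baseline. The sketch below is only an illustration, not part of the KNIME workflow: it assumes a two-class (positive/negative) label set, which the thread does not state, and the toy labels and helper names (`majority_baseline_accuracy`, `accuracy`) are made up for the example. If the model's accuracy is no better than always predicting the most frequent class, or if nearly all predictions fall into one class, the classifier has probably not learned anything useful from the training data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data (not from the thread): a balanced two-class test set
# and a "model" that predicts the same class for every row.
y_true = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
y_pred = ["pos"] * len(y_true)

baseline = majority_baseline_accuracy(y_true)
model_acc = accuracy(y_true, y_pred)
prediction_counts = Counter(y_pred)

print(f"majority-class baseline: {baseline:.2f}")
print(f"model accuracy:          {model_acc:.2f}")
print(f"predictions per class:   {dict(prediction_counts)}")
```

In a KNIME workflow the same comparison can be made by looking at the class counts in the test set and at the confusion matrix of the predictions, without writing any Python.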

This appears to be a simple algorithm and I probably just have to make a simple adjustment somewhere. If anyone has any suggestions, then please share them.

Thank you.

ScottF · November 6, 2023, 12:34am

Did you enable the fine-tuning option in the learner node? I recall playing around with this workflow earlier in the year, and if I remember right that made a significant difference. (It takes much longer to execute, of course.)

DataUnicorn · November 6, 2023, 2:50pm

Hi Scott, @ScottF, those results are with the fine-tuning enabled. Interestingly, this is what was achieved without the fine-tuning enabled.

I will keep on tinkering with this. I do recall that, in one of the runs, I achieved around an 80% accuracy score, which is not spectacular but a lot better than the current results. Unfortunately, I did not save any of the information regarding that run.
