Help with textual analysis

I am a total beginner at KNIME. I am trying to analyze two distinct groups of information, stored in two different Excel files, although I can combine them into a single Excel file if needed. Each file is a list of phrases in a single column (about 200 rows in the first, 350 in the second).

After preprocessing, I need to compute the cosine similarity between the first group and the second, treating each group as a whole (I don’t want the similarity between each pair of phrases). Then, for each of the two groups, I would like to find the main themes discussed in the phrases with topic modelling, group the phrases with text clustering, and analyze common concepts with a co-occurrence analysis.

As mentioned, I am a complete newbie to KNIME, so I’m very confused about how to achieve this. I managed to understand how to preprocess the files, but:

a) I don’t know whether it would be useful to combine the information into a single Excel file;
b) I’m stuck after preprocessing, as I really don’t know how to proceed with the analysis, given the particular nature of the task.

If anybody could help me, I would be immensely grateful for your support.

Hey there,

that seems like quite a project you have on your hands there :-).

To get you started, this seems to scream for the KNIME Text Processing extension:

But I bet you already have that installed.

There are some example workflows on the hub around topic modelling:

… and there is a dedicated Level 4 specialisation (L4- Data Science - Text Processing):

Not 100% sure, but that course might be available as a self-paced online course. You can find out on the LearnUpon platform:

https://knime.learnupon.com/catalog
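
In case it helps to see what "cosine similarity between the two groups as a whole" boils down to, here is a minimal Python sketch outside KNIME. It is only an illustration under assumptions: each group is pooled into one bag of words with raw term counts (no stop-word removal, stemming, or TF-IDF weighting), and the function names and sample phrases are made up. It is not a description of how the KNIME nodes compute it.

```python
import math
import re
from collections import Counter


def group_vector(phrases):
    """Pool all phrases of one group into a single term-count vector."""
    counts = Counter()
    for phrase in phrases:
        # Very crude tokenisation (assumption): lowercase runs of letters/apostrophes.
        counts.update(re.findall(r"[a-z']+", phrase.lower()))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    # Only terms present in both groups contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty group has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical sample data standing in for the two Excel columns.
group_1 = ["the service was slow", "slow delivery and poor service"]
group_2 = ["fast delivery", "the service was friendly"]

similarity = cosine_similarity(group_vector(group_1), group_vector(group_2))
print(round(similarity, 3))
```

Seen this way, it doesn't matter much whether the phrases live in one Excel file or two, as long as each phrase carries a label saying which group it belongs to, so that each group can be pooled into a single vector before comparing.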


Thank you very much!!! I’ll start exploring the workflows and the specialisation right now; with those and your help I might figure it out. Thanks again for your support, I appreciate it a lot. I’ll let you know if I find myself in trouble again… I hope it won’t be the case, even though it probably will, ahah.
