I have been working with the Text Processing Extension for two months because I have been asked to write my bachelor thesis on this topic.
My task is to analyze German interviews with different people about one topic. I am free to choose the type of analysis. I am also looking at sentiment analysis to compare the two groups, and I would like to show the main message (as keywords) and the connections between those words. By running the same analysis once for each of the two groups, I want to compare the results and show their differences and similarities.
Now my questions:
- Is this kind of keyword and connection analysis possible, or is there another analysis in KNIME that would be better suited to interviews?
- The interviews contain 4 to 5 different subtopics. I can choose to analyze one of them or the whole interviews. Analyzing just one subtopic seems easier to me, but sometimes that leaves only about 300 words per interview. Is that enough for KNIME?
Thank you in advance!
That’s an interesting topic!
I guess you can approach your analysis in different ways, depending on your research question(s).
KNIME Analytics Platform provides two nodes for extracting keywords from a single document: the Chi-Square Keyword Extractor node and the Keygraph Keyword Extractor node. You can also compute TF-IDF scores. The results of TF-IDF are usually comparable to those of the Chi-Square Keyword Extractor and Keygraph Keyword Extractor nodes.
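To make the TF-IDF idea a bit more concrete outside of KNIME, here is a minimal Python sketch. It is not how the KNIME nodes are implemented; the function names (`tokenize`, `tfidf_keywords`), the simple regex tokenizer, and the two example sentences are all assumptions made up for illustration. It scores each term by its relative frequency in a document times the log of the inverse document frequency, so words that occur in every document get a score of zero.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters, including German umlauts and ß.
    return re.findall(r"[a-zäöüß]+", text.lower())


def tfidf_keywords(documents, top_n=3):
    """Return the top_n terms of each document, ranked by TF-IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([term for term, _ in ranked[:top_n]])
    return results


# Hypothetical mini "interviews", only for illustration.
interviews = [
    "Die Arbeit im Team gefällt mir, das Team ist offen.",
    "Die Arbeit ist stressig, der Stress nimmt zu.",
]
keywords = tfidf_keywords(interviews, top_n=2)
```

With texts this short, many terms tie on score, which is one reason why the number of useful keywords depends on document length. In a real analysis you would probably also remove stop words and lemmatize first.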
Once you have extracted the keywords, you can compare them between documents, for example with a similarity search.
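As a rough illustration of what such a comparison could look like, here is a small Python sketch that compares two keyword profiles with cosine similarity and lists the shared and group-specific keywords. This is an assumption about one reasonable way to do it, not a description of what the KNIME similarity search does internally. The function name `cosine_similarity` and the keyword weights for the two groups are invented for the example.

```python
import math


def cosine_similarity(weights_a, weights_b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    shared = set(weights_a) & set(weights_b)
    dot = sum(weights_a[t] * weights_b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty profile has no defined direction.
    return dot / (norm_a * norm_b)


# Made-up keyword weights for two interview groups (illustration only).
group_a = {"team": 0.6, "arbeit": 0.3, "stress": 0.1}
group_b = {"stress": 0.7, "arbeit": 0.2, "zeit": 0.1}

similarity = cosine_similarity(group_a, group_b)
shared_keywords = sorted(set(group_a) & set(group_b))
only_a = sorted(set(group_a) - set(group_b))
only_b = sorted(set(group_b) - set(group_a))
```

The similarity value gives one number for how alike the two groups are, while the shared and group-specific keyword lists show where the similarities and differences come from.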
An example workflow that shows this is available in the 50_Applications folder on the EXAMPLES server. The path to the workflow is: knime://EXAMPLES/50_Applications/33_Emil_the_TeacherBot/Emil_the_TeacherBot
To answer your second question: KNIME does not impose any restriction here, so you can analyze a document of 300 words if you want to. However, the number of keywords you can sensibly extract depends on the length of the document and on the further steps required in your analysis.
Hope that helps! Good luck with your thesis!
Thank you for that answer, it gives me good input for further thinking!