Dear Knime community,
first of all it’s really amazing to have this very vivid and supportive community and I hope you can help me with my case.
For my Master Thesis, I want to research the differences of company values between family and non-family firms. Therefore, I would like to conduct a Text Mining task on the company’s annual reports (in PDF format) in which you usually find the company’s values and beliefs.
So as an outcome, I would like to generate a list of words/terms that represent the most stated values for each, a family and non-family firm.
I’ve already started working on a workflow which you can find attached. However, I do need your help for a couple of points to make it as good as possible:
-
I didn’t find a way how to separate the analysis for family firms vs. non-family firms. Should I feed in the annual reports via 2 separate PDF Parser or is there any way to split up the documents depending on family firm / non-family firm? Right now, I have all the documents in one column, so it’s not possible for me to separate between the organization forms.
-
Based upon my first question: Is there any tool to analyze/compare the outcomes of family firms/ non-family firms?
-
Is there any tool to additionally filter the terms in such a way, that there will be only company values generated such as diversity, team-orientation, respect, collaboration and so on?
-
Do you have any other analyses in mind how to draw some more findings/insights out of it? For example, would it make sense to do a sentiment analysis?
Sorry for so many questions at once but I’m new to Knime and not aware of the endless opportunities this program offers.
Your support would be highly appreciated!!
Cheers,
Christopher