Is there a possibility to use the "Term co-occurrence counter" node for a tag cloud, so that, e.g., the tag cloud shows not only the term "weather" but also "good weather" and "bad weather", because those two other terms ("good" and "bad") are frequently used with "weather"?
Yes, you just need to extract both terms to strings (Term to String), then combine the two strings (Column Combiner), and finally convert the combined string back into a term (String to Term), on which you can apply the Tag Cloud. :)
After the second "String to Term" I used the "Column Filter" to continue with the "Document" column and the new columns with the "combined terms". After this I tried to apply the frequency nodes (TF, IDF). However, the only value I get is "0" (in all of the frequency nodes). The tag cloud is then completely strange: it doesn't use "normal words" but the words with their tags...
When I use the value counter after the second "String to Term" node, I get a table like this (the values of the RowID actually contain the tags as well):
RowID          count
good weather   48
bad weather    15
stormy day     13
sunny day      9
Can't I just get a tag cloud out of this? Or is there any other possibility?
This is exactly what I did, but after the "String to Term" node. With the IDF node I receive values (I have many documents with only a few words; with the TF node I only get the value 0). However, when I apply the Tag Cloud it does not use "good weather" or "stormy wind" but "good [ADJD(STTS)]", "weather [NN(STTS)]" and "stormy [ADJD(STTS)]", "wind [NN(STTS)]", with the tags...
Do you have a suggestion how I can get rid of this?
I solved the problem with the frequencies by using the "GroupBy" node to sum up the sentence occurrences created by the "Term co-occurrence counter" node.
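For readers who want to see what that summing step does, here is a minimal sketch in plain Python rather than KNIME. The row layout (term1, term2, count) and the sample values are assumptions for illustration only; they are not the actual output columns of the "Term co-occurrence counter" node.

```python
from collections import defaultdict

# Hypothetical rows: (term1, term2, sentence co-occurrence count in one document).
rows = [
    ("good", "weather", 3),
    ("bad", "weather", 1),
    ("good", "weather", 2),
]

def sum_by_pair(rows):
    """Sum the counts of all rows that share the same (term1, term2) pair."""
    totals = defaultdict(int)
    for term1, term2, count in rows:
        totals[(term1, term2)] += count
    return dict(totals)

totals = sum_by_pair(rows)
print(totals)
```

The summed value per pair can then serve as the weight that determines the size of the pair in the tag cloud.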
However, the tag cloud still contains the punctuation. Do you know how to get rid of it?
Yes, in the Column Combiner, just delete the quote character and maybe the delimiter as well. Then use the option "Replace Delimiter by". Afterwards there is no more punctuation.
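To illustrate the effect of those settings, here is a small Python sketch that only mimics the idea of a delimiter and a quote character. The function `combine` and its parameters are hypothetical and are not the Column Combiner's implementation.

```python
def combine(term1, term2, delimiter=" ", quote=""):
    """Join two term strings; `quote` wraps each value, `delimiter` separates them."""
    return delimiter.join(f"{quote}{t}{quote}" for t in (term1, term2))

# With a quote character and a comma delimiter, punctuation ends up in the label.
quoted = combine("good", "weather", delimiter=",", quote='"')

# With an empty quote character and a space delimiter, the label is clean.
plain = combine("good", "weather")

print(quoted)
print(plain)
```

With the empty quote character and a space as delimiter, the combined string reads like a normal two-word phrase.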
Oh yes! I had not noticed this option. Thanks for the hint!
Do you know if it is also possible to use a sentiment analysis with this "Term co-occurrence counter" node? E.g., so that within one pair of words the positive and negative words can be distinguished?
I did not fully understand your question about the term co-occurrence counter and sentiment scores. You can of course assign sentiment labels to the terms extracted by the co-occurrence counter. If you have a dictionary with terms and corresponding sentiment labels, you can simply join the sentiments to the term co-occurrence table by joining on the terms.
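As a rough sketch of that join in plain Python (not KNIME): the sentiment dictionary below is hypothetical, the pairs are split into two term columns for illustration, and terms missing from the dictionary are assumed to be neutral.

```python
# Hypothetical sentiment dictionary: term -> label.
sentiment = {"good": "positive", "bad": "negative"}

# Co-occurrence rows as (term1, term2, count).
pairs = [("good", "weather", 48), ("bad", "weather", 15), ("stormy", "day", 13)]

def join_sentiment(pairs, sentiment, default="neutral"):
    """Look up a label for each term of a pair; unknown terms get `default`."""
    return [
        (t1, t2, count, sentiment.get(t1, default), sentiment.get(t2, default))
        for t1, t2, count in pairs
    ]

joined = join_sentiment(pairs, sentiment)
for row in joined:
    print(row)
```

The lookup is done once per term column, which corresponds to joining the dictionary to the table twice, once on each term.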
What I want is a tag cloud that colors positive words in green and negative words in red (and keeps neutral ones grey). If I now combine the terms, the previous sentiment tags are erased. I tried to join the previous table with the sentiment tags, but due to the "Column Combiner" the two tables don't match anymore...
I assume you have a dictionary with positive and negative words. Use this dictionary for tagging with the Dictionary Tagger. You need two tagger nodes, one with the positive list, assigning positive labels, and one with the negative list, assigning negative labels. Do filtering and other preprocessing. Create a bag of words. Extract the tags as strings with the Tags to String node. Compute the TF frequency with the TF node. Assign colors based on the tags with the Color Manager. Then use the Tag Cloud node.
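The color assignment at the end of that chain boils down to a lookup from tag to color. A minimal Python sketch, assuming the tag strings are "positive" and "negative" (hypothetical names) and everything else should stay grey:

```python
# Hypothetical tag names; the colors follow the green/red/grey scheme asked for above.
COLORS = {"positive": "green", "negative": "red"}

def color_for(tag):
    """Return the color for a sentiment tag; untagged or unknown tags stay grey."""
    return COLORS.get(tag, "grey")

terms = [("good", "positive"), ("bad", "negative"), ("weather", None)]
colored = [(term, color_for(tag)) for term, tag in terms]
print(colored)
```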
What you describe is the sentiment analysis I already used. However, I want to combine this sentiment with the "Term co-occurrence counter". Is this possible?
Maybe you have an idea. Find my example workflow attached.
Not sure if I got that right: I assume you want to combine the sentiments of single terms into a combined sentiment for two terms. You could extract the tags (Tags to String) and then specify a rule by which the sentiments are combined (Rule Engine). To group equal pairs of terms you can create a unique ID that is independent of the terms' positions, i.e. the pair A - B is equal to the pair B - A.
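Both ideas can be sketched in a few lines of Python. The combination rule shown here (negative wins over positive, positive wins over neutral) is only an assumed example, since the thread does not fix a rule, and the function names are hypothetical.

```python
def pair_key(term1, term2):
    """Return the same ID for (A, B) and (B, A) by sorting the two terms."""
    return " - ".join(sorted((term1, term2)))

def combine_sentiment(tag1, tag2):
    """Assumed rule: negative beats positive, positive beats neutral."""
    tags = {tag1, tag2}
    if "negative" in tags:
        return "negative"
    if "positive" in tags:
        return "positive"
    return "neutral"

print(pair_key("weather", "good"))
print(pair_key("good", "weather"))
print(combine_sentiment("positive", "neutral"))
```

Grouping on the sorted ID merges rows such as "good - weather" and "weather - good" before the counts are summed, and the combined tag can then drive the coloring.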