Column A is a company name, column B is a country name, and column C is a text message published by the company from A in the country from B.
I want to have a tag cloud representing all text messages, but the colouring should be based on either company name or country, i.e. all top words from company A should be red, all top words from company B should be green, etc.
How can I convert the column information into a tag, which can then be used by the Color Manager node?
To create a tag cloud colored by company, use the GroupBy node to group your rows by company and concatenate the text values with a whitespace separator. Then use the Strings to Document node to create one document per company. Set the company column as the "Category" column; you can extract this information later on to color the rows by company.
Filter stop words, punctuation marks, numbers, and unimportant terms, and/or use a keyword extractor node (e.g. Keygraph) to keep the important terms for each document (company). Then use the BoW Creator node to create a bag of words and the TF node to compute term frequencies. Now use the Document Data Extractor to extract the company information (stored in the category of the document) and assign colors based on the company using the Color Manager node.
Finally, use the Tag Cloud node to visualize the terms of the bag of words.
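For readers who want to see the logic outside of KNIME, the steps above can be roughly sketched in plain Python. This is only an illustration of the idea (group, concatenate, filter, count, color by category), not what the nodes do internally. The sample rows, the tiny stop-word list, the color mapping and the helper names (group_text, term_frequencies, colored_terms) are all made up for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample rows: (company, country, message)
ROWS = [
    ("Acme", "DE", "Fast delivery and great service"),
    ("Acme", "US", "Great prices, fast delivery"),
    ("Globex", "US", "New phones and new tablets in the store"),
]

# Tiny illustrative stop-word list (an assumption, not KNIME's built-in list)
STOP_WORDS = {"and", "the", "in", "a", "of"}

# Illustrative category-to-color mapping, playing the role of the Color Manager
COMPANY_COLORS = {"Acme": "red", "Globex": "green"}


def group_text(rows, key_index=0):
    """Concatenate messages per company (key_index=0) or per country (key_index=1)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_index]].append(row[2])
    # One whitespace-joined "document" per category
    return {key: " ".join(texts) for key, texts in grouped.items()}


def term_frequencies(text):
    """Lower-case the text, keep alphabetic tokens only, drop stop words, count terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def colored_terms(rows, colors, key_index=0, top_n=3):
    """Return (term, count, category, color) tuples for the top terms of each category."""
    result = []
    for category, text in sorted(group_text(rows, key_index).items()):
        color = colors[category]
        for term, count in term_frequencies(text).most_common(top_n):
            result.append((term, count, category, color))
    return result


for term, count, category, color in colored_terms(ROWS, COMPANY_COLORS):
    print(f"{term:10} {count}  {category:8} {color}")
```

To color by country instead, pass key_index=1 together with a mapping from country to color.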
Before applying the Color Manager, make sure the Categories column is a string column (which it should be by default). Ranges cannot be applied to nominal values; instead, a color can be assigned to each possible value.
If there are too many possible values, the Color Manager cannot reasonably assign colors to all of them. To compute the domain of your data set and see how many possible values there are, you can use the Domain Calculator node.
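The same check can be pictured in a few lines of Python: collect the distinct nominal values (the "domain"), see how many there are, and only assign one color per value if the list of colors is long enough. This is just a sketch of the idea; the sample values, the color list and the assign_colors helper are assumptions for illustration and have nothing to do with how the KNIME nodes are implemented.

```python
# Hypothetical category values, as they might come out of the category column
categories = ["Acme", "Globex", "Acme", "Initech", "Globex"]

# Illustrative color list; its length acts as the limit on distinct values
COLOR_NAMES = ["red", "green", "blue", "orange"]


def assign_colors(values, color_names):
    """Map each distinct value to one color; fail if there are more values than colors."""
    # Treat every value as a nominal string and compute the sorted domain
    domain = sorted(set(str(v) for v in values))
    if len(domain) > len(color_names):
        raise ValueError(
            f"{len(domain)} distinct values but only {len(color_names)} colors"
        )
    return dict(zip(domain, color_names))


print(assign_colors(categories, COLOR_NAMES))
```

If the check fails, it usually makes more sense to reduce the number of categories (for example by keeping only the largest companies) than to add more colors.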