How to convert columns to tags?

Hi,

I have an Excel file with 3 columns.

Column A is a company name, column B is a country name, and column C is a text message published by the company from column A in the country from column B.

I want to have a tag cloud representing all text messages, but the colouring should be based on either company name or country, i.e. all top words from company A should be red, all top words from company B should be green, etc.

How can I convert the column information into a tag, which can then be used by the Color Manager node?

Any help is much appreciated.

Thanks

Hi siebenund60,

to create a tag cloud colored by company, use the GroupBy node to group your rows by company and concatenate the text values with a whitespace separator. Then use the Strings to Document node to create one document per company. Set the company column as the "Category" column; you can extract this information later on to color the rows by company.
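
To illustrate what the grouping step produces, here is a minimal sketch in plain Python (not KNIME code). The sample rows and the function name `concat_by_company` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (company, country, message)
rows = [
    ("Acme", "DE", "New product launch today"),
    ("Globex", "US", "Quarterly results published"),
    ("Acme", "FR", "Product update available"),
]

def concat_by_company(rows, key_index=0, text_index=2):
    """Group rows by the key column and join their texts with a single space."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_index]].append(row[text_index])
    # One "document" per key; the key plays the role of the document category
    return {key: " ".join(texts) for key, texts in grouped.items()}

documents = concat_by_company(rows)
```

Passing `key_index=1` would group by country instead of company.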

Filter stop words, punctuation marks, numbers, and unimportant terms, and/or use a keyword extractor node (e.g. Keygraph) to keep only the important terms for each document (company). Then use the BoW Creator node to create a bag of words and the TF node to compute term frequencies. Now use the Document Data Extractor node to extract the company information (stored in the category of the document) and assign colors based on the company using the Color Manager node.

Finally, use the Tag Cloud node to visualize the terms of the bag of words.

You can do the same analogously for countries.
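
The filtering, bag-of-words, and coloring steps above can be sketched roughly as follows in plain Python (again, not KNIME code). The stop word list, the sample documents, the color mapping, and the use of relative term frequency are all assumptions made for this illustration.

```python
from collections import Counter
import string

# Tiny illustrative stop word list (an assumption, not KNIME's list)
STOPWORDS = {"the", "a", "is", "and", "of", "in", "to"}

def term_frequencies(text):
    """Lowercase, strip punctuation, drop stop words and numbers,
    then return the relative frequency of each remaining term."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    terms = [t for t in cleaned.split() if t not in STOPWORDS and not t.isdigit()]
    counts = Counter(terms)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}

# Hypothetical documents, one per company, and a hypothetical color mapping
documents = {
    "Acme": "The launch of the product, the product is new!",
    "Globex": "Results in 2020 and results to come.",
}
colors = {"Acme": "red", "Globex": "green"}

# One entry per (term, document): term, frequency, company, color
tagged = [
    (term, tf, company, colors[company])
    for company, text in documents.items()
    for term, tf in term_frequencies(text).items()
]
```

Each entry in `tagged` carries what a tag cloud needs: the term, its size (frequency), and a color derived from the company.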

Cheers, Kilian

Hi Kilian,

thank you very much for your helpful answer.

I did as you described, and it works fine up to the Document Data Extractor node.

Within the Color Manager node I can select the "Category" column, but I cannot assign any colours.

The "Nominal" radio button is greyed out; the "Range" button is selected and uses only one colour, since the information in the category column is not a number range.

Any hints on how to enable the "Nominal" button?

Thanks, siebenund60

Hi,

before applying the Color Manager, make sure the Category column is a String column (which it should be by default). Ranges cannot be applied to nominal values; instead, a color can be assigned to each possible value.

If there are far too many possible values, the Color Manager cannot reasonably assign colors to that many values. To compute the domain of your data set and see how many possible values there are, you can use the Domain Calculator node.
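
As a rough illustration of the idea (plain Python, not how KNIME implements it), the sketch below collects the distinct values of a column and only assigns colors if there are not too many. The limit of 60, the palette, and the function names are assumptions chosen for this example.

```python
from itertools import cycle

# Hypothetical palette; colors repeat if there are more values than entries
PALETTE = ["red", "green", "blue", "orange", "purple"]

def nominal_domain(values, max_values=60):
    """Return the sorted distinct values, or None if there are more than max_values."""
    distinct = sorted(set(values))
    return distinct if len(distinct) <= max_values else None

def assign_colors(values, max_values=60):
    """Map each distinct value to a color, refusing overly large domains."""
    domain = nominal_domain(values, max_values)
    if domain is None:
        raise ValueError("too many distinct values for nominal coloring")
    return dict(zip(domain, cycle(PALETTE)))

color_map = assign_colors(["Globex", "Acme", "Globex"])
```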

Cheers, Kilian

Hi Kilian,

now it works like a charm.

Thank you for your help,

siebenund60