Classifying document terms instead of individual words

Hi all,

I urgently need some help, please.

I have a problem classifying terms.

A term such as "online learning" should be treated as a whole, not split into "online" and "learning".

The classification result for the test data assigns the same single class to all the documents, which is wrong.

These terms make up the documents.

I tried two different approaches, but neither gives accurate results.

I want the vector model to show the terms and their occurrences in each document, and the classifier to predict the correct class.

I have attached the dataset and the workflow.

Help me, please.

Any help, please?

Hi singing bird,

Please take a look at the following example for document classification:


Hope this is helpful,




There is no dedicated node to detect and tag multi-word terms, e.g. "text mining" or "online learning". However, you could use the NGram node to create 2-grams and count their frequencies. Then you can filter the frequent 2-grams and use them as input for the Dictionary Tagger or Wildcard Tagger node. These nodes search the documents for the terms in the input dictionary, tag them, and group them if they are multi-word terms.

Cheers, Kilian
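
For readers who want to see the idea outside of a workflow, here is a minimal Python sketch of the same steps: count 2-grams, keep the frequent ones as a dictionary, then merge matching word pairs into single terms before building the term counts. It is not KNIME code and does not reproduce what the nodes do internally. The function names, the whitespace tokenization, the sample documents and the min_count threshold are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for this sketch.
    return text.lower().split()


def count_bigrams(docs):
    # Count every adjacent word pair (2-gram) over all documents.
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(zip(tokens, tokens[1:]))
    return counts


def frequent_bigrams(docs, min_count=2):
    # Keep only 2-grams seen at least min_count times (hypothetical threshold).
    return {bg for bg, n in count_bigrams(docs).items() if n >= min_count}


def merge_terms(text, dictionary):
    # Scan left to right; a word pair found in the dictionary becomes one term.
    tokens = tokenize(text)
    merged = []
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and pair in dictionary:
            merged.append(" ".join(pair))
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def term_vector(text, dictionary):
    # Term -> occurrence count for one document, with multi-word terms merged.
    return Counter(merge_terms(text, dictionary))


# Made-up sample documents, for illustration only.
docs = [
    "online learning improves text mining",
    "text mining supports online learning",
    "learning online is fun",
]
dictionary = frequent_bigrams(docs, min_count=2)
vectors = [term_vector(d, dictionary) for d in docs]
```

With these sample documents, "online learning" and "text mining" each occur twice, so they end up in the dictionary and are counted as single terms. The third document contains the same words in a different order, so its words stay separate. The resulting per-document counts are the kind of term vector a classifier would then be trained on.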

