"Multiclass-Learner": Edit: MULTILABEL-Learner

Hi!

I need help with the following question:

Given are roughly 1700 open answers from a questionnaire. These answers were manually classified, with a binary yes/no per category (see Example.xlsx); the categories/classes are independent and non-exclusive.

Example.xlsx (10.0 KB)

For any single category, the following workflow (an adaptation of one of the example workflows) produces good results:

Problem: in the Strings To Document node I can assign only one class to each document, which means I would have to create one process for every category. Is there anything like a “Multiclass-Learner”?

Any hint on how to run the process once to learn many classes is appreciated!

Thanks for all your help,
Olaf

PS: I have seen the "MultiClassClassifier", but I’m not sure if that is what I want (that is, I don’t understand it).
PPS: What I want is not a multiclass learner but a multiLABEL learner. Sorry for any confusion…
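
For reference, the "one process for every category" approach described above is commonly called binary relevance: one independent yes/no model per label, all trained in a single loop. Below is a minimal Python sketch of that idea, not a KNIME workflow. The function names, the example texts, and the toy keyword learner are all hypothetical stand-ins; any real binary learner could be passed in as `fit`.

```python
from typing import Callable, Dict, List, Sequence

Model = Callable[[str], int]  # maps a text to 0 or 1


def train_binary_relevance(
    texts: Sequence[str],
    label_rows: Sequence[Sequence[int]],
    label_names: Sequence[str],
    fit: Callable[[Sequence[str], Sequence[int]], Model],
) -> Dict[str, Model]:
    """Train one independent binary model per label column."""
    models = {}
    for col, name in enumerate(label_names):
        # Pull out the 0/1 targets for this label only.
        targets = [row[col] for row in label_rows]
        models[name] = fit(texts, targets)
    return models


def predict_labels(models: Dict[str, Model], text: str) -> List[str]:
    """Return every label whose binary model answers 1 for the text."""
    return [name for name, model in models.items() if model(text) == 1]


def fit_keyword_model(texts: Sequence[str], targets: Sequence[int]) -> Model:
    """Toy stand-in learner (an assumption of this sketch): it keeps the
    words that occur in positive examples but never in negative ones."""
    positive, negative = set(), set()
    for text, target in zip(texts, targets):
        (positive if target == 1 else negative).update(text.lower().split())
    keywords = positive - negative
    return lambda text: int(any(w in keywords for w in text.lower().split()))


# Hypothetical data: each answer may carry several labels at once.
texts = [
    "price too high",
    "staff was friendly",
    "friendly staff but price too high",
]
label_names = ["CatA", "CatB"]
label_rows = [[1, 0], [0, 1], [1, 1]]

models = train_binary_relevance(texts, label_rows, label_names, fit_keyword_model)
print(predict_labels(models, "high price"))
print(predict_labels(models, "friendly and high"))
```

Each label keeps all of its positive examples this way, at the cost of ignoring any correlation between labels.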

Hi,

How about using the “Many to One” node?
This way you have all the categories in one column and can assign them to the docs easily.

Best,
Armin

Hi Armin,

I tried it and I get the error message “Execute failed: Multiple columns match in row Row4”. If I understand your suggestion correctly, you want to make one polynominal value out of the binary classifications, is that right?

Cheers,
Olaf
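
To illustrate why collapsing binary columns into one column fails for multilabel rows, here is a small Python sketch of the general idea. It is not the implementation of the “Many to One” node; the function name, the column names, and the choice to return `None` when no column matches are assumptions made for this example.

```python
from typing import Dict, Optional, Sequence


def many_to_one(row: Dict[str, int], columns: Sequence[str]) -> Optional[str]:
    """Collapse several 0/1 columns into the name of the single column set to 1."""
    matches = [col for col in columns if row[col] == 1]
    if len(matches) > 1:
        # A single output cell cannot hold two category names.
        raise ValueError(f"Multiple columns match: {matches}")
    # Assumption of this sketch: no match yields None (a missing value).
    return matches[0] if matches else None


columns = ["CatA", "CatB", "CatC"]

single_label_row = {"CatA": 0, "CatB": 1, "CatC": 0}
print(many_to_one(single_label_row, columns))

multi_label_row = {"CatA": 1, "CatB": 1, "CatC": 0}
try:
    many_to_one(multi_label_row, columns)
except ValueError as err:
    print("failed:", err)
```

The collapse is only well defined when the categories are mutually exclusive, which is not the case for the data described here.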

Yes.
Are you assigning multiple classes to a single answer?
I got no error for the Excel file you provided, as it has a single class assigned to each answer.
If you have to assign multiple classes to each answer, I can find a way to duplicate the answers that have several classes, so that each instance of the answer carries only one class. Then you can use “Many to One”, followed by a “GroupBy” node that aggregates the answers with a “Set” or “List” method.

The output in that case (assigning multiple classes to an answer) will look like this:
answer | Category
x | [CatB]
y | [CatB,CatA]
z | [CatC]

Best,
Armin
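
The duplicate-then-group steps described above can be sketched in plain Python as follows. This is only an illustration of the data transformation, not the KNIME nodes themselves; the rows and column names are made up to match the small output table in the post.

```python
from collections import defaultdict

# Hypothetical input: one row per answer, one 0/1 column per category.
rows = [
    {"answer": "x", "CatA": 0, "CatB": 1, "CatC": 0},
    {"answer": "y", "CatA": 1, "CatB": 1, "CatC": 0},
    {"answer": "z", "CatA": 0, "CatB": 0, "CatC": 1},
]
label_columns = ["CatA", "CatB", "CatC"]

# Step 1: duplicate each answer once per assigned category,
# so every (answer, category) pair carries exactly one class.
pairs = [
    (row["answer"], col)
    for row in rows
    for col in label_columns
    if row[col] == 1
]

# Step 2: group by answer and collect the categories into a list.
grouped = defaultdict(list)
for answer, category in pairs:
    grouped[answer].append(category)

for answer, categories in grouped.items():
    print(answer, "|", categories)
```

The order of categories inside each list follows the column order here; for a set-style aggregation the order carries no meaning.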

Yes (or, to be more precise, as I just learned on Wikipedia: binary classes to multiple labels).

My problem with this solution is that I have just a few cases for, let’s say, Label_Z. When I group the labels into sets, I assume each newly built set containing Label_Z would have too few examples for the learner to learn from. But I guess I’ll give it a try!

Thanks for the help!
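
The concern about too few examples per set can be checked by counting. The Python sketch below uses invented label sets (not the real questionnaire data) and compares how often each label occurs on its own with how often each exact combination occurs.

```python
from collections import Counter

# Hypothetical label sets, one per answer.
label_sets = [
    {"Label_A"},
    {"Label_A"},
    {"Label_A", "Label_B"},
    {"Label_B"},
    {"Label_A", "Label_Z"},
    {"Label_B", "Label_Z"},
]

# Examples available when each exact combination is treated as one class.
combo_counts = Counter(frozenset(labels) for labels in label_sets)

# Examples available when each label is learned separately.
label_counts = Counter(label for labels in label_sets for label in labels)

print("per label:", dict(label_counts))
for combo, count in combo_counts.items():
    print(sorted(combo), count)
```

In this toy data Label_Z has two positive examples as a single label, but every combination that contains it occurs only once, which is the thinning effect described above.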

I’m not sure if what I’m suggesting is correct in your case, but how about using the “SMOTE” node to get an equal class distribution?

Best,
Armin
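
For readers unfamiliar with SMOTE: it creates synthetic minority-class examples by interpolating between a minority point and one of its nearest minority neighbours. The Python sketch below shows only that core idea on made-up 2-D points; it is not the KNIME node, and the function name and parameters are hypothetical. Note that it works on numeric feature vectors, so text answers would first have to be turned into vectors, and whether it suits answers that carry several labels is a separate question.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def smote_sketch(
    minority: Sequence[Point], n_new: int, k: int = 2, seed: int = 0
) -> List[Point]:
    """Create n_new synthetic points from the minority class (needs >= 2 points)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)]
new_points = smote_sketch(minority, n_new=4)
for point in new_points:
    print(point)
```

Because every synthetic point lies on a line segment between two real minority points, the new examples stay inside the region the minority class already occupies.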