Classification model with multiple classes

Hi guys, I have a question regarding my workflow. I have loaded an Excel file containing emails into KNIME. Each email has an assigned category, and there are 30 categories in total. Based on this data, I have to build a classification (predictive) model that assigns each email to the correct category. The number of emails per category varies a lot: for example, one category has 200 emails and another has 50. My workflow is attached. I would like to compare different algorithms with each other, such as decision tree, SVM and k nearest neighbour. Looking at the results, I noticed that for all three algorithms the Accuracy and Cohen’s Kappa values are not shown. What could be the reason? Do you see anything wrong? Is there a way to connect the results of all algorithms to one node and compare them?

Thanks in advance,
Canan

Hi Canan!

There is a ROC Curve node. Check it out.

From node description:

You may compare the ROC curves of several trained models by first joining the class probability columns from the different predictors into one table and then selecting several columns in the column filter panel.

It should help.

Accuracy and Cohen’s Kappa are computed over the whole data set, not per class.
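
To make the "overall" part concrete, here is a rough Python sketch, outside KNIME, of how both numbers come out of the true and predicted categories. The function name `accuracy_and_kappa` and the toy labels are made up for illustration; this is not how KNIME implements it internally.

```python
from collections import Counter

def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) over all rows, not per class."""
    n = len(y_true)
    # Observed agreement: fraction of rows where prediction equals truth.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Agreement expected by chance, from the class frequencies on both sides.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Kappa rescales accuracy so that chance-level agreement maps to 0.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 0.0
    return observed, kappa

# Hypothetical toy data: three categories, six emails.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "c", "c"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(acc, kappa)
```

With unbalanced categories, Kappa is usually the more informative of the two, because a model that mostly predicts the biggest category can still reach a decent accuracy.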

BR,
Ivan

Hey Ipazin,

Thank you for your prompt reply. Which node should I connect the ROC Curve node to? Is it possible to compare the results of all three algorithms?
And one more question: how can I get the Accuracy and Cohen’s Kappa values?

Thanks and kind regards,
Canan

Hi!

I see now that ROC curve is only for binary classification. You can’t use it here. My bad :open_mouth:

The Accuracy and Cohen’s Kappa for each model are in the last row of its results. Compare them :slight_smile:
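
If it helps, the comparison itself boils down to one row per model. A minimal Python sketch, outside KNIME, assuming you have the true categories and each model's predictions side by side (the names `compare_models`, `tree`, `svm`, `knn` and the toy data are hypothetical):

```python
def compare_models(y_true, predictions):
    """Build one summary row per model, best accuracy first."""
    rows = []
    for name, y_pred in predictions.items():
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        rows.append({
            "model": name,
            "accuracy": correct / len(y_true),
            "errors": len(y_true) - correct,
        })
    # Sort so the best-performing model ends up in the first row.
    return sorted(rows, key=lambda r: r["accuracy"], reverse=True)

# Hypothetical toy data: four emails, three models.
y_true = ["a", "b", "c", "a"]
predictions = {
    "tree": ["a", "b", "c", "a"],
    "svm": ["a", "b", "a", "b"],
    "knn": ["a", "b", "c", "b"],
}
summary = compare_models(y_true, predictions)
for row in summary:
    print(row)
```

All models have to be scored on the same test rows, otherwise the numbers are not comparable.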

Br,
Ivan

OK, thank you very much. Does this mean there is no way to compare these models in one node?

Sorry, I did not see the last row :sweat_smile:

I have another question: do you know why the results of the decision tree model and the SVM are so different?

The SVM model misclassified 35 emails and the decision tree model only 3. Is there something wrong with the settings?

Why is there no learner or predictor node for k nearest neighbour?

Thanks and regards,
Canan

Hi @anon33357744

There are two kNN nodes: K Nearest Neighbor and K Nearest Neighbor (Distance Function).

best,
Gabriel

Hi Gabriel,

Thanks, but the K Nearest Neighbor node is for large data sets. I need the K Nearest Neighbor (Distance Function) node, but I get a message that it failed. Why?
Thanks and best,
Canan

Hi @anon33357744
You need to connect a Distance Function node to the grey port of the KNN node. You can find these nodes under _Analytics -> Distance Calculation -> Distance Functions_. They allow you to specify and re-use distance functions in several places in your workflow.
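
For intuition, this is roughly what a kNN with a pluggable distance does: the distance is supplied separately and the classifier only calls it, which is why the node needs something connected to that port. A small Python sketch, not KNIME's implementation; `knn_predict`, `euclidean`, `manhattan` and the toy points are made up for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_x, train_y, query, k, distance):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Rank training rows by the supplied distance and keep the k closest.
    neighbours = sorted(
        zip(train_x, train_y), key=lambda pair: distance(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two numeric features per email, two categories.
train_x = [(0, 0), (0, 1), (5, 5), (6, 5)]
train_y = ["spam", "spam", "invoice", "invoice"]

near_origin = knn_predict(train_x, train_y, (1, 0), k=3, distance=euclidean)
near_corner = knn_predict(train_x, train_y, (5, 6), k=3, distance=manhattan)
print(near_origin, near_corner)
```

Swapping `euclidean` for `manhattan` changes nothing in `knn_predict` itself, which is the same idea as re-using one distance function in several places.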

best,
Gabriel

Hi @gab1one,

I tried it out, but I get an error when executing this node. Would it be possible to send you my whole workflow by email so that you can take a look at it? That would be a great help, because I am getting desperate.

Thanks and kind regards,
Canan

Hi @anon33357744,
Yes, that is possible. You can either upload it to the forum or send it to me at gabriel.einsdorf@gmail.com.
best,
Gabriel

Hey @gab1one,

Thank you very much, I will send it to you :slight_smile:
Kind regards,
Canan