Evaluating Classification Model Performance

Train a classification model using the Decision Tree algorithm. Evaluate the accuracy of the class predictions using scoring metrics, an ROC Curve, and a Lift Chart.
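The steps above are implemented with KNIME nodes, but for anyone who wants to reproduce the same evaluation in code, here is a minimal Python sketch assuming scikit-learn, with the bundled breast-cancer dataset standing in for the workflow's data:

```python
# Minimal sketch of the same evaluation in Python (scikit-learn assumed);
# the breast-cancer dataset is only a stand-in for the workflow's data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Train the classifier (analogous to the Decision Tree Learner node)
model = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X_train, y_train)

# Predict and score (analogous to the Decision Tree Predictor + Scorer nodes)
pred = model.predict(X_test)
prob = model.predict_proba(X_test)[:, 1]        # positive-class probability

print("Accuracy:", accuracy_score(y_test, pred))
print("ROC AUC :", roc_auc_score(y_test, prob))
fpr, tpr, _ = roc_curve(y_test, prob)           # points for plotting the ROC curve
```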


This is a companion discussion topic for the original entry at https://kni.me/w/wWrebA_HNv4hHDDG

Great job @Maarit!
Finally we have a simple brand new workflow showing how to score classification models!
I love the fact that you can now open an interactive view with the new Scorer (JavaScript) node!

Thanks @paolotamag! Yes - those interactive views are nice on their own like this, but even more powerful when combined with other views in components. I am currently working on more of these examples, and they will all be shared here via the Workflow Hub!

Related to this topic of evaluating model performance:

Feature request: Scorer (JavaScript)

  • When testing classification models (different algorithms, or the same algorithm with different features, etc.), I want to see which performs best on the data I have.
  • I would like to be able to connect multiple Predictors to the Scorer, with the Scorer retaining the input table name or node name, so that the Lift and ROC curves can be plotted on one graph and performance can be compared across models (see the sketch after this list for the kind of overlay meant).
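A rough Python sketch of the kind of overlay plot being requested is below; the dataset, the second model, and matplotlib are illustrative assumptions, not part of the request or of any existing KNIME node:

```python
# Hypothetical illustration of the requested comparison view: ROC curves from
# several models overlaid on one plot, labelled by model name.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=5000),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, prob)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc_score(y_test, prob):.3f})")

plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```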

You can currently do this with the new Binary Classification Inspector node - check out some of the example workflows featuring it on the Hub!


Hello,

I have tried using the Scorer, but it does not allow connecting to more than one predictor, so I am unable to compare models. Please advise. Thanks.

[image]

I think there might be some language confusion going on here, as I’m not really sure what you mean by the question.

There are two scorer nodes you can use for classification problems: the Scorer and the Scorer (JavaScript). They both operate in the same way, by comparing the true label to the label predicted by the model. Both labels are assumed to be in the same dataset, as generated by the upstream predictor node.
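As a rough code-level analogy (Python for illustration only, not how the nodes are implemented), the comparison boils down to something like this, assuming a scored table with a true-label column and a prediction column:

```python
# Toy example: a scored table with the true label and the model's prediction
# in the same dataset, as a predictor node would produce (pandas/sklearn assumed).
import pandas as pd
from sklearn.metrics import confusion_matrix, accuracy_score

scored = pd.DataFrame({
    "label":      ["yes", "no", "yes", "no", "yes", "no"],
    "prediction": ["yes", "no", "no",  "no", "yes", "yes"],
})

# The scorer-style comparison: confusion matrix plus overall accuracy
print(confusion_matrix(scored["label"], scored["prediction"], labels=["yes", "no"]))
print("Accuracy:", accuracy_score(scored["label"], scored["prediction"]))
```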

Is your question about comparing the results of different models, each with its own predictor? In that case you can get the scoring metrics for each model and combine them into a single table for comparison.

But neither case requires a scorer node with multiple inputs.
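For what it's worth, here is a small Python sketch of that comparison approach, with illustrative models and dataset assumed: score each model separately, then stack the per-model metric rows into one table.

```python
# Illustrative comparison (pandas/scikit-learn assumed): score each model
# separately, then combine the metrics into a single table.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

rows = []
for name, model in {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    rows.append({
        "Model": name,
        "Accuracy": accuracy_score(y_test, pred),
        "Precision": precision_score(y_test, pred),
        "Recall": recall_score(y_test, pred),
        "F1": f1_score(y_test, pred),
    })

print(pd.DataFrame(rows))   # one row of metrics per model
```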

Thanks very much.

I am referring to the model comparison report that has been generated above.

That report is from a software package other than KNIME; I'm not sure which one exactly. I believe @DemandEngineer was using it as an example of the kind of thing he'd like to produce.

Thanks.

I presumed that the report was generated using KNIME, hence the query.

Please ignore my question, and thanks again for the very prompt responses.

Raji
