Threshold for classification scorer

dvoigtgodoy · September 11, 2019, 1:24pm

Hi,

After training a classification model, I can use a scorer to get the confusion matrix and the associated metrics. It assumes a threshold of 0.5 for classifying a row as positive.
What if I want to experiment with different thresholds, for instance to tweak the predictions so that they minimize false positives or false negatives?
Since the ROC curve is built from the confusion matrices at every threshold, I'd assume this is possible, but I couldn't figure out how to easily output the confusion matrix for a given threshold.

Has anyone else tried this already?

Best,
Daniel

qqilihq · September 11, 2019, 2:04pm

I would apply a Rule Engine node with a rule such as:

$probability$ > 0.25 => "positive"
TRUE => "negative"

Here $probability$ stands for the column holding the predicted probability of the positive class. Adjust the threshold (0.25 here) as needed.

– Philipp
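
For anyone who wants to check the numbers outside KNIME, the same thresholding idea can be sketched in plain Python. This is only an illustration, not a KNIME API: the function name `confusion_at_threshold` and the sample data are made up, and it uses the same strict `>` comparison as the rule above.

```python
def confusion_at_threshold(probabilities, actual, threshold=0.5):
    """Return (tp, fp, fn, tn), treating probability > threshold as positive.

    Hypothetical helper for illustration; not part of KNIME.
    """
    tp = fp = fn = tn = 0
    for p, is_positive in zip(probabilities, actual):
        predicted_positive = p > threshold  # strict comparison, as in the rule
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif is_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Made-up example data: positive-class probabilities and true labels.
probs = [0.10, 0.30, 0.45, 0.60, 0.80, 0.20]
truth = [False, True, False, True, True, False]

# Compare the confusion counts at two thresholds.
for t in (0.25, 0.5):
    print(t, confusion_at_threshold(probs, truth, t))
```

With this toy data, lowering the threshold from 0.5 to 0.25 removes the one false negative at the price of one false positive, which is the trade-off the original question is about.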

HansS · September 11, 2019, 5:31pm

Hi @dvoigtgodoy

Have you read the KNIME blog post "From Modeling to Scoring: Finding an Optimal Classification Threshold based on Cost and Profit"? It may be a useful approach.
gr. Hans
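
The general idea of picking a threshold by cost can be sketched as follows. This is not taken from the blog post: the cost values, the candidate grid, the sample data and the function names `total_cost` and `best_threshold` are all assumptions chosen for illustration.

```python
def total_cost(probabilities, actual, threshold, fp_cost, fn_cost):
    """Sum the misclassification costs at one threshold (hypothetical helper)."""
    cost = 0.0
    for p, is_positive in zip(probabilities, actual):
        predicted_positive = p > threshold
        if predicted_positive and not is_positive:
            cost += fp_cost  # false positive
        elif not predicted_positive and is_positive:
            cost += fn_cost  # false negative
    return cost


def best_threshold(probabilities, actual, fp_cost, fn_cost, candidates):
    """Return the first candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(probabilities, actual, t, fp_cost, fn_cost),
    )


# Made-up example data: positive-class probabilities and true labels.
probs = [0.10, 0.30, 0.45, 0.60, 0.80, 0.20]
truth = [False, True, False, True, True, False]

# Candidate thresholds 0.05, 0.10, ..., 0.95 (an arbitrary grid).
candidates = [i / 20 for i in range(1, 20)]

# Assumed costs: a false negative is five times as costly as a false positive.
print(best_threshold(probs, truth, fp_cost=1.0, fn_cost=5.0, candidates=candidates))
```

When false negatives are priced higher than false positives, as assumed here, the selected threshold tends to drop below 0.5. On real data the threshold should be chosen on a validation set rather than on the data used for the final evaluation.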
