Create a table that counts TP, FP, TN, FN at different thresholds

I developed a workflow for a binary classification model, and I would like to create a table that provides confusion matrix calculations at different threshold values:

  1. Table 1 sorts the predicted probabilities in descending order and shows the ground truth next to each probability (this part is easy). Something like this:

| Row | Truth | Probability |
|---|---|---|
| 1 | 1 | 0.995 |
| 2 | 1 | 0.991 |
| 3 | 1 | 0.875 |
| 4 | 0 | 0.811 |
| 5 | 1 | 0.73 |
| 6 | 1 | 0.71 |
| 7 | 0 | 0.649 |
| 8 | 1 | 0.602 |
| 9 | 0 | 0.577 |
| 10 | 0 | 0.522 |

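  2. Based on Table 1, Table 2 then counts the number of TP, FP, TN, and FN at threshold values ranging from 1 to 0:

| Cutoff | TP | FP | TN | FN |
|---|---|---|---|---|
| 1.0 | 0 | 0 | 4 | 6 |
| 0.9 | 2 | 0 | 4 | 4 |
| 0.8 | 3 | 1 | 3 | 3 |
| 0.7 | 5 | 1 | 3 | 1 |
| 0.6 | 6 | 2 | 2 | 0 |
| 0.5 | 6 | 4 | 0 | 0 |
| 0.4 | 6 | 4 | 0 | 0 |
| 0.3 | 6 | 4 | 0 | 0 |
| 0.2 | 6 | 4 | 0 | 0 |
| 0.1 | 6 | 4 | 0 | 0 |
| 0 | 6 | 4 | 0 | 0 |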
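
As a cross-check on Table 2, here is a minimal Python sketch (not part of the KNIME workflow) that recomputes the counts from Table 1. It assumes a row is predicted positive when its probability is greater than or equal to the cutoff; the function name `confusion_counts` is made up for illustration.

```python
# Table 1 from the post: (truth, probability), sorted by probability descending.
rows = [
    (1, 0.995), (1, 0.991), (1, 0.875), (0, 0.811), (1, 0.73),
    (1, 0.71), (0, 0.649), (1, 0.602), (0, 0.577), (0, 0.522),
]


def confusion_counts(rows, cutoff):
    """Return (TP, FP, TN, FN), predicting positive when probability >= cutoff.

    The >= convention is an assumption; the post does not say how ties are handled.
    """
    tp = sum(1 for truth, p in rows if p >= cutoff and truth == 1)
    fp = sum(1 for truth, p in rows if p >= cutoff and truth == 0)
    tn = sum(1 for truth, p in rows if p < cutoff and truth == 0)
    fn = sum(1 for truth, p in rows if p < cutoff and truth == 1)
    return tp, fp, tn, fn


# Cutoffs 1.0, 0.9, ..., 0.0, built from integers to avoid floating-point drift.
cutoffs = [i / 10 for i in range(10, -1, -1)]
table2 = [(c, *confusion_counts(rows, c)) for c in cutoffs]

print("Cutoff TP FP TN FN")
for cutoff, tp, fp, tn, fn in table2:
    print(f"{cutoff:<6} {tp:>2} {fp:>2} {tn:>2} {fn:>2}")
```

Table 1 contains six positives and four negatives, so TP + FN is 6 and FP + TN is 4 at every cutoff. At a cutoff of 1.0 nothing is predicted positive, which gives TN = 4 and FN = 6.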

Any help would be greatly appreciated.

Hi,
You can, for example, use a Counting Loop Start node to loop 10 times over your table. Inside the loop, calculate the cutoff with the Math Formula (Variable) node (e.g. divide the currentIteration flow variable by 10), then calculate your metrics using Rule Engine and GroupBy. End the loop with a regular Loop End node. Hope this helps!
Alexander
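
For readers who want to see the logic of that loop outside KNIME, here is a rough Python sketch of the same steps. It is an illustration under assumptions, not the actual workflow: `label_row` stands in for the Rule Engine, a `Counter` stands in for GroupBy, and the list `results` plays the role of the Loop End output. The sketch uses 11 iterations rather than 10 so that both 0.0 and 1.0 are covered, assuming the iteration counter starts at 0.

```python
from collections import Counter

# Table 1 from the question: (truth, probability).
rows = [
    (1, 0.995), (1, 0.991), (1, 0.875), (0, 0.811), (1, 0.73),
    (1, 0.71), (0, 0.649), (1, 0.602), (0, 0.577), (0, 0.522),
]


def label_row(truth, probability, cutoff):
    """Rule Engine stand-in: label one row as TP, FP, TN, or FN.

    Assumes a row is predicted positive when probability >= cutoff.
    """
    if probability >= cutoff:
        return "TP" if truth == 1 else "FP"
    return "FN" if truth == 1 else "TN"


n_iterations = 11  # assumption: 0-based counter, so 11 steps cover 0.0 to 1.0
results = []       # Loop End stand-in: one collected row per iteration

for current_iteration in range(n_iterations):
    # Math Formula (Variable) stand-in: derive the cutoff from the counter.
    cutoff = current_iteration / 10
    # GroupBy stand-in: count how many rows received each label.
    counts = Counter(label_row(t, p, cutoff) for t, p in rows)
    results.append({
        "Cutoff": cutoff,
        "TP": counts["TP"],
        "FP": counts["FP"],
        "TN": counts["TN"],
        "FN": counts["FN"],
    })

for row in results:
    print(row)
```

A `Counter` returns 0 for labels that never occur. In a real workflow, a label with no rows at a given cutoff may simply be absent from the grouped output, so the zeros might need to be filled in afterwards.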


@saddas I have created something similar with the help of R in a Metanode:

The H2O Binomial Scorer node (available on the KNIME Community Hub) also has a “Gains lift” table that presents several thresholds to check.

Another idea would be to use the “Binary Classification Inspector” inside a loop and run through thresholds from 0.1 to 1.0.


It could be something like this:


Thank you all!
@mlauber71: I like these solutions. I also managed to develop a custom solution for this. It’s a longer workflow and not as elegant, but it gives a lot of flexibility.

