I am using three ML algorithms (random forest, SVM, and logistic regression with LASSO) for a binary classification problem, with AUC as the model assessment metric.
How can I:
- Plot the 3 ROC curves in a single graph?
- Statistically compare the AUCs of the 3 ROC curves?
This example might help you:
Also see the description of the node itself:
You can compare the ROC curves of several trained models by first joining the class probability columns from the different predictors into one table and then selecting several columns in the column filter panel.
I have used three models (random forest, logistic regression, and SVM with variable selection) for a binary classification task, with AUC as the measure of model performance. I used 10-fold cross-validation with each model, and a loop to tune some hyperparameters (variance in logistic regression with Laplace regularization, and cost and sigma in SVM). The workflow is attached below.
Example.knwf (146.1 KB)
I want to:
(1) Generate a single ROC graph with the 3 ROC curves (one per model, using the hyperparameter setting that gave the highest AUC for that model)
(2) Obtain a p-value for the comparison of the 3 AUCs
I have read through the workflow above, but it is not clear to me how I can generate these results after cross-validation and with looping for hyperparameter tuning. Any suggestions are appreciated.
Does anyone have any suggestions?
Hi @MarcB -
In your workflow, you are generating a ROC curve for each model and then attempting to join them. But if you look carefully at the outputs from the data ports on those nodes, you’ll see that only the overall AUC is available there.
You might want to take a look at this workflow that shows how you can plot multiple ROC results on a single curve:
Here, the probabilities for each model are generated, and those columns are joined into a single table prior to display in a ROC curve node.
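Outside KNIME, the same idea can be sketched in Python with scikit-learn and matplotlib. The models, dataset, and column names below are illustrative stand-ins, not part of the original workflow; the point is the pattern of collecting each model's positive-class probabilities into one table before plotting:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (the real workflow would use its own table)
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "LogReg": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True, random_state=0),
}

# One table with a probability column per model, mirroring the joined
# table the node description asks for
probs = pd.DataFrame(
    {name: m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
     for name, m in models.items()}
)

# All three ROC curves on a single plot, each labeled with its AUC
for name in probs:
    fpr, tpr, _ = roc_curve(y_te, probs[name])
    auc = roc_auc_score(y_te, probs[name])
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.3f})")
plt.plot([0, 1], [0, 1], "k--")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.savefig("roc_curves.png")
```

The key step is the joined `probs` table: once every model's probabilities sit next to the same true labels, drawing the curves together (or feeding the columns to a single ROC Curve node) is straightforward.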
You may also want to consider using the new Binary Classification Inspector node to display an interactive view of multiple model results - both ROC curves as well as other metrics, and a confusion matrix too.
As to your second question, I think you would have to calculate this manually - as far as I know, a p-value is not currently an available output for the individual AUCs.
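For the manual calculation, one option is a paired bootstrap on the shared test set, comparing two models' AUCs at a time (DeLong's test is the other common choice, but it is more involved to implement). Everything below is a sketch on synthetic data; `p1` and `p2` stand in for two models' probability columns:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 300

# Synthetic labels and two probability columns on the SAME test cases
# (paired comparison); p1 is constructed to be the stronger model
y = rng.integers(0, 2, n)
p1 = np.clip(y * 0.6 + rng.normal(0.2, 0.25, n), 0, 1)
p2 = np.clip(y * 0.4 + rng.normal(0.30, 0.30, n), 0, 1)

obs_diff = roc_auc_score(y, p1) - roc_auc_score(y, p2)

# Resample test cases with replacement, recomputing both AUCs on the
# same resampled indices so the comparison stays paired
diffs = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    if len(np.unique(y[idx])) < 2:   # AUC needs both classes present
        continue
    diffs.append(roc_auc_score(y[idx], p1[idx])
                 - roc_auc_score(y[idx], p2[idx]))
diffs = np.asarray(diffs)

# Two-sided bootstrap p-value: how often the resampled AUC difference
# falls on either side of zero
p_value = min(1.0, 2 * min((diffs <= 0).mean(), (diffs >= 0).mean()))
print(f"AUC difference = {obs_diff:.3f}, bootstrap p = {p_value:.4f}")
```

With three models, you would run this for each pair and, if needed, adjust the three p-values for multiple comparisons (e.g. Bonferroni).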
Thank you @ScottF, I will read these carefully.