Verification of workflow accuracy

apretnar · January 12, 2022, 1:40pm

I am writing a section on software comparison and I wish to describe KNIME accurately, so I am turning to the community for help. I am a complete novice in KNIME. I wish to build the simplest k-fold cross-validation workflow possible with the tool.

Would this be a correct sequence of components for 10-fold cross-validation with random forest? The aim is to observe confusion matrix and ROC curve of the evaluation results.

Also, is it perhaps possible to observe misclassifications from the confusion matrix in a visualization? I.e. select misclassifications and see them exposed in the scatter plot?

Any suggestions or comments much appreciated!

temesgen-dadi · January 14, 2022, 1:07pm

Hi @apretnar,

A warm welcome to the KNIME community forum.

Yes the sequence of node you used is correct. To observe misclassifications from the confusion matrix in a visualization you can use a component with two views. You need to use the Scorer (JavaScript) node though. Once you create your component with the two visualization nodes, i.e., Scorer (JavaScript) and Scatter Plot, you can select a cell from the confusion matrix and the corresponding entries should be highlighted on the scatter plot.

Please have a look at the example workflow attached.

Best,
Temesgen
RF_CrossValidation_Misclassification.knwf (29.4 KB)

apretnar · January 14, 2022, 1:19pm

Thank you! Very much appreciated!

temesgen-dadi · January 14, 2022, 1:44pm

Hi @apretnar ,

My pleasure If the answer helped you, could you please mark it as a solution?
Thanks

system · January 21, 2022, 1:45pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.