Hello everybody,

I'm working with Naive Bayes. I have a column that has two values >=50 and <50.

I want to compare Naive Bayes with partitioning and without partitioning.

I don't understand the Scorer confusion matrix.

scorer1.png shows the Scorer column selection. How can I choose the two values >=50 and <50?

confusionmatrix.png is the confusion matrix.

I don't understand this confusion matrix. Why is the accuracy 0?

When I use the Partitioning node, I can choose col4.

scorerpartitioning.png shows the Scorer with partitioning.

The confusion matrix with partitioning has a nonzero accuracy.

Can somebody explain these points to me?

Thanks in advance,

Carlota Vina

Hello Carlota,

The confusion matrix shows you how many records of class X have been predicted as class Y.

This means for the result "confusionmatrixpartitioning.png" that 13647 records are <=50K and have been predicted correctly, while 1208 are actually <=50K as well but have been predicted as >50K.
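As a rough illustration of how such a matrix is tallied, here is a small sketch in plain Python. It is not KNIME code, and the toy labels are made up (they are not the data from the screenshots); the helper names are only for this example.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual class, predicted class) pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of records whose predicted class equals the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Made-up toy labels, for illustration only.
actual = ["<=50K", "<=50K", "<=50K", ">50K", ">50K"]
predicted = ["<=50K", "<=50K", ">50K", ">50K", "<=50K"]

matrix = confusion_matrix(actual, predicted)
for (a, p), count in sorted(matrix.items()):
    print(f"actual {a:>5}, predicted {p:>5}: {count}")
print("accuracy:", accuracy(actual, predicted))
```

The diagonal pairs (actual equals predicted) are the correct predictions; everything else is an error.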

However, "confusionmatrix.png" shows the class distribution column scored against the class column. A probability value is never equal to either class label, so every record counts as wrong and the accuracy is 0.

The Naive Bayes outputs a prediction (Default name "Prediction(NameOfTheColumn)") and optionally the normalized class distribution for each class (Default name "P(NameOfTheColumn=ClassName)").
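To make the difference between the two kinds of output column concrete, here is a minimal sketch in plain Python (again not KNIME code). The rows and their values are invented, and I am assuming col4 is the class column; only the column naming pattern follows the defaults described above.

```python
# Invented example rows. Assumption: "col4" is the actual class column.
rows = [
    {"col4": "<=50K", "Prediction(col4)": "<=50K", "P(col4=<=50K)": 0.91},
    {"col4": "<=50K", "Prediction(col4)": ">50K", "P(col4=<=50K)": 0.42},
    {"col4": ">50K", "Prediction(col4)": ">50K", "P(col4=<=50K)": 0.17},
    {"col4": ">50K", "Prediction(col4)": ">50K", "P(col4=<=50K)": 0.08},
]


def accuracy(rows, actual_col, scored_col):
    """Fraction of rows where the scored column equals the actual class."""
    hits = sum(1 for row in rows if row[actual_col] == row[scored_col])
    return hits / len(rows)


# Scoring the prediction column compares class labels with class labels.
acc_prediction = accuracy(rows, "col4", "Prediction(col4)")

# Scoring a probability column compares numbers with class labels,
# so no row can ever match.
acc_probability = accuracy(rows, "col4", "P(col4=<=50K)")

print("scored against prediction column: ", acc_prediction)
print("scored against probability column:", acc_probability)
```

In this toy data the prediction column gives an accuracy above 0, while the probability column always gives exactly 0, which matches what "confusionmatrix.png" shows.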

Have a look at the workflow 002007_NaiveBayes on the example-server to see how it is correctly set up.

Please let me know if this does not answer your question completely.

Best,

Ferry

Hello,

Thanks for the reply.

I don't know how to use Naive Bayes without partitioning. Could you explain it to me?

Thanks in advance,

Carlota Vina