Question about Naive Bayes

Carlota_Vina · March 30, 2016, 6:52pm

Hello everybody,

I'm working with Naive Bayes. I have a column that has two values: >=50 and <50.

I want to compare Naive Bayes with partitioning and without partitioning.

I don't understand the Scorer's confusion matrix.

scorer1.png shows the Scorer column settings. How can I choose the two values >=50 and <50?

confusionmatrix.png is the confusion matrix.

I don't understand this confusion matrix. Why is the accuracy 0?

When I use the Partitioning node, I can choose col4.

scorerpartitioning.png shows the Scorer with partitioning.

The confusion matrix with partitioning has an accuracy different from 0.

Can somebody explain this to me?

Thanks in advance

Carlota Vina

ferry.abt · March 30, 2016, 8:18pm

Hello Carlota,

The confusion matrix shows you how many records of class X have been predicted as class Y.
For the result "confusionmatrixpartitioning.png", this means that 13647 records are actually <=50K and have been predicted correctly, while 1208 records are also actually <=50K but have been predicted as >50K.

However, "confusionmatrix.png" compares the class distribution (a probability column) with the actual class. A probability value is never equal to either class label, so every record counts as wrong and the accuracy is 0.
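
To make the difference concrete, here is a minimal sketch in plain Python (not KNIME itself) of what a scorer does when it compares two columns. The data is made up for illustration and is not taken from the screenshots; the `probability` column and the helper names are assumptions of this sketch.

```python
from collections import Counter

def confusion_matrix(actual, compared):
    """Count (actual, compared) pairs; the first element is the actual class."""
    return Counter(zip(actual, compared))

def accuracy(actual, compared):
    """Fraction of rows where the compared value equals the actual class."""
    matches = sum(1 for a, c in zip(actual, compared) if a == c)
    return matches / len(actual)

# Toy data, invented for this example.
actual      = ["<=50K", "<=50K", ">50K", "<=50K", ">50K"]
prediction  = ["<=50K", ">50K",  ">50K", "<=50K", "<=50K"]  # predicted class labels
probability = [0.91, 0.42, 0.18, 0.77, 0.63]                # hypothetical P(class = <=50K)

cm = confusion_matrix(actual, prediction)
print(cm[("<=50K", "<=50K")])        # records that are <=50K and predicted as <=50K
print(cm[("<=50K", ">50K")])         # records that are <=50K but predicted as >50K
print(accuracy(actual, prediction))  # share of correct predictions

# Scoring the class column against a probability column:
# a number never equals a class label, so no row matches.
print(accuracy(actual, probability))
```

In this sketch the first accuracy is meaningful because both columns contain class labels, while the second is 0 because a probability is compared with a label.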

The Naive Bayes Predictor outputs a prediction column (default name "Prediction(NameOfTheColumn)") and, optionally, the normalized class distribution with one column per class (default name "P(NameOfTheColumn=ClassName)").
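
As a rough illustration of these two kinds of output, the following sketch implements a tiny categorical Naive Bayes in plain Python. It is not the KNIME implementation: the add-one smoothing, the toy rows, and the class column name `income` are all assumptions made for this example. Only the output naming pattern follows the defaults described above.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Collect class counts and per-feature value counts for categorical data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict(model, row):
    """Return (predicted class, normalized class distribution) for one row."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(row):
            # Add-one (Laplace) smoothing: an assumption of this sketch.
            score *= (value_counts[(i, label)][value] + 1) / (n + len(values_seen[i]))
        scores[label] = score
    norm = sum(scores.values())
    distribution = {label: s / norm for label, s in scores.items()}
    return max(distribution, key=distribution.get), distribution

# Toy training data, invented for this example.
rows = [("degree", "office"), ("degree", "office"), ("no_degree", "manual"),
        ("no_degree", "office"), ("no_degree", "manual"), ("degree", "manual")]
labels = [">50K", ">50K", "<=50K", "<=50K", "<=50K", "<=50K"]

model = train(rows, labels)
predicted, distribution = predict(model, ("degree", "office"))

# Mimic the output naming pattern; "income" is a hypothetical class column name.
output = {"Prediction(income)": predicted}
for label, p in distribution.items():
    output[f"P(income={label})"] = p
print(output)
```

Only the "Prediction(...)" value is a class label that can be compared with the actual class in a scorer. The "P(...)" values are probabilities that sum to 1 across the classes.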

Have a look at the workflow 002007_NaiveBayes on the example-server to see how it is correctly set up.

Please let me know if this does not answer your question completely.

Best,
Ferry

Carlota_Vina · March 30, 2016, 8:47pm

Hello,

Thanks for the reply.

I don't know how to use Naive Bayes without partitioning. Could you explain it to me?

Thanks in advance

Carlota Vina