1-Nearest Neighbors (KNIME) vs IB1 (WEKA)

Hi,

I would like to test and compare the nearest neighbor implementations of WEKA and KNIME. To do this I have used the following datasets:

  • abalone.data.test
  • abalone.data.train

(you can find them in the datasets section: http://www.knime.org/files/datasets.zip)

To do this I have built two workflows, one with the Nearest Neighbor (KNIME) node and the other with the IB1 (WEKA) node. The configuration for each is as follows:

  1. Nearest Neighbor (KNIME)

Number of neighbors to consider: 1 (K1)

Weight neighbors by distance: disabled

Results: Accuracy: 48.94%

  2. IB1 (WEKA)

This node only considers 1 neighbor in the classification process (K1)

Results: Accuracy: 48.18%

I cannot see the cause of that difference. What could be the reason for these different results, given that 1-NN (K=1) is such a straightforward algorithm?

Thanks in advance,

Oscar


Hi Oscar

 

Okay, this is just a random guess.

One problem in k-NN is how to decide the class if there is more than one nearest neighbor, e.g. when two training patterns have exactly the same distance to the test pattern.

Typical behavior is to take the majority class out of "all" tied nearest neighbors.

But some implementations take the first nearest neighbor they find, others the last.

Yes, just verified it.

The WEKA predictor always takes the first nearest neighbor it finds. The KNIME predictor takes the majority class.

 

You can see this with the following data set:

1 0 a
1 0 b
1 0 b
1 0 c

 

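All four training patterns are identical, so they are all at the same distance from any test pattern. The test pattern should be classified as b (the majority class among the tied neighbors); WEKA classifies it as a, and KNIME as b.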
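To make the two tie-breaking rules concrete, here is a minimal Python sketch. It is not the actual WEKA or KNIME code; the function names (predict_first_found, predict_majority, tied_nearest_labels) and the test pattern (1, 0) are my own assumptions for illustration, and it uses plain squared Euclidean distance.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def tied_nearest_labels(train, query):
    """Labels of all training patterns at the minimum distance, in training order."""
    distances = [squared_distance(features, query) for features, _ in train]
    best = min(distances)
    # Exact equality is fine for this integer toy data; real-valued data
    # may need a tolerance when comparing distances.
    return [label for (_, label), d in zip(train, distances) if d == best]


def predict_first_found(train, query):
    """Hypothetical 'first found' rule: the first tied neighbor decides."""
    return tied_nearest_labels(train, query)[0]


def predict_majority(train, query):
    """Hypothetical 'majority' rule: the most frequent class among tied neighbors."""
    return Counter(tied_nearest_labels(train, query)).most_common(1)[0][0]


# The four training patterns from the data set above.
train = [((1, 0), "a"), ((1, 0), "b"), ((1, 0), "b"), ((1, 0), "c")]
query = (1, 0)  # assumed test pattern; every training pattern is at distance 0

print(predict_first_found(train, query))  # a
print(predict_majority(train, query))     # b
```

Whenever a test pattern has tied nearest neighbors with different classes, the two rules can disagree, which would be enough to explain a small gap in accuracy between the two nodes.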

Perfect!

Many thanks for your explanation!

Oscar