K Nearest Neighbor Assistance

abarca321 · April 4, 2024, 8:25pm

Hello Knime Community,

I need assistance with troubleshooting my workflow. When trying to connect the Partitioning node to the K Nearest Neighbor node, I am presented with the error message “The Dialog cannot be opened for the following reason: No column in spec compatible to ‘NominalValue’.”

I am trying to do the following steps:

Partitioning Node: Using the Partitioning node to split the data into training and test sets. Set the partitioning to 70% Training and 20% Test for records 1-300.
k-Nearest Neighbor Node: Create a k-Nearest Neighbor node and connect it to the Training data output from the Partitioning node.
Scorer Node: Connect a Scorer node to the k-Nearest Neighbor node. This will evaluate the model’s accuracy using the Test data from the partition.
Table Node: Attach a Table node to the second output port of the Scorer node to view the accuracy results.
Run Model for k Values: Configure the k-Nearest Neighbor node to run the model for k values ranging from 3 to 6.
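For readers who want to see the same steps outside KNIME, here is a minimal plain-Python sketch, not the KNIME implementation. It assumes a simple train/test split (70% training, the remainder as test), Euclidean distance, and majority voting; the toy data and the function names `partition`, `knn_predict`, and `accuracy` are hypothetical.

```python
import math
import random
from collections import Counter


def partition(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, features, k):
    """Majority vote among the k training rows closest to `features`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, the label seen first among the nearest rows wins.
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, feats, k) == label for feats, label in test)
    return hits / len(test)


# Hypothetical toy data: 300 records in two well separated clusters.
# Each row is (numeric features, string class label).
rng = random.Random(1)
rows = [((rng.gauss(0, 1), rng.gauss(0, 1)), "no") for _ in range(150)]
rows += [((rng.gauss(10, 1), rng.gauss(10, 1)), "yes") for _ in range(150)]

train, test = partition(rows)

# Score the model once for each k from 3 to 6.
results = {k: accuracy(train, test, k) for k in range(3, 7)}
for k, acc in results.items():
    print(f"k={k}: accuracy={acc:.2f}")
```

Note that the class label in each row is a string while the features are numeric, which mirrors what the error message is asking for: a nominal column to predict.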

Would someone be kind enough to check that I have not missed a step? I have been working on this for the past week and have searched through every source I could find without finding a solution.

Thank you kindly,
Justin

ArjenEX · April 4, 2024, 9:34pm

Hi @abarca321

Pretty hard to help you without knowing what kind of data you are working with. The error is somewhat self-explanatory: missing the correct type of values.

I’d say have a look at the Community Hub; there are a lot of reference examples that you can use and compare to your situation.

rfeigel · April 4, 2024, 9:34pm

What data type is your attribute column?

hmfa · April 4, 2024, 9:40pm

Hi, @abarca321

Have a look at your data.
I think only numeric columns and the Euclidean distance are used in this implementation.

Br

abarca321 · April 5, 2024, 1:25pm

Thank you all for your prompt responses. The data comes from the attached table.

rfeigel · April 5, 2024, 2:21pm

Could you please upload some sample data? It’s a lot more helpful than a screenshot.

abarca321 · April 5, 2024, 3:30pm

Hi rfeigel,

Sure thing, please see attached sample data
BIT-445-RS-Failure-Rate.csv (7.1 KB)

rfeigel · April 5, 2024, 3:46pm

Change the target to a string. That should work.

abarca321 · April 5, 2024, 4:22pm

Hi rfeigel,

I am not sure I understand your response. I also have multiple “string” nodes. Which target are you referring to?

rfeigel · April 5, 2024, 11:15pm

Sorry. I wasn’t as clear as I could have been. By “target” I meant the column you’re trying to predict, i.e. “failure.” I cleaned up your data and changed the workflow some. This workflow produced 94% accuracy. I assumed the “model” column is a legitimate predictor. You can play with the k if you want.
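To illustrate what “change the target to a string” means, here is a small plain-Python sketch (not KNIME code). It assumes the target column “failure” holds 0/1 values; the other column names and all of the values are made up for illustration. The predictors are parsed as numbers, while the target is kept as a string so it acts as a class name rather than a quantity. In KNIME the equivalent step would presumably be a conversion node such as Number To String placed before the k-Nearest Neighbor node.

```python
import csv
import io

# Hypothetical sample: only the "failure" column name comes from the thread.
raw = """temperature,pressure,failure
71.2,30.1,0
88.9,42.7,1
69.5,29.8,0
"""


def load_rows(text, target="failure"):
    """Return (numeric features, string label) pairs from CSV text."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Keep the target as a string label ("0" or "1"), not a number.
        label = str(int(record.pop(target)))
        # Everything that remains is treated as a numeric predictor.
        features = tuple(float(value) for value in record.values())
        rows.append((features, label))
    return rows


rows = load_rows(raw)
for features, label in rows:
    print(features, repr(label))
```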

abarca321 · April 7, 2024, 7:27am

Hi rfeigel,

Thank you for the example that you shared. The last issue I am having is with the Scorer: I keep getting the error “Execute Failed: Index 0 out of bounds for length 0”. Do you think I ran one of the nodes incorrectly?

rfeigel · April 7, 2024, 9:22pm

Run the workflow I posted without making any changes. It’s configured properly. I don’t know where your problem is since I can’t see your workflow.

hmfa · April 7, 2024, 10:11pm

Hi @abarca321 .
Maybe there is a problem with your column name?

Br

rfeigel · April 7, 2024, 11:10pm

I removed it since it’s not a legitimate predictor. As I said - my workflow works.

rfeigel · April 11, 2024, 9:38pm

Did my workflow solve your issue?

system · July 10, 2024, 9:38pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.