Bug in RProp MLP Predictor

Hi,

There appears to be a bug in the RProp MLP Predictor node related to the order of the input columns used for prediction. The predictor seems to require the columns in the same order as in the input data used to train the RProp MLP Learner.

Here is my setup:

  1. Workflow A reads in a training/test dataset and saves the MLP Learner model via the PMML Writer
  2. Workflow B reads in a prediction dataset, reads the model via the PMML Reader, and outputs a prediction

 

I have noticed the following:

  1. Prediction on training/test dataset is good
  2. Prediction on the prediction dataset seemed way off. Several hours of debugging later, I realized the input column order (of Double values) was different from the training data. There were also a few additional (unused) columns
  3. Added a "Column Resorter" node and fed the reordered prediction dataset into the MLP Predictor. Problem solved.
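
To illustrate what seems to be happening, here is a toy sketch in Python. It is not KNIME's actual code: the weights, column names, and helper functions are all made up for illustration. It assumes the model applies its weights by column position rather than by column name, which would explain the symptoms above.

```python
# Hypothetical stand-in for a trained model: the weights are stored in the
# column order seen during training, as a purely positional model would do.
TRAIN_COLUMNS = ["a", "b", "c"]
WEIGHTS = [1.0, 10.0, 100.0]


def predict_positional(row_values):
    """Apply the weights by position, ignoring column names entirely."""
    return sum(w * v for w, v in zip(WEIGHTS, row_values))


def reorder(columns, row_values, wanted=TRAIN_COLUMNS):
    """Pick values by column name in training order, dropping unused columns."""
    by_name = dict(zip(columns, row_values))
    return [by_name[name] for name in wanted]


# Prediction table: different column order plus one extra, unused column.
pred_columns = ["c", "extra", "a", "b"]
pred_row = [3.0, 99.0, 1.0, 2.0]

wrong = predict_positional(pred_row)                         # positions mismatched
right = predict_positional(reorder(pred_columns, pred_row))  # matched by name

print(wrong, right)
```

The `reorder` step plays the role that the Column Resorter node played in my workflow: once the values are matched to the training columns by name, the same model gives a sensible result.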

This appears to be a major flaw. Please let me know if you are able to recreate this issue, otherwise I am happy to sanitize and share my dataset and workflows.

Regards,

Jawahar

Wow, I just verified it. I will immediately open a bug report.

I noticed similar odd behaviour when cross-validating with libSVM. If I sort the activity column after the partitioning, it's fine. Otherwise the probabilities don't match the predicted class.

thor,

Thanks for verifying and opening a bug report. Since swebb found a similar issue in libSVM, I'm wondering if the problem is in the way data is read into the generic "model apply" node structure. Either way, it would be good to get this resolved.

I don't see this issue in LibSVM. No matter how I sort the columns for the predictor, the output is always the same.

The MLP issue is resolved in 2.9.2.