Collecting Results of Cross Validation

Hi, I have a simple dataset to which I apply M5Rules. To figure out which rules might be best, I tried to use cross validation with 10 folds. I attached a picture of my workflow. Is there a way to collect the rules as well as the R^2 (errors) from all loop iterations? Thanks a lot!!

Hi Violaw,

Below is a snapshot showing my solution for R randomForests, the best I could come up with so far. The "Java Edit Variable" node builds the model filename from the iteration number. Hope this helps.

Cheers
E

Actually, we could add an optional input port to the X-Aggregator that collects the model in each iteration. A third output port could then provide a data table with the model from each iteration in its own row. The models can then be extracted with the "Cell to Model" node later on. I will open a feature request.

And in the meantime, here is a workaround based on a Counting Loop Start for k-fold cross validation with whatever scoring statistics you want.

Hi, thanks a lot for the fast response! 

The workflow above looks really complicated. I am quite new to KNIME, so please forgive my stupid questions, but:

- Why do you extract variables before starting the loop? 

- What did you write into the Java Snippet?

Thanks, Viola

And thanks for the workaround! However, does it collect information about the model itself or only about the model evaluation?

Just the evaluation, but there is no reason not to take the model too. Just add a Model to Cell node, and a Column Appender to join the model to its scoring results.
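
To make the idea concrete outside of KNIME, here is a minimal sketch in plain Java of what the loop ends up collecting: one row per fold holding both the fitted model and its R^2 on the held-out data. Everything in it is an assumption for illustration only. The class and method names (FoldResult, CrossValidationCollector, fit, rSquared, crossValidate) are hypothetical, the learner is a simple least-squares line standing in for M5Rules, and the folds are assigned by row index rather than by the X-Partitioner's sampling.

```java
import java.util.ArrayList;
import java.util.List;

/** One row per fold: the model together with its score (hypothetical helper type). */
class FoldResult {
    final int fold;
    final double[] model;   // {intercept, slope} of the stand-in linear model
    final double rSquared;  // R^2 measured on the held-out fold

    FoldResult(int fold, double[] model, double rSquared) {
        this.fold = fold;
        this.model = model;
        this.rSquared = rSquared;
    }
}

class CrossValidationCollector {

    /** Stand-in learner (NOT M5Rules): least-squares fit of y = a + b * x. */
    static double[] fit(List<double[]> train) {
        double meanX = 0, meanY = 0;
        for (double[] p : train) { meanX += p[0]; meanY += p[1]; }
        meanX /= train.size();
        meanY /= train.size();
        double sxy = 0, sxx = 0;
        for (double[] p : train) {
            sxy += (p[0] - meanX) * (p[1] - meanY);
            sxx += (p[0] - meanX) * (p[0] - meanX);
        }
        double slope = sxx == 0 ? 0 : sxy / sxx;
        return new double[] {meanY - slope * meanX, slope};
    }

    /** R^2 of the model on the test rows; NaN if the test targets are all equal. */
    static double rSquared(double[] model, List<double[]> test) {
        double meanY = 0;
        for (double[] p : test) { meanY += p[1]; }
        meanY /= test.size();
        double ssRes = 0, ssTot = 0;
        for (double[] p : test) {
            double predicted = model[0] + model[1] * p[0];
            ssRes += (p[1] - predicted) * (p[1] - predicted);
            ssTot += (p[1] - meanY) * (p[1] - meanY);
        }
        return ssTot == 0 ? Double.NaN : 1 - ssRes / ssTot;
    }

    /** Runs k folds and keeps the model AND its score from every iteration. */
    static List<FoldResult> crossValidate(List<double[]> data, int k) {
        List<FoldResult> results = new ArrayList<>();
        for (int fold = 0; fold < k; fold++) {
            List<double[]> train = new ArrayList<>();
            List<double[]> test = new ArrayList<>();
            for (int i = 0; i < data.size(); i++) {
                // Row i belongs to the test set of fold (i mod k).
                if (i % k == fold) { test.add(data.get(i)); } else { train.add(data.get(i)); }
            }
            double[] model = fit(train);
            results.add(new FoldResult(fold, model, rSquared(model, test)));
        }
        return results;
    }
}
```

Each rows {x, y} pair is one data row; calling CrossValidationCollector.crossValidate(data, 10) returns ten FoldResult entries, which plays the role of the table you get after appending the model cell to the scorer output in every iteration.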

Hi Viola,

As for the questions about my version of the workaround:

  • Extracting variables just keeps the temp dir creation and the X-val start node in sync. Aaron does something similar with "extract table dimensions" inside the metanode.
  • The "Java Edit Variable" node only numbers the model files kept on disk, using the following one-liner:
    return $${Stemp_path}$$ + "\\" + $${IcurrentIteration}$$ + ".model";
  • Finally, I noticed that I didn't document my metanode - expanded below for clarity.
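
For anyone who wants to see the file naming outside of the node, a minimal sketch in plain Java could look like the following. It is not what the node runs: the class name ModelFileNamer and the method modelFile are hypothetical, and the two parameters merely mirror the temp_path and currentIteration flow variables. Unlike the hard-coded "\\" above, it uses java.nio.file.Paths, so the separator follows the operating system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ModelFileNamer {

    /**
     * Builds "<tempPath>/<currentIteration>.model" with the platform's separator.
     * tempPath and currentIteration stand in for the flow variables of the same name.
     */
    static String modelFile(String tempPath, int currentIteration) {
        Path file = Paths.get(tempPath, currentIteration + ".model");
        return file.toString();
    }
}
```

Calling ModelFileNamer.modelFile(tempDir, 3) yields the path of the model written in iteration 3 inside tempDir, so each iteration ends up with its own file.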

I dislike having to use the file system in my workaround, but I like that it's minimalistic on the code side. Thorsten, making this a feature request is a great idea, thanks! :-)

Cheers,
E

Hi, thanks for the comments! I now have a working version, which I attached below :-)