Correct flow for performing partitioning with repetitions

I’m replicating an experimental study that was run on the old implementation of a supervised algorithm for a descriptive task. The study describes the experiments as using “10-fold cross validation” with 3 repetitions. Since the new version has been implemented as a KNIME node, I’m building a workflow that does this. The node outputs a custom table with the descriptors and their metrics (computed inside the node). I could use X-Partitioner, but X-Aggregator takes the target class and the predicted class as inputs and automatically computes the results of the cross-validation process, which doesn’t match what my node produces.
So, I’m implementing a custom partitioning process with two loops and the partitioning node:

Would this be the correct way or am I reinventing the wheel for something that KNIME already has?

Hi @xandor19,

I didn’t quite understand why you are trying to “reinvent the wheel”, but if you want to do it that way: for 10-fold cross validation you need 10 partitions, and in each loop iteration you use one of them as the test set and the remaining nine as the train set, so the test sets never overlap. Placing the Partitioning node after a Counting Loop Start won’t give you that, because each iteration draws a new split independently of the others. Instead, use a Chunk Loop Start set to 10 chunks, then use a Reference Row Filter set to “Exclude” to remove the current chunk from the main data set. The output of the Chunk Loop Start is then your test set, and the output of the Reference Row Filter is your train set.
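To make the logic concrete outside of KNIME, here is a rough Python sketch of the same idea, not a KNIME API. The function name `repeated_kfold_indices` and its parameters are made up for illustration. It also assumes the rows are reshuffled at the start of each repetition (in a workflow that would mean something like a Shuffle node with a different seed per repetition), which the steps above don’t cover on their own.

```python
import random


def repeated_kfold_indices(n_rows, k=10, repetitions=3, seed=0):
    """Yield (repetition, fold, train_idx, test_idx) tuples.

    Hypothetical helper: mirrors "Chunk Loop Start + Reference Row Filter
    (Exclude)" inside an outer repetition loop.
    """
    rng = random.Random(seed)
    for rep in range(repetitions):
        # Assumption: reshuffle the rows once per repetition so that the
        # folds differ between repetitions.
        indices = list(range(n_rows))
        rng.shuffle(indices)

        # Split into k contiguous chunks of near-equal size.
        base, extra = divmod(n_rows, k)
        start = 0
        for fold in range(k):
            size = base + (1 if fold < extra else 0)
            # The current chunk is the test set ...
            test_idx = indices[start:start + size]
            # ... and everything else is the train set ("Exclude").
            train_idx = indices[:start] + indices[start + size:]
            start += size
            yield rep, fold, train_idx, test_idx


if __name__ == "__main__":
    for rep, fold, train_idx, test_idx in repeated_kfold_indices(25):
        print(rep, fold, len(train_idx), len(test_idx))
```

Within one repetition every row lands in exactly one test set, which is the property that calling Partitioning inside a counting loop doesn’t give you.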

