How to generate Accuracy Curve by Epoch to detect Overfitting with Training & Validation Data

Dear sir

I would like to generate the Accuracy Curve by Epoch to detect Overfitting with Training & Validation Data after the MLP Learner model node. Can you help?
A sample chart is attached below.

Thanks
Lawson

Hello Lawson,

you could use a loop that increases the Maximum number of iterations setting via a flow variable in each iteration and scores the resulting model on both the training and the validation set.
Once the loop is finished, you will have the training and validation performance for each loop iteration, which is essentially what you are looking for. Note that it is important to enable the Use seed for random initialization option so that each loop iteration starts from the same initialization.
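
To illustrate the idea outside of KNIME, here is a minimal Python sketch of the same loop. It is not the MLP Learner: a small logistic regression trained by gradient descent stands in for the network, and the toy data, the function names (make_data, train, accuracy_curve) and the seed value are all assumptions made for the example. The point is only the pattern: retrain from the same seeded initialization with a growing iteration limit, and score on both data sets each time.

```python
import math
import random

def make_data(n, rng):
    """Toy 2-D binary data (hypothetical): label is 1 when x0 + x1 > 0, with 10% label noise."""
    rows = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        y = 1 if x[0] + x[1] > 0 else 0
        if rng.random() < 0.1:
            y = 1 - y
        rows.append((x, y))
    return rows

def train(data, max_iter, seed):
    """Full-batch gradient descent on logistic regression for max_iter iterations."""
    rng = random.Random(seed)  # same seed -> same starting weights in every run
    w = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # two weights plus a bias
    lr = 0.5
    for _ in range(max_iter):
        grad = [0.0, 0.0, 0.0]
        for (x0, x1), y in data:
            p = 1.0 / (1.0 + math.exp(-(w[0] * x0 + w[1] * x1 + w[2])))
            err = p - y
            grad[0] += err * x0
            grad[1] += err * x1
            grad[2] += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w

def accuracy(w, data):
    """Fraction of rows whose predicted class matches the label."""
    hits = sum(1 for (x0, x1), y in data
               if (w[0] * x0 + w[1] * x1 + w[2] > 0) == (y == 1))
    return hits / len(data)

def accuracy_curve(train_set, val_set, iteration_counts, seed=42):
    """Retrain from the same initialization for each iteration limit and score both sets."""
    curve = []
    for max_iter in iteration_counts:
        w = train(train_set, max_iter, seed)
        curve.append((max_iter, accuracy(w, train_set), accuracy(w, val_set)))
    return curve

data_rng = random.Random(0)
train_set = make_data(200, data_rng)
val_set = make_data(100, data_rng)
curve = accuracy_curve(train_set, val_set, [1, 5, 10, 25, 50, 100])
for max_iter, train_acc, val_acc in curve:
    print(f"{max_iter:4d}  train={train_acc:.3f}  val={val_acc:.3f}")
```

Each row of the printed table is one point on the curve. Because the seed is fixed, the run with 10 iterations passes through exactly the same weights as the first 10 iterations of the run with 100, which is why the rows can be read as successive points of one training run.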

However, we recently integrated the deep learning framework Keras into KNIME, and its learner lets you monitor the training and test performance in a view during training. It is even possible to stop the training process by hand once the test performance stops improving (which is marked as the early stopping epoch in your diagram).
Granted, the deep learning integration is more complex than the simple MLP Learner, but it lets you build networks with KNIME nodes and gives you many more configuration options than the MLP Learner.

If you run into any problems, feel free to post them here, so we can help you out.

Cheers,
nemad

Hi nemad

Thanks for the suggestion.

For Keras, can you share some examples or workflow illustrations of how to monitor performance during training?

Thanks
Lawson

Hi Lawson,

to get Keras up and running in KNIME, I would suggest starting with our corresponding webpage: https://www.knime.com/deeplearning/keras

The usage of the Learning Monitor is simple:
Open the learner's view (right-click the node -> View: Learning Monitor) and you will see the training progress in real time.
Once you are satisfied with your accuracy/loss, you can hit the Stop Learning button to stop the training. Note that this is not the same as canceling the execution, because the node will still output the current network at its outport.
The screenshot below shows how the monitor looks in KNIME (the red curve corresponds to the training data and the blue to the test/validation data).
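
If you would rather pick the stopping point from recorded numbers than by eye, the rule can be sketched in a few lines of Python. This is not part of the Learning Monitor or any KNIME node. The function name early_stopping_epoch, the patience parameter and the accuracy values are hypothetical and only illustrate one common rule: stop once the validation score has not improved for a given number of epochs, and keep the best epoch seen so far.

```python
def early_stopping_epoch(val_scores, patience=3):
    """Return the 1-based epoch with the best validation score seen before
    `patience` consecutive epochs pass without a new best (0 if no scores)."""
    best_epoch, best_score, waited = 0, float("-inf"), 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best_score:
            best_epoch, best_score, waited = epoch, score, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation score has stalled long enough
    return best_epoch

# Made-up validation accuracies per epoch, for illustration only
val_acc = [0.61, 0.70, 0.76, 0.80, 0.79, 0.78, 0.77, 0.81]
print(early_stopping_epoch(val_acc))  # -> 4
```

With a larger patience (for example 5), the same list would run through to epoch 8, so the choice of patience is a trade-off between stopping early and riding out short dips in the validation curve.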

Cheers,
nemad

Thanks for your clarification.