H2O Parameter Optimization

This tutorial shows how to train multiple H2O models in KNIME using parameter optimization (grid search) and how to extract the optimal algorithm settings for training the final model. We will train Gradient Boosting Machines (GBM) for binomial classification using a grid of two GBM parameters.

1. Prepare: Load the data and import it to H2O.

2. Optimization: To train models with parameter optimization, we create a loop using the KNIME node "Parameter Optimization Loop Start" (Analytics - Mining). In this node's settings we define the optimization grid. For this example we optimize the GBM parameters "Number of trees" and "Max tree depth". We use brute force optimization, meaning that there are as many iterations as there are parameter combinations defined in the Parameter Optimization Loop Start node. The "Loop End" node collects the scored metrics of all loop iterations. To extract the optimal algorithm parameters, we sort the collected rows by several metrics and keep the top row.

3. Learn models, predict, and score in the Parameter Optimization Loop: For each combination of parameters, H2O builds a GBM model using the "Number of trees" and "Max tree depth" values of the corresponding loop iteration, and the model's accuracy metrics are scored.

4. Train final model: Finally, we train the final model with the optimal parameters and use it to predict new data.
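The brute force loop described above can be sketched outside KNIME in a few lines of Python. This is only a rough illustration of the logic, not the workflow itself: the grid values, the function name `train_and_score`, and the metrics it returns are all assumptions made up for this sketch. In the actual workflow, the H2O GBM learner, predictor, and scorer nodes take the place of that function.

```python
from itertools import product

# Hypothetical grid; these values are illustrative and not taken from the workflow.
grid = {
    "ntrees": [10, 50, 100],
    "max_depth": [3, 5],
}


def train_and_score(ntrees, max_depth):
    """Placeholder for training a GBM and scoring it.

    Returns made-up, deterministic metrics so the sketch runs without H2O.
    """
    accuracy = round(0.70 + 0.001 * ntrees - 0.01 * abs(max_depth - 5), 4)
    auc = round(accuracy + 0.05, 4)
    return {"accuracy": accuracy, "auc": auc}


def brute_force_search(grid):
    """Run one iteration per parameter combination and collect one row each."""
    names = list(grid)
    rows = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # One row per iteration: the parameters plus the scored metrics.
        rows.append({**params, **train_and_score(**params)})
    return rows


# Collect all iterations, as the "Loop End" node does.
rows = brute_force_search(grid)

# Sort by several metrics (best first) and keep the top row.
ranked = sorted(rows, key=lambda r: (r["accuracy"], r["auc"]), reverse=True)
best = ranked[0]

# The optimal settings that would be passed on to the final model.
best_params = {name: best[name] for name in grid}
print(len(rows), best_params)
```

The number of rows equals the product of the grid sizes (3 x 2 = 6 here), which is why brute force search gets expensive quickly as more parameters or values are added.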

This is a companion discussion topic for the original entry at https://kni.me/w/dvaRH0PxzULMo-uP