Using H2O Random Forest With KNIME SHAP Loop

I’ve successfully used the SHAP Loop nodes with KNIME AutoML. However, I have found that the standard H2O Random Forest workflow scores substantially better on R2, RMSE, and MAPE. I would like to integrate the H2O nodes with the KNIME SHAP nodes but can’t quite get it all wired up to run.

Here is the KNIME AutoML with SHAP portion of my workflow.

Here’s my disaster of trying to recreate this same portion of the workflow using H2O.

Yes, I am aware of H2O’s built-in variable importance measure, but I would like to compare that to the SHAP output.

Right now KNIME AutoML picks the KNIME Random Forest as the best model, but it scores 0.59 R2, 45,412 RMSE, and 0.059 MAPE, whereas the standard H2O Random Forest scores 0.76 R2, 33,948 RMSE, and 0.043 MAPE.
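For anyone who wants to sanity-check these three metrics outside KNIME, here is a minimal sketch of how they are usually computed. The function name `regression_metrics` and the toy numbers are assumptions made up for illustration, not output from either workflow, and MAPE is expressed as a fraction (so 0.05 means 5%), which appears to match the scale of the figures above.

```python
import math

def regression_metrics(actual, predicted):
    """Return R2, RMSE, and MAPE (as a fraction) for two equal-length lists."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Sum of squared residuals and total sum of squares around the mean
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE is undefined when an actual value is 0; this sketch assumes none are
    mape = sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
    return {"r2": r2, "rmse": rmse, "mape": mape}

# Hypothetical toy data, only to show the call
metrics = regression_metrics([100, 200, 300], [110, 190, 300])
print(metrics)
```

Higher R2 and lower RMSE and MAPE are better, so the H2O numbers are ahead on all three.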

I believe the better-scoring H2O model would give the SHAP loop better predictions to explain than the standard random forest regression (RFR) currently does.

Anyone want to take a stab at how to wire H2O into the SHAP system? THANKS!

So…this looks a bit like Frankenstein, but it runs. Let me know your suggested improvements. Thanks
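A patched-together wiring like this can work because Shapley-style explanations only need a way to score perturbed rows, so in principle any predictor can sit inside the loop. Below is a minimal sketch of that idea, assuming a toy linear `predict` function as a stand-in for the H2O predictor. It computes exact Shapley values by brute force over a small background sample; it is an illustration of the concept, not a description of how the KNIME SHAP Loop nodes work internally, and all names here are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, background):
    """Exact Shapley values for one row, averaging over background rows."""
    n = len(instance)

    def value(subset):
        # Features in `subset` come from the instance, the rest from background
        total = 0.0
        for bg in background:
            mixed = [instance[j] if j in subset else bg[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

# Toy stand-in for a trained model: any function from a row to a number works
def predict(row):
    return 2 * row[0] + 3 * row[1]

phis = shapley_values(predict, [1, 1], [[0, 0], [2, 0]])
print(phis)
```

The brute-force loop is exponential in the number of features, so it is only practical for toy cases; the point is that swapping the model only changes `predict`.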

Hello @creedssmith,

The only change I would make is to pass the second partition from the H2O Partitioning node to the Row Sampling node, so that the records being explained come from that partition.

Here is the link to the above workflow. Hope this helps.
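The suggestion to feed the second partition into Row Sampling can be pictured outside KNIME roughly as follows. This is a sketch under assumptions: the 80/20 split, the seeds, the sample size of 10, and the helper names `partition` and `sample_rows_to_explain` are all made up for illustration and are not taken from the workflow.

```python
import random

def partition(rows, train_fraction, seed):
    """Shuffle rows and split them into (first partition, second partition)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def sample_rows_to_explain(holdout, n, seed):
    """Draw the rows to explain from the second partition only."""
    rng = random.Random(seed)
    return rng.sample(holdout, min(n, len(holdout)))

rows = list(range(100))  # stand-in for row IDs of the input table
train, holdout = partition(rows, 0.8, seed=42)
to_explain = sample_rows_to_explain(holdout, 10, seed=7)
print(sorted(to_explain))
```

Drawing the explained rows from the second partition keeps them separate from the rows the model was trained on.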

Best,
Keerthan

@k10shetty1 I made that adjustment. Thanks for your help!
