I have recently been using the H2O nodes to build a model that is subsequently deployed on the KNIME Server. During this process I have encountered a number of problems with the way the integration is working - some of which may be due to my understanding, and some may be bugs(?):
Platform: KNIME 3.6.1 (Windows 10, 64-bit)
1. Table to H2O node - this loses the input Row IDs
2. H2O Partitioning node - unlike the table partitioning node, this renumbers the rows in each partition. Also, subsequent H2O to table nodes report “DataSpec generated by configure does not match spec after execution” errors, even though the nodes themselves are marked as executed (i.e. green light plus error cross on the nodes)
3. Workflows saved onto the KNIME server with H2O nodes in an executed state do not load successfully - the following sort of errors are reported:
Errors loading workflow ‘model’: Status: DataLoadError: model 0 loaded with error during data load
Status: DataLoadError: model 0
Status: DataLoadError: H2O Local Context 0:419
Status: DataLoadError: Unable to load port content for node “H2O Local Context”: null
Status: DataLoadError: Loading model internals failed: null
4. Workflows on the server opened locally do not allow saving files back to the server with the H2O MOJO Writer node. For example, if I try to write to “knime://knime.workflow/…/GLM.mojo” the node adopts the IDLE state ok, but on executing I get a null pointer exception. My current workaround is to write to a local temporary location and then use the Explorer Writer to write to the server path.
Any help on the above would be greatly appreciated!
Hi @James_Davidson,
I have looked into the problems you are describing, thanks for reporting those.
Regarding the first two points: We are aware of this issue, but it is not that easy to solve from a technical point of view since H2O manages row IDs differently than we do.
I tried to reproduce the errors you are describing in the last two points but was not able to.
Regarding point 3: I uploaded a workflow containing executed H2O nodes to a KNIME Server, opened it in KNIME Analytics Platform (so that a temporary copy is downloaded), and also ran it in the WebPortal. Both worked for me. Where did you load the workflow?
Regarding point 4: What exactly do you mean by opening it locally? Downloading a temporary copy of the workflow and executing it?
For point 3 - if I open the server-stored workflow locally in KNIME Desktop (in the way that a temporary copy is downloaded) it loads fine. However if I select the same workflow in the webportal I see the DataLoadErrors. It just occurred to me that we are actually still running 4.6.1 on the server - I just now tried the same on 4.7.1 (which matches my local executor version) and things work as expected. So I guess it was a node version mismatch(?)
For point 4 - when I say I open the server workflow locally, I mean the same as above - i.e. double-click and get a temporary copy locally. I have also reproduced the problem with a simple test workflow on version 4.7.1:
What I see is that if this workflow is saved to the server (4.6 or 4.7) and opened locally (by double-clicking), then the Table Writer will write to the server but the H2O Writer fails with a null pointer exception.
I was able to reproduce point 4 with your workflow. Indeed, this is a bug on our side and we need to fix it, thanks for reporting it!
Until then, I think you will need to continue using your workaround.
I am receiving a similar error message to the one above, although I am using the latest versions: KNIME 3.7.1 and the AWS KNIME Server (version 4.7.2).
Errors loading workflow ‘model’: Status: DataLoadError: model 0 loaded with error during data load
Status: DataLoadError: model 0
Status: DataLoadError: H2O Local Context 0:211
Status: DataLoadError: Unable to load port content for node “H2O Local Context”: No context handler with id org.knime.ext.h2o.local.v32202.H2OLocalContextV32202 found. Please check if you have the corresponding H2O version installed.
Status: DataLoadError: State has changed from EXECUTED to CONFIGURED
Local execution (opening the server workflow locally, i.e. double-clicking it and getting a temporary local copy) works without any errors. Executing the workflow in the web portal throws the above error.
H2O should be installed because the H2O predictor works on the web portal.
Thanks for the question. The problem arises because the KNIME Server executor on that AWS instance is part of the 3.6 release line, while the workflow that you are trying to execute was created with a newer version of the KNIME Analytics Platform (3.7.1).
You have a few options.
Open the workflow in KNIME Analytics Platform 3.6.x, test it, save it and upload.
Wait until we post the updated KNIME Server 4.8 AMI onto the AWS Marketplace. We expect this fairly soon.
As a side note:
It is possible to change the setting on the KNIME Server (knime-server.config file):
com.knime.server.executor.reject_future_workflows=true
That will mean that any workflows created with a newer version of KNIME Analytics Platform cannot be executed on the KNIME Server (and users will get a message to that effect when trying to execute).
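To make the idea behind that setting concrete, here is a minimal Python sketch of a "reject future workflows" check. It is only an illustration and not how KNIME Server actually implements it: the function names, the returned messages, and the choice to compare only the major and minor version numbers (the release line) are all assumptions made for this example.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.7.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_future_workflow(workflow_version, executor_version):
    """Return True if the workflow comes from a newer release line.

    Assumption: only major.minor (the release line) matters, so
    3.6.1 and 3.6.2 are treated as compatible with each other.
    """
    return parse_version(workflow_version)[:2] > parse_version(executor_version)[:2]


def check_workflow(workflow_version, executor_version, reject_future_workflows):
    """Hypothetical server-side gate, mirroring the intent of the setting."""
    if reject_future_workflows and is_future_workflow(workflow_version, executor_version):
        return ("rejected: workflow saved with " + workflow_version
                + ", executor release line is " + executor_version)
    # Without the gate the workflow is handed to the executor as-is,
    # and any incompatibility only shows up later, while loading.
    return "accepted"


if __name__ == "__main__":
    # Versions taken from the thread: workflow from 3.7.1, executor on the 3.6 line.
    print(check_workflow("3.7.1", "3.6", reject_future_workflows=True))
    print(check_workflow("3.7.1", "3.6", reject_future_workflows=False))
```

With the flag enabled, the mismatch is reported up front with a clear message, rather than surfacing later as DataLoadError entries when the executor tries to load the workflow.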