Problems with H2O integration

Hi,

I have recently been using the H2O nodes to build a model that is subsequently deployed on the KNIME Server. During this process I have encountered a number of problems with the way the integration is working - some of which may be due to my understanding, and some may be bugs(?):

Platform: KNIME 3.6.1 (Windows 10, 64-bit)

  • Table to H2O node - this loses the input Row IDs
  • H2O Partitioning node - unlike the table Partitioning node, this renumbers the rows in each partition. Also, subsequent H2O to Table nodes report “DataSpec generated by configure does not match spec after execution” errors, even though the nodes themselves are marked as executed (i.e. green light plus error cross on the nodes)
  • Workflows saved onto the KNIME server with H2O nodes in an executed state do not load successfully - the following sort of errors are reported:

Errors loading workflow ‘model’: Status: DataLoadError: model 0 loaded with error during data load
Status: DataLoadError: model 0
Status: DataLoadError: H2O Local Context 0:419
Status: DataLoadError: Unable to load port content for node “H2O Local Context”: null
Status: DataLoadError: Loading model internals failed: null

  • Workflows on the server opened locally do not allow saving files back to the server with the H2O MOJO Writer node. For example, if I try to write to “knime://knime.workflow/…/GLM.mojo” the node adopts the IDLE state ok, but on executing I get a null pointer exception. My current workaround is to write to a local temporary location and then use the Explorer Writer to write to the server path.

Any help on the above would be greatly appreciated!

Kind regards

James

Hey @James_Davidson,

Thanks for your feedback. We’ll try to reproduce the problems and comment here again.

Cheers,

Christian

Hi @James_Davidson,
I have looked into the problems you are describing, thanks for reporting those.
Regarding the first two points: We are aware of this issue, but it’s not that easy to solve from a technical point of view since H2O manages row IDs differently than we do.

I tried to reproduce the errors you are describing in the last two points but was not able to.

Regarding point 3: I uploaded a workflow with executed h2o nodes in it to a KNIME server and tried to open it in KNIME AP (in the way that a temporary copy is downloaded) and tried to run it in the webportal. Both worked for me. Where did you load the workflow?

Regarding point 4: What exactly do you mean by opening it locally? Downloading a temporary copy of the workflow and executing it?

Kind regards,

Simon

Hi Simon,

For point 3 - if I open the server-stored workflow locally in KNIME Desktop (in the way that a temporary copy is downloaded) it loads fine. However if I select the same workflow in the webportal I see the DataLoadErrors. It just occurred to me that we are actually still running 4.6.1 on the server - I just now tried the same on 4.7.1 (which matches my local executor version) and things work as expected. So I guess it was a node version mismatch(?)

For point 4 - when I say I open the server workflow locally, I mean the same as above - ie double-click and get a temporary copy locally. I have also reproduced the problem with a simple test workflow on version 4.7.1:

h2o_test.knwf (92.3 KB)

What I see is that if this workflow is saved to the server (4.6 or 4.7) and opened locally (by double-clicking), then the Table Writer will write to the server but the H2O MOJO Writer fails with a null pointer exception.

Kind regards

James

Hi James,

I was able to reproduce point 4 with your workflow. Indeed, this is a bug on our side and we need to fix it, thanks for reporting it!
Until then, I think you will need to continue using your workaround.

Kind regards,

Simon

Hi @James_Davidson,

Just FYI: This bug has been fixed with version 3.6.2. No need to use a workaround anymore :slight_smile:

Cheers,
Simon

Hi all,

I am receiving an error message similar to the one above, although I am using the latest versions: KNIME 3.7.1 and the AWS KNIME Server (version 4.7.2).

Errors loading workflow ‘model’: Status: DataLoadError: model 0 loaded with error during data load
Status: DataLoadError: model 0
Status: DataLoadError: H2O Local Context 0:211
Status: DataLoadError: Unable to load port content for node “H2O Local Context”: No context handler with id org.knime.ext.h2o.local.v32202.H2OLocalContextV32202 found. Please check if you have the corresponding H2O version installed.
Status: DataLoadError: State has changed from EXECUTED to CONFIGURED

Local execution (opening the server workflow locally, i.e. double-clicking to get a temporary copy) works without any errors. Executing the workflow in the web portal throws the above error.

H2O should be installed because the H2O predictor works on the web portal.

Do you have any idea what is causing this?

Thanks a lot.

Kind regards,
Katrin

Hi Katrin,

Thanks for the question. The problem arises because the KNIME Server executor on that AWS instance is part of the 3.6 release line, while the workflow that you are trying to execute was created with a newer version of the KNIME Analytics Platform (3.7.1).

You have a few options.

  1. Open the workflow in KNIME Analytics Platform 3.6.x, test it, save it and upload.
  2. Upgrade the KNIME Server (and executor) to the latest version. That’s detailed here: https://docs.knime.com/2018-12/aws_marketplace_server_guide/index.html#_update_knime_server_feature_version
  3. Wait until we post the updated KNIME Server 4.8 AMI onto the AWS Marketplace. Expected fairly soon.

As a side note:
It is possible to change the setting on the KNIME Server (knime-server.config file):
com.knime.server.executor.reject_future_workflows=true

That will mean that any workflows created with a newer version of KNIME Analytics Platform cannot be executed on the KNIME Server (and users will get a message to that effect when trying to execute).

Full details here: https://docs.knime.com/2018-12/server_admin_guide/index.html#knime-server-configuration-file
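
To illustrate the idea behind that setting, here is a minimal Python sketch of a "future workflow" check that compares release lines (major.minor). This is only an illustration, not how KNIME Server implements it; the function names and the "3.6" executor version string are hypothetical.

```python
from typing import Tuple


def release_line(version: str) -> Tuple[int, int]:
    """Return the (major, minor) release line of a dotted version such as '3.7.1'."""
    major, minor, *_ = (int(part) for part in version.split("."))
    return major, minor


def is_future_workflow(workflow_version: str, executor_version: str) -> bool:
    """True if the workflow was saved with a newer release line than the executor."""
    return release_line(workflow_version) > release_line(executor_version)


# Hypothetical values matching the situation described above:
# workflow saved with 3.7.1, executor on the 3.6 release line.
print(is_future_workflow("3.7.1", "3.6"))
```

A check along these lines lets the server refuse the workflow up front with a clear message, instead of failing later with a DataLoadError when a node version is missing on the executor.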

Best,

Jon
