We have some nodes (the Palladian TextClassifier, to be precise) which use a custom PortObject to pass the model around. Internally, the model is a Java class which is serialized by a corresponding PortObjectSerializer using an ObjectOutputStream.
These models can become quite large, which is not an issue per se, apart from one annoying thing:
When opening an existing workflow which contains such data, KNIME immediately invokes the PortObjectSerializer.loadPortObject method for each PortObject. Why is this necessary at load time? I would expect it to be sufficient to deserialize the PortObject once it's required by a node that is about to be executed, but there are probably good reasons for deserializing the object right away which I am not aware of?
Anyway: is there any way to load such data more lazily?
Sorry for the late reply. One thing you could try is to inherit from `FileStorePortObject`. There you can control when the data from the file store is actually loaded. Basically, the heavy data would be stored directly in a file (which KNIME manages), and you can access this file as needed. Only metadata would be stored in the stream.
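To illustrate the general idea only, here is a minimal plain-Java sketch of deferred deserialization. It deliberately does not use the KNIME API: `LazyModelHolder` and its methods are hypothetical names, and in a real `FileStorePortObject` subclass the file would presumably be the one provided by the file store that KNIME manages. The point is just that the heavy object is written to a file once and only read back the first time something asks for it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical holder (not part of KNIME) that keeps a large model in a file
 * and deserializes it only on first access.
 */
final class LazyModelHolder<T extends Serializable> {

    private final Path file;     // location of the serialized model
    private final Class<T> type; // used for a checked cast after reading
    private T model;             // stays null until get() is first called

    LazyModelHolder(Path file, Class<T> type) {
        this.file = file;
        this.type = type;
    }

    /** Serializes the model to the file and returns a holder that has not loaded it. */
    static <T extends Serializable> LazyModelHolder<T> write(T model, Path file, Class<T> type)
            throws IOException {
        try (OutputStream out = Files.newOutputStream(file);
             ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(model);
        }
        return new LazyModelHolder<>(file, type);
    }

    /** True once the model has been read from the file. */
    synchronized boolean isLoaded() {
        return model != null;
    }

    /** Deserializes the model on the first call and returns the cached instance afterwards. */
    synchronized T get() throws IOException, ClassNotFoundException {
        if (model == null) {
            try (InputStream in = Files.newInputStream(file);
                 ObjectInputStream ois = new ObjectInputStream(in)) {
                model = type.cast(ois.readObject());
            }
        }
        return model;
    }
}
```

With this pattern, opening a workflow would only need to restore the lightweight part (here, the file location), while the expensive `readObject` call is postponed until a node actually requests the model.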
Yeah, that sounds like exactly what I need! Many thanks for the valuable input. I've already created a very dirty workaround for this issue in the meantime, but I'll definitely try to switch to this approach.
Do you happen to know of any existing sample implementations of that class that could help me get going? In my KNIME environment I cannot find any implementors of `FileStorePortObject`.
There are some examples on GitHub, e.g. https://github.com/knime/knime-core/blob/master/org.knime.base.treeensembles2/src/org/knime/base/node/mine/treeensemble2/model/TreeEnsembleModelPortObject.java