Keras learner keeps crashing

Hello,

I'm still very new to deep learning, and especially to Keras, so my workflow might not make much sense to some of you. I recreated a simple network to test with, based on this example (https://www.knime.com/deeplearning/keras). I'm trying to train the model on around 2000 rows of data consisting mostly of integer variables such as Day, Month ID, Year, and items sold on that day. This is how my workflow looks after preparing the data:

And this is what the KNIME log shows when I execute the learner:

2020-06-05 14:41:26,400 : WARN : main : : Node : Keras Convolution 1D Layer : 0:370 : The tensor at port is of type int which is not among the supported types [float, double]
2020-06-05 14:41:38,984 : WARN : main : : Node : Keras Convolution 1D Layer : 0:370 : The tensor at port is of type int which is not among the supported types [float, double]
2020-06-05 14:44:13,506 : WARN : KNIME-Worker-40-Keras Network Learner 0:360 : : DLKnimeNetworkTrainingInputPreparer : Keras Network Learner : 0:360 : The number of rows of the input training data table (1789) is not a multiple of the selected training batch size (100). Thus, the last batch of each epoch will continue at the beginning of the training data table after reaching its end. You can avoid that by adjusting the number of rows of the table or the batch size if desired.
2020-06-05 14:44:13,507 : WARN : KNIME-Worker-40-Keras Network Learner 0:360 : : DLKnimeNetworkValidationInputPreparer : Keras Network Learner : 0:360 : The number of rows of the input validation data table (537) is not a multiple of the selected validation batch size (100). Thus, the last batch of each validation phase will continue at the beginning of the validation data table after reaching its end. You can avoid that by adjusting the number of rows of the table or the batch size if desired.
2020-06-05 14:44:18,041 : WARN : Thread-81 : : PythonKernel : Keras Network Learner : 0:360 : /Users/samueldittmann/opt/anaconda3/envs/py3_knime_dl/lib/python3.6/site-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually.
2020-06-05 14:44:23,943 : ERROR : KNIME-Worker-40-Keras Network Learner 0:360 : : DLKerasLearnerNodeModel : Keras Network Learner : 0:360 : java.lang.Exception: Failed to receive message from Python or forward received message.
2020-06-05 14:44:23,943 : ERROR : KNIME-Worker-40-Keras Network Learner 0:360 : : Node : Keras Network Learner : 0:360 : Execute failed: An error occured during training of the Keras deep learning network. See log for details.
java.lang.RuntimeException: An error occured during training of the Keras deep learning network. See log for details.
at org.knime.dl.keras.base.nodes.learner.DLKerasLearnerNodeModel.handleGeneralException(DLKerasLearnerNodeModel.java:720)
at org.knime.dl.keras.base.nodes.learner.DLKerasLearnerNodeModel.executeInternal(DLKerasLearnerNodeModel.java:696)
at org.knime.dl.keras.base.nodes.learner.DLKerasLearnerNodeModel.execute(DLKerasLearnerNodeModel.java:303)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:571)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1236)
at org.knime.core.node.Node.execute(Node.java:1016)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:557)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:218)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:124)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:334)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:210)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.io.IOException: java.lang.Exception: Failed to receive message from Python or forward received message.
at org.knime.dl.python.core.DLPythonAbstractCommands.trainNetwork(DLPythonAbstractCommands.java:519)
at org.knime.dl.python.core.training.DLPythonAbstractNetworkTrainingSession.trainInternal(DLPythonAbstractNetworkTrainingSession.java:177)
at org.knime.dl.core.training.DLAbstractNetworkTrainingSession.run(DLAbstractNetworkTrainingSession.java:273)
at org.knime.dl.keras.base.nodes.learner.DLKerasLearnerNodeModel.executeInternal(DLKerasLearnerNodeModel.java:689)
… 14 more
Suppressed: org.knime.dl.core.DLUncheckedException: An exception occured while cleaning up Python. Cause: Failed to clean up Python. See log for details.
at org.knime.dl.python.core.DLPythonDefaultContext.close(DLPythonDefaultContext.java:228)
at org.knime.dl.python.core.DLPythonAbstractCommands.close(DLPythonAbstractCommands.java:536)
at org.knime.dl.python.core.training.DLPythonAbstractNetworkTrainingSession.close(DLPythonAbstractNetworkTrainingSession.java:156)
at org.knime.dl.keras.base.nodes.learner.DLKerasLearnerNodeModel.executeInternal(DLKerasLearnerNodeModel.java:692)
… 14 more
Caused by: org.knime.python2.kernel.PythonKernelCleanupException: Failed to clean up Python. See log for details.
at org.knime.python2.kernel.PythonKernel.close(PythonKernel.java:1368)
at org.knime.dl.python.core.DLPythonDefaultContext.close(DLPythonDefaultContext.java:226)
… 17 more
Caused by: java.lang.Exception: Failed to receive message from Python or forward received message.
at org.knime.python2.kernel.messaging.AbstractMessageLoop.throwExceptionInLoop(AbstractMessageLoop.java:76)
at org.knime.python2.kernel.messaging.DefaultMessageReceiverLoop.loop(DefaultMessageReceiverLoop.java:98)
at org.knime.python2.kernel.messaging.AbstractMessageLoop.doLoop(AbstractMessageLoop.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Suppressed: java.lang.Exception: Failed to distribute message from Python. Cause: Message receiver loop terminated.
at org.knime.python2.kernel.messaging.AbstractMessageLoop.throwExceptionInLoop(AbstractMessageLoop.java:76)
at org.knime.python2.kernel.messaging.MessageDistributorLoop.loop(MessageDistributorLoop.java:93)
… 4 more
Caused by: java.io.IOException: Message receiver loop terminated.
at org.knime.python2.kernel.messaging.DefaultMessageReceiverLoop.receive(DefaultMessageReceiverLoop.java:82)
at org.knime.python2.kernel.messaging.MessageDistributorLoop.loop(MessageDistributorLoop.java:82)
… 4 more
[CIRCULAR REFERENCE:java.lang.Exception: Failed to distribute message from Python. Cause: Message receiver loop terminated.]
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.knime.python2.kernel.messaging.PythonMessagingUtils.readInt(PythonMessagingUtils.java:91)
at org.knime.python2.kernel.messaging.DefaultMessageReceiver.receive(DefaultMessageReceiver.java:80)
at org.knime.python2.kernel.messaging.DefaultMessageReceiverLoop.loop(DefaultMessageReceiverLoop.java:92)
… 4 more
[CIRCULAR REFERENCE:java.lang.Exception: Failed to receive message from Python or forward received message.]

You might want to convert your integers to double and try again.
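If the data passes through Python at some point (for example in a scripting node, or while preparing the file outside KNIME), the conversion could look roughly like the sketch below. It assumes the table is available as a pandas DataFrame; the helper name `ints_to_float` and the sample column names are made up for illustration and are not taken from the workflow above.

```python
import pandas as pd


def ints_to_float(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df in which every integer column is cast to float64."""
    int_cols = [col for col in df.columns if pd.api.types.is_integer_dtype(df[col])]
    # astype with a mapping only touches the listed columns and returns a new frame.
    return df.astype({col: "float64" for col in int_cols})


# Hypothetical sample rows; the real table has its own column names.
sales = pd.DataFrame(
    {"Day": [5, 6], "Month": [6, 6], "Year": [2020, 2020], "Sold": [12, 7]}
)
sales_float = ints_to_float(sales)
print(sales_float.dtypes)
```

The original frame is left unchanged, so the integer version stays available if other nodes still need it.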

Did you successfully run a sample workflow to see if all libraries are up and running?
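Besides running a sample workflow, a quick way to check the libraries is to run a small script with the Python interpreter of the environment KNIME is configured to use (the log above points at a conda environment named py3_knime_dl). This is only a rough sketch: the package list in `required` is my assumption about what the Keras integration typically needs, and the helper name `missing_modules` is made up, so adjust both to your installation.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed package list; adapt it to what your KNIME installation guide asks for.
required = ["numpy", "pandas", "h5py", "tensorflow", "keras"]

print("Python:", sys.version.split()[0])
missing = missing_modules(required)
print("Missing:", ", ".join(missing) if missing else "none")
```

Note that this only checks whether the packages can be found, not whether their versions are compatible with each other.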

Hey,

I just modified the workflow and removed all the Keras stuff except for one input layer, one dense layer, and the learner. Somehow that works now, but it doesn't give me satisfying results :(

Hmm, I am not an expert in deep learning, but you will have to invest some time in finding the right architecture for your problem. Another possibility would be to use H2O AutoML restricted to Deep Learning and see what their module does.

Maybe you could read up on deep learning or try some standard settings. What your test seems to have shown is that your setup is basically working.

Hey mlauber, do you know a specific website where I could start reading about the ML modules? I found it quite hard to find information about the KNIME modules. Mostly it's just Python code on the internet.

I am still looking for useful articles on deep learning architectures. A common suggestion is to find a pre-defined architecture that closely matches your problem and start from it. With the Keras/KNIME nodes, that could mean rebuilding such an architecture with the corresponding nodes.

I have not yet checked whether more examples on the KNIME Hub might be of any help.