Hey everyone,
I’m trying to track down an issue with the Keras Network Learner node from the Keras deep learning extension. The node does not produce any errors, but it runs for a very long time, reaching 40% progress after a day, which is out of all proportion to the network size, even on CPU.
To isolate the issue I downloaded the simple example workflow "Simple Example for Binary Classification with Keras" from the KNIME Community Hub. Unfortunately, I experience the same issue as with my own workflow.
Since the node reports no errors, I’m struggling to work out how to debug this. I have tried both CPU and GPU conda environments created directly from within the Python Deep Learning preferences.
When activating the GPU Keras environment, I discovered that libcudnn:7 was missing when testing with:

    import tensorflow as tf
    print('GPU name: ', tf.config.experimental.list_physical_devices('GPU'))

However, while installing the missing library lets TensorFlow list the correct GPU, it has not helped with the node issue.
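For anyone who wants to repeat the GPU check, here is a slightly more defensive version as a minimal sketch. `gpu_report` is a hypothetical helper name, not part of KNIME or TensorFlow, and it assumes a TensorFlow version that provides `tf.config.experimental.list_physical_devices` (the same call used above). It only reports what TensorFlow sees and does not say anything about why the node is slow.

```python
def gpu_report():
    """Collect basic facts about the TensorFlow install in the active environment.

    Hypothetical helper for illustration; returns a plain dict so the result
    can be printed or pasted into a forum post.
    """
    report = {"tensorflow_version": None, "built_with_cuda": None, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError as exc:
        # TensorFlow is not importable in this environment at all.
        report["error"] = str(exc)
        return report

    report["tensorflow_version"] = tf.__version__
    # False here means a CPU-only build, regardless of installed CUDA libraries.
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    try:
        devices = tf.config.experimental.list_physical_devices("GPU")
        report["gpus"] = [d.name for d in devices]
    except AttributeError as exc:
        # Assumption: older TensorFlow releases may lack this API.
        report["error"] = str(exc)
    return report


print(gpu_report())
```

Running this inside both the CPU and the GPU conda environment gives two reports that can be compared side by side.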
Any ideas on how to debug this issue are much appreciated.
Edit: I forgot to mention that when monitoring resource usage, there is a short uptick in CPU activity directly after the node is started, but this dies down quickly while the environment’s Python threads stay alive.
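One thing that might help narrow this down is to run a tiny training job directly in the same conda environment, outside KNIME, with a watchdog that dumps the Python stack traces if the call stalls. The sketch below uses the standard-library `faulthandler` module for that. `run_with_watchdog` and `tiny_keras_fit` are hypothetical names, the model and random data are made up purely for illustration, and this does not reproduce what the learner node does internally.

```python
import faulthandler
import sys
import time


def run_with_watchdog(fn, timeout_s=60.0, log=sys.__stderr__):
    """Call fn(); while it runs, dump all thread stack traces to `log`
    every `timeout_s` seconds. Returns (result, elapsed_seconds).

    `log` must be backed by a real file descriptor, hence sys.__stderr__.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log)
    start = time.perf_counter()
    try:
        result = fn()
    finally:
        # Always disarm the watchdog, even if fn() raised.
        faulthandler.cancel_dump_traceback_later()
    return result, time.perf_counter() - start


def tiny_keras_fit():
    """Fit a very small binary classifier on random data (illustrative only)."""
    import numpy as np
    try:
        import keras  # standalone Keras, if the environment has it
    except ImportError:
        from tensorflow import keras

    x = np.random.rand(256, 4).astype("float32")
    y = (x.sum(axis=1) > 2.0).astype("float32")
    model = keras.models.Sequential([
        keras.layers.Dense(8, activation="relu", input_shape=(4,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    history = model.fit(x, y, epochs=3, batch_size=32, verbose=0)
    return history.history["loss"][-1]


if __name__ == "__main__":
    try:
        loss, seconds = run_with_watchdog(tiny_keras_fit, timeout_s=60.0)
        print("final loss:", loss, "elapsed seconds:", round(seconds, 2))
    except ImportError as exc:
        print("Keras/TensorFlow not importable in this environment:", exc)
```

If this finishes in seconds, the environment itself can train and the stall is more likely somewhere between KNIME and the Python process. If it hangs, the dumped stack traces show which call the threads are waiting in.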