No luck with Deeplearning4j on 3.3 or GPU

The new DL4J learner on 3.3 does not recognize images as input. I keep getting "No input columns selected", and image columns are not offered as an option in the feature column selection.

Things run fine on 3.2.1 except when I enable GPU. Then I get the error "Could not initialize class org.nd4j.linalg.factory.Nd4j". I've tried with both CUDA 7.5 and CUDA 8.0 installed.

Has anybody had any luck?

Hi,

are you working with the nightly build? Did you also install the KNIME Image Processing - DL4J Extension? See https://tech.knime.org/deeplearning4j-imageprocessing for more information.

Best,

Christian

Hi Christian,

I apologize. You are correct. I had installed the wrong integration version. The issue of image as an input is now fixed.

I still get an error though with GPU enabled:

ERROR DL4J Feedforward Learner (Classification) 0:41       Execute failed: Could not initialize class org.nd4j.linalg.factory.Nd4j

I am currently using CUDA 7.5 (checked with nvcc -V on the command line). I have also tested with CUDA 8.0.

Any thoughts?

Thank you!

Hi,

sorry for the trouble. Could you give me some more information about the problem? Which OS are you running? Which GPU are you using? What is the output of nvcc -V?

Also, could you maybe attach the KNIME log file here?

Thanks for your help.

Best,

David

OS: Windows 10
GPU: Geforce GTX 950

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sat_Sep__3_19:05:48_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

I am running the example: 05_Network_Example_Of_A_Simple_Convolutional_Net

Here is the log info:

2016-12-19 22:56:45,272 : DEBUG : KNIME-Worker-4 : DL4J Feedforward Learner (deprecated) : DL4J Feedforward Learner (deprecated) : 0:14:8 : reset
2016-12-19 22:56:45,331 : ERROR : KNIME-Worker-4 : DL4J Feedforward Learner (deprecated) : DL4J Feedforward Learner (deprecated) : 0:14:8 : Execute failed: ("ExceptionInInitializerError"): null
2016-12-19 22:56:45,341 : DEBUG : KNIME-Worker-4 : DL4J Feedforward Learner (deprecated) : DL4J Feedforward Learner (deprecated) : 0:14:8 : Execute failed: ("ExceptionInInitializerError"): null
java.lang.ExceptionInInitializerError
    at org.deeplearning4j.nn.conf.NeuralNetConfiguration$Builder.seed(NeuralNetConfiguration.java:557)
    at org.knime.ext.dl4j.base.mln.MultiLayerNetFactory.createListBuilderWithLearnerParameters(MultiLayerNetFactory.java:315)
    at org.knime.ext.dl4j.base.mln.ConvMultiLayerNetFactory.createMlnWithLearnerParameters(ConvMultiLayerNetFactory.java:88)
    at org.knime.ext.dl4j.base.mln.MultiLayerNetFactory.createMultiLayerNetwork(MultiLayerNetFactory.java:210)
    at org.knime.ext.dl4j.base.nodes.learn.feedforward.FeedforwardLearnerNodeModel.execute(FeedforwardLearnerNodeModel.java:176)
    at org.knime.ext.dl4j.base.nodes.learn.feedforward.FeedforwardLearnerNodeModel.execute(FeedforwardLearnerNodeModel.java:1)
    at org.knime.core.node.NodeModel.executeModel(NodeModel.java:566)
    at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1128)
    at org.knime.core.node.Node.execute(Node.java:915)
    at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:561)
    at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
    at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:179)
    at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:110)
    at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
    at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
    at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Parameter 'directory' is not a directory
    at org.nd4j.linalg.jcublas.JCublasBackend.isAvailable(JCublasBackend.java:46)
    at org.nd4j.linalg.factory.Nd4jBackend.load(Nd4jBackend.java:122)
    at org.nd4j.linalg.factory.Nd4j.initContext(Nd4j.java:5182)
    at org.nd4j.linalg.factory.Nd4j.<clinit>(Nd4j.java:167)
    ... 19 more
Caused by: java.lang.IllegalArgumentException: Parameter 'directory' is not a directory
    at org.apache.commons.io.FileUtils.validateListFilesParameters(FileUtils.java:545)
    at org.apache.commons.io.FileUtils.listFiles(FileUtils.java:521)
    at org.apache.commons.io.FileUtils.listFiles(FileUtils.java:691)
    at org.apache.commons.io.FileUtils.iterateFiles(FileUtils.java:710)
    at org.nd4j.linalg.util.Paths.nameExistsInPath(Paths.java:56)
    at org.nd4j.linalg.jcublas.JCublasBackend.canRun(JCublasBackend.java:53)
    at org.nd4j.linalg.jcublas.JCublasBackend.isAvailable(JCublasBackend.java:43)
    ... 22 more
2016-12-19 22:56:45,405 : DEBUG : KNIME-Worker-4 : WorkflowManager : DL4J Feedforward Learner (deprecated) : 0:14:8 : DL4J Feedforward Learner (deprecated) 0:14:8 doBeforePostExecution
2016-12-19 22:56:45,489 : DEBUG : KNIME-Worker-4 : NodeContainer : DL4J Feedforward Learner (deprecated) : 0:14:8 : DL4J Feedforward Learner (deprecated) 0:14:8 has new state: POSTEXECUTE
2016-12-19 22:56:45,515 : DEBUG : KNIME-Worker-4 : WorkflowManager : DL4J Feedforward Learner (deprecated) : 0:14:8 : DL4J Feedforward Learner (deprecated) 0:14:8 doAfterExecute - failure
2016-12-19 22:56:45,544 : DEBUG : KNIME-Worker-4 : NodeContainer : DL4J Feedforward Learner (deprecated) : 0:14:8 : DL4J Feedforward Predictor (deprecated) 0:15:9 has new state: CONFIGURED

Thank you for your help!
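
A note on the stack trace above: the innermost exception is thrown by FileUtils.listFiles, which is reached from Paths.nameExistsInPath via JCublasBackend.canRun. One plausible reading, which is an assumption based on the method names and not confirmed in this thread, is that ND4J walks the directories listed in PATH while looking for the CUDA libraries and fails on an entry that is not an existing or accessible directory. The sketch below uses only the JDK to list such entries. The class and method names are made up for illustration and are not part of ND4J or KNIME.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of ND4J or KNIME): lists PATH entries
// that the JVM does not see as existing directories.
class PathEntryCheck {

    // Returns every non-empty entry of a PATH-style string that is not a directory.
    static List<String> findNonDirectories(String pathValue, String separator) {
        List<String> bad = new ArrayList<>();
        if (pathValue == null) {
            return bad;
        }
        for (String entry : pathValue.split(Pattern.quote(separator))) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // empty segments (e.g. from a trailing separator) are skipped here
            }
            if (!new File(trimmed).isDirectory()) {
                bad.add(trimmed);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> bad = findNonDirectories(System.getenv("PATH"), File.pathSeparator);
        if (bad.isEmpty()) {
            System.out.println("All PATH entries are directories.");
        }
        for (String entry : bad) {
            System.out.println("Not a directory: " + entry);
        }
    }
}
```

If this reports stale entries, removing them from PATH and restarting KNIME would be a cheap thing to try before reinstalling CUDA. Run it as the same user that starts KNIME, since directory access can differ between accounts.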

Hi,

thank you very much for the information. It would be great if you additionally could provide me with the full knime.log file and attach it to this thread. It is located in your workspace folder under '\.metadata\knime'.

Best,

David

I think I have a similar problem.

I am trying to get any of the 04_Analytics 14_Deep_Learning examples to work.

Through "Install KNIME Extensions", I have installed everything found with "deeplearning4J".

But when I try to execute 02_Basic_Learner_View Tutorial, I get an error "ERROR DL4J Feedforward Learner (Classification) 3:14       Execute failed: Could not initialize class org.nd4j.linalg.factory.Nd4j"

What can I do?

The knime log file is not attached as it is. I have deleted the log from the earlier days. 

Hi oncron,

thanks for the log file. Are you also trying to run on GPU? And which Windows version are you running? You could try running KNIME in administrator mode, because it seems that some DLL can't be accessed on your machine.

Best,

David

Hi, thanks for the suggestions! "Running as administrator" does seem to solve the problem. 

Do you know if there is a 32-bit version of DL4J?

DL4J seems to require a 64-bit version of Java. According to https://deeplearning4j.org/quickstart:

Please make sure you have a 64-Bit version of java installed

I also get the error "Could not initialize class org.nd4j.linalg.factory.Nd4j" and I have Java 8, 64-bit installed:

$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

not sure what else to do...
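
One thing worth ruling out here: the java found on the command line is not necessarily the JVM that KNIME starts, since KNIME can run on its own bundled JRE. The sketch below, a minimal check that assumes nothing beyond standard JDK system properties, prints which Java installation is running and whether it looks like a 64-bit VM. The class name is made up for illustration, and sun.arch.data.model is a HotSpot-specific property that may be missing on other VMs.

```java
// Hypothetical helper: reports the running JVM's home directory and bitness.
class JvmBitness {

    // Prefers the HotSpot-specific data model ("32"/"64") and falls back to os.arch.
    static boolean looks64Bit(String dataModel, String osArch) {
        if ("64".equals(dataModel)) {
            return true;
        }
        if ("32".equals(dataModel)) {
            return false;
        }
        // Typical 64-bit values: amd64, x86_64, aarch64. Typical 32-bit values: x86, i386.
        return osArch != null && osArch.contains("64");
    }

    public static void main(String[] args) {
        String dataModel = System.getProperty("sun.arch.data.model"); // may be null
        String osArch = System.getProperty("os.arch");
        System.out.println("java.home           = " + System.getProperty("java.home"));
        System.out.println("java.version        = " + System.getProperty("java.version"));
        System.out.println("os.arch             = " + osArch);
        System.out.println("sun.arch.data.model = " + dataModel);
        System.out.println("64-bit JVM?         = " + looks64Bit(dataModel, osArch));
    }
}
```

To be meaningful, it has to be run with the java executable that KNIME itself uses, not just the one on PATH.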

Hey M42,

can you increase the off-heap memory size in the KNIME -> DL4J preferences?

Best,

Christian

Dear Christian,

now, after increasing the off-heap memory to 4000MB, I am getting the error

ERROR Word Vector Model Reader 0:70:142   Execute failed: Cannot allocate new IntPointer(8): totalBytes = 3G, physicalBytes = 4G

Also, when I load the german.model (https://tubcloud.tu-berlin.de/s/dc4f9d207bcaf4d4fae99ab3fbb1af16/download) from https://devmount.github.io/GermanWordEmbeddings/, the "Word Vector Model Reader" node takes a long time. Is that normal? It has been loading for a couple of minutes now and still isn't finished.

Can you further increase the memory?

Finally the node "finished" loading german.model with

ERROR Word Vector Model Reader 0:70:142   Execute failed: java.io.IOException: Stream Closed

How much should I increase it to?

I increased it to 8000MB, but the german.model still gives the same error. The Google model has been loading for 40 minutes now, so I am stopping the process. I think there is a bug somewhere.

Hi M42,

the DL4J library supports only a limited number of formats. In KNIME these currently are: models saved by the Word Vector Writer node, models in plain text format, and some binary models. I don't know the exact format of the model you want to read, so it may not be supported. The Google model should work, however. Unfortunately, it can take a long time to read because it is so big (maybe even an hour).

Cheers

David
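
For anyone unsure which format a model file is in, a quick look at its first line can help. Files written by the original word2vec tool, in both its text and binary variants, start with an ASCII header line of the form "<vocabulary size> <vector dimensions>". The sketch below checks only for that header, using plain JDK classes. It is a rough heuristic under that assumption, not a statement about what the Word Vector Model Reader node accepts, and the class and method names are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: peeks at the first line of a word vector model file.
class Word2VecHeaderPeek {

    // Reads bytes up to the first '\n' (at most maxLen bytes) and returns them as ASCII text.
    static String readFirstLine(InputStream in, int maxLen) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (buf.size() < maxLen) {
            int b = in.read();
            if (b == -1 || b == '\n') {
                break;
            }
            buf.write(b);
        }
        return new String(buf.toByteArray(), StandardCharsets.US_ASCII).trim();
    }

    // True if the line has the shape "<vocabulary size> <vector dimensions>".
    static boolean looksLikeWord2VecHeader(String line) {
        return line.matches("\\d+ \\d+");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Word2VecHeaderPeek.java <model file>");
            return;
        }
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            String first = readFirstLine(in, 64);
            if (looksLikeWord2VecHeader(first)) {
                System.out.println("word2vec-style header found: " + first);
            } else {
                System.out.println("No word2vec-style header in the first line.");
            }
        }
    }
}
```

A missing header does not prove that the file is unsupported (it may be compressed, for example), but it does suggest the file is not in the plain word2vec text or binary layout.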