Batch Execution on Unix-like Systems - File System Permissions

I experimented with running KNIME in batch mode today and hit a strange error while my workflow was starting:
ERROR main BatchExecutor IO error while loading the workflow: No workflow directory at /tmp/BatchExecutorInput30352

After I shifted from invoking the workflow from a knwf file using a workflowFile parameter to invoking it from a directory using a workflowDir parameter, the workflow started successfully.

This makes me believe KNIME first extracts the workflow file into the /tmp directory. The user I used to invoke the workflow on the virtual Linux machine wasn’t allowed to create subdirectories in /tmp there.
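One way to confirm this suspicion is to try creating a subdirectory as the same user. The following is only a minimal sketch: the helper name `can_create_subdir` and the `knime_probe` prefix are made up for illustration and are not part of KNIME.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the current user can create a
# subdirectory inside the given directory (defaults to /tmp).
can_create_subdir() {
    local parent="${1:-/tmp}"
    local probe
    # mktemp -d fails if the parent is missing or not writable by this user.
    probe=$(mktemp -d "${parent%/}/knime_probe.XXXXXX" 2>/dev/null) || return 1
    # Clean up the probe directory again.
    rmdir "$probe"
}

if can_create_subdir /tmp; then
    echo "/tmp: subdirectories can be created"
else
    echo "/tmp: cannot create subdirectories" >&2
fi
```

Running this as the restricted batch user should show whether the missing workflow directory is really a permission problem.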

I haven’t found a way to configure where KNIME extracts the knwf file before it invokes a workflow. It’d be nice to have such an option for cases where /tmp is not accessible, so that one isn’t forced to extract the workflow oneself.

For context: the workflow should run on a client’s Linux machine under a system user with minimal privileges. Write access to /tmp for that user is more than we can get.
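For reference, the manual workaround (extracting the archive oneself and using the workflowDir parameter) could look roughly like the sketch below. It assumes that a knwf file is a zip archive readable by `unzip` and that the archive contains a single top-level folder named after the workflow; both are assumptions worth checking against a real export. The function name `extract_workflow` and the target path /opt/ders/work are hypothetical.

```shell
#!/bin/bash
# Sketch only: unpack a .knwf archive into a directory the batch user may write to.
# Assumption: the .knwf file is a zip archive that `unzip` can read.
extract_workflow() {
    local knwf="$1" target="$2"
    mkdir -p "$target" || return 1
    # -q: quiet, -o: overwrite files from a previous extraction
    unzip -q -o "$knwf" -d "$target"
}

# Hypothetical usage (the target path is a placeholder):
#   extract_workflow /home/ders/sify_int/test_workflow_01.knwf /opt/ders/work
#   /opt/ders/knime -nosplash \
#       -application org.knime.product.KNIME_BATCH_APPLICATION \
#       -workflowDir=/opt/ders/work/test_workflow_01
```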

The batch executor extracts the workflow archive into the standard temp directory that is set in the TEMP environment variable. If that directory isn’t writable, a lot more problems can occur down the line. Changing the TEMP variable should solve the issue. Alternatively, you can set the system property java.io.tmpdir in knime.ini.
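To see which temp directory a JVM actually resolves for a given set of options, the standard `-XshowSettings:properties` switch of the `java` launcher can help. The sketch below wraps it in a hypothetical helper, `show_tmpdir`; it only inspects a plain JVM and does not go through the KNIME launcher, so treat it as a sanity check rather than proof of what KNIME does.

```shell
#!/bin/bash
# Hypothetical helper: print the java.io.tmpdir value a JVM resolves.
# $1 is the java executable (default: java), remaining arguments are
# passed to the JVM, e.g. -Djava.io.tmpdir=/opt/ders/temp.
show_tmpdir() {
    local java_bin="${1:-java}"
    if [ "$#" -gt 0 ]; then shift; fi
    # -XshowSettings writes to stderr, so merge it into stdout for grep.
    "$java_bin" "$@" -XshowSettings:properties -version 2>&1 | grep 'java.io.tmpdir'
}

# Example calls (not executed here):
#   show_tmpdir java
#   show_tmpdir java -Djava.io.tmpdir=/opt/ders/temp
```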


Hi @thor,

thank you for your valuable advice.

So far it looks like neither setting the TEMP environment variable nor setting -Djava.io.tmpdir works. KNIME stubbornly keeps logging: Setting KNIME temp dir to “/tmp”, even though I’ve added the system property and set the TEMP environment variable.

knime.ini:

[ders@phaeton.mnd.ders.cz ~]# cat /opt/ders/knime_4.1.2/knime.ini 
-startup
plugins/org.eclipse.equinox.launcher_1.4.0.v20161219-1356.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.1.551.v20171108-1834
-vm
plugins/org.knime.binary.jre.linux.x86_64_1.8.0.202-b08/jre/bin
-vmargs
-server
-Djava.io.tmpdir=/opt/ders/temp
-Dsun.java2d.d3d=false
-Dosgi.classloader.lock=classname
-XX:+UnlockDiagnosticVMOptions
-XX:+UnsyncloadClass
-XX:+UseG1GC
-Dsun.net.client.defaultReadTimeout=0
-XX:CompileCommand=exclude,javax/swing/text/GlyphView,getBreakSpot
-Xmx2048m
-Dorg.eclipse.swt.internal.gtk.disablePrinting
[ders@phaeton.mnd.ders.cz ~]# 

Shell script updated:
[ders@phaeton.mnd.ders.cz ~]# cat ~/run-test-workflow
#!/bin/bash

workflow_name=test_workflow_01
run_identifier=knime_wf_${workflow_name}_run_$(date +%Y-%m-%d-%H-%M-%S)

export TEMP=/opt/ders/temp

current_dir="$PWD"
nohup /opt/ders/knime  \
	-nosplash \
	-application org.knime.product.KNIME_BATCH_APPLICATION \
	-workflowFile="/home/ders/sify_int/${workflow_name}.knwf" \
	-workflow.variable=delay_in_sec,3,int \
	-workflow.variable=correlation_id,correlation_id_not_set,String \
	-workflow.variable=job_name,"unnamed_job",String \
	-credential="sify_new_vavdev;*****;*****" \
	-reset \
	-preferences=/home/ders/phaeton.epf \
	-consoleLog \
	-destDir=$current_dir/out/${run_identifier}-out/ \
	-vmargs -Dorg.knime.core.maxThreads=2 -Xmx4g > ${run_identifier}.log 2>&1 &

[ders@phaeton.mnd.ders.cz ~]# 

Before I dig deeper into this, I need to be sure: is it possible that KNIME ignores these settings? Please note that it runs on Linux in batch mode.

YES! I’ve got it.

The solution is neither setting the -Djava.io.tmpdir system property in knime.ini nor setting the TEMP environment variable before invoking the KNIME workflow.

It is passing -Djava.io.tmpdir as a system property on the command line when the workflow is invoked in batch mode. This is the command that finally ran successfully:

#!/bin/bash

workflow_name=test_workflow_01
run_identifier=knime_wf_${workflow_name}_run_$(date +%Y-%m-%d-%H-%M-%S)


current_dir="$PWD"
nohup /opt/ders/knime  \
        -nosplash \
        -application org.knime.product.KNIME_BATCH_APPLICATION \
        -workflowFile="/home/ders/sify_int/${workflow_name}.knwf" \
        -workflow.variable=delay_in_sec,3,int \
        -workflow.variable=correlation_id,correlation_id_not_set,String \
        -workflow.variable=job_name,"unnamed_job",String \
        -credential="sify_new_vavdev;*****;*****" \
        -reset \
        -preferences=/home/ders/phaeton.epf \
        -consoleLog \
        -destDir=$current_dir/out/${run_identifier}-out/ \
        -vmargs -Djava.io.tmpdir=/opt/ders/temp -Dorg.knime.core.maxThreads=2 -Xmx4g > ${run_identifier}.log 2>&1 &

You can set it in knime.ini, but as soon as you specify -vmargs on the command line, all VM arguments from knime.ini are ignored (this is how Eclipse works).
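If the VM arguments from knime.ini should still apply when -vmargs is given on the command line, one option is to read them from the file and pass them along explicitly. The sketch below assumes the usual Eclipse-style layout seen in the knime.ini above, where every line after the -vmargs line is one VM argument; `ini_vmargs` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print every non-empty line that follows the
# -vmargs marker in an Eclipse-style .ini file, one argument per line.
ini_vmargs() {
    awk 'seen { if ($0 != "") print } $0 == "-vmargs" { seen = 1 }' "$1"
}

# Example usage (not executed here; requires bash 4+ for mapfile):
#   mapfile -t ini_args < <(ini_vmargs /opt/ders/knime_4.1.2/knime.ini)
#   /opt/ders/knime -nosplash \
#       -application org.knime.product.KNIME_BATCH_APPLICATION \
#       -workflowFile=/home/ders/sify_int/test_workflow_01.knwf \
#       -vmargs "${ini_args[@]}" -Djava.io.tmpdir=/opt/ders/temp
```

Options that then appear twice (for example two -Xmx values) are worth checking by hand, since which one takes effect depends on the JVM.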
