I'm having big problems running a WF created in 2.12.2 on the new KNIME 3.1, concerning the 'Parallel Chunk' nodes. When the 'Parallel Chunk' part of my WF was executed, KNIME 3.1 got stuck (it stopped responding).
Simplifying my WF resulted in the following WF: 'Table Creator'->'Parallel Chunk Start'->'Java Snippet'->'Parallel Chunk End', with the following settings:
Java Snippet: A column 'square' is appended, calculated with 'out_square = c_column1 * c_column1;'
Parallel Chunk End: Default Settings
Executing this WF in KNIME 3.1 takes 22 seconds - running on a Windows laptop, i7 with 8GB RAM (4 GB assigned)! Moreover, it takes ~21 seconds until the 'Parallel Chunks' meta node pops up.
The same WF executed in KNIME 2.12.2 takes 1.5 seconds!
Impressive figures! With a freshly booted Windows machine, my absolute figures get smaller, but the basic problem still remains.
After a reboot I'm running KNIME 2.12.2 and KNIME 3.1 in parallel with the same WF as described above - I have different workspaces for 2.12 and 3.1. Using the 'Timer Info' node in each of my test runs (several runs per KNIME version), I get the following execution times per node (avg. of 10 runs):
Table Creator: 7.5 in 2.12 and 40.1 in 3.1
Java Snippet: 155 in 2.12 and 907 in 3.1
Parallel Chunk Start: 76.2 in 2.12 and 146.6 in 3.1
Parallel Chunk End: 521 in 2.12 and 2704 in 3.1
As said, the figures are the ratio of 'Execution time since start / Nr of executions since start'. From a high-level view, the WF in 2.12 finishes 'immediately', while in 3.1 it takes more than 6 seconds for each run!
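To make the comparison easier to read, here is a small sketch that turns the averages listed above into slowdown factors (3.1 average divided by 2.12 average). The class and method names are made up for illustration; only the eight numbers come from my measurements.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class TimerRatios {

    /** Average per execution: total time since start divided by number of executions. */
    public static double averagePerExecution(double totalTime, int executions) {
        if (executions <= 0) {
            throw new IllegalArgumentException("executions must be positive");
        }
        return totalTime / executions;
    }

    /** Factor by which the new average exceeds the old one. */
    public static double slowdown(double oldAverage, double newAverage) {
        return newAverage / oldAverage;
    }

    public static void main(String[] args) {
        // Averages copied from the list above: {value in 2.12, value in 3.1}
        Map<String, double[]> averages = new LinkedHashMap<>();
        averages.put("Table Creator", new double[] {7.5, 40.1});
        averages.put("Java Snippet", new double[] {155, 907});
        averages.put("Parallel Chunk Start", new double[] {76.2, 146.6});
        averages.put("Parallel Chunk End", new double[] {521, 2704});

        for (Map.Entry<String, double[]> e : averages.entrySet()) {
            double factor = slowdown(e.getValue()[0], e.getValue()[1]);
            // Locale.ROOT keeps the decimal point independent of the OS language
            System.out.printf(Locale.ROOT, "%-22s %.1fx slower in 3.1%n", e.getKey(), factor);
        }
    }
}
```

With these inputs, three of the four nodes come out at roughly five to six times slower in 3.1, and 'Parallel Chunk Start' at roughly twice as slow.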
Looking at the time values per run - compared with the sums - I found that in 2.12.2 the nodes 'Table Creator' and 'Java Snippet' performed much faster than the average after the very first run, while in KNIME 3.1 each run was about the average! Both Chunk nodes showed a stable runtime, but they are much faster in 2.12 than in 3.1!
For example:
Table Creator: After the very first run, in KNIME 2.12.2 I got values of about 1 or 2, while in KNIME 3.1 it is stable around 40!
Java Snippet: After the very first run, in KNIME 2.12.2 I got values about 70, while in KNIME 3.1 I have about 800
As both KNIME versions are running in parallel (I executed the WF in one after the other), they have the same resource situation - one difference is that I have defined JAVA JDK 1.8.0_045 as the JRE in KNIME 2.12.2, while I'm using the standard JRE in KNIME 3.1.
Moreover, as my PC is a company machine, a lot of 'background' services, proxies, etc. are configured! (But this applies to both KNIME versions.)
Would you mind sending me the workflow (or uploading it here)? I can forward it to two of our deep deep core developers and check if they can find the problem.
Could you extract a jstack for us while the 3.1 workflow is running? (Here Bernd explains how it is done on Windows: https://tech.knime.org/forum/knime-general/updating-knime-0#comment-25647)
I can reproduce the behaviour mentioned above also on my private Win7 Laptop (i7, 8GB, 5GB assigned). Executing the 3.1 WF - as attached above - takes ~13 sec.
I've created the jstack trace you requested while executing my 3.1 WF on my private PC. Hopefully it is correct.
Attached please find the zipped output file I created by calling 'jstack -l <PID>' within a 'FOR ... DO ... >> chunk.txt' batch loop while running my WF as described above. Note that I started the loop before the WF execution, so the first 5-6 stack traces will be the same; in total, I think, jstack was called about 60 times!
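In case someone wants to repeat this without a batch file, the same sampling idea can be sketched in Java. This is not the loop I actually used: the class name, the 60 samples and the 250 ms pause are assumptions for illustration; only 'jstack -l <PID>' and the output file name chunk.txt are taken from my description. It assumes jstack (shipped with the JDK) is on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class JstackSampler {

    /** Builds the command line used above: jstack -l <PID>. */
    public static List<String> jstackCommand(long pid) {
        return Arrays.asList("jstack", "-l", Long.toString(pid));
    }

    /** Runs jstack repeatedly and appends every thread dump to the given file. */
    public static void sample(long pid, int samples, long pauseMillis, File out)
            throws IOException, InterruptedException {
        for (int i = 0; i < samples; i++) {
            ProcessBuilder pb = new ProcessBuilder(jstackCommand(pid));
            pb.redirectErrorStream(true); // keep jstack error messages in the same file
            pb.redirectOutput(ProcessBuilder.Redirect.appendTo(out));
            pb.start().waitFor();
            Thread.sleep(pauseMillis); // hypothetical pause between two dumps
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java JstackSampler <PID of the KNIME Java process>");
            return;
        }
        long pid = Long.parseLong(args[0]);
        // 60 samples and 250 ms are assumed values, not the ones from the batch loop
        sample(pid, 60, 250, new File("chunk.txt"));
    }
}
```

Start it shortly before executing the WF and pass the process ID of the running KNIME Java process as the only argument.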
Furthermore, I tested my 3.1 WF on a freshly installed, more powerful Win10 Pro PC (i7, 16GB, 8GB assigned) - with the same effect: the simple WF takes ~10 seconds. Summing up, I could reproduce my results on three different Windows versions (Win 7 Home Premium, Windows 7 Professional, Windows 10 Pro), on three different machines. So I'm wondering why you are not able to reproduce this behaviour.
Something seems to be wrong with your file systems. All stack traces in which the loop is executing show that the process is waiting to write an entry to the log file, e.g.
"KNIME-Worker-11" #111 daemon prio=3 os_prio=-1 tid=0x000000002a1d6800 nid=0x1f4c runnable [0x0000000044bae000]
java.lang.Thread.State: RUNNABLE
at java.io.WinNTFileSystem.getLength(Native Method)
at java.io.File.length(File.java:974)
at org.knime.core.util.LogfileAppender.subAppend(LogfileAppender.java:186)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:160)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
- locked <0x0000000682fab9c8> (a org.knime.core.util.LogfileAppender)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
- locked <0x0000000682fade68> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.debug(Category.java:260)
at org.knime.core.node.NodeLogger.debug(NodeLogger.java:570)
at org.knime.core.node.Node.reset(Node.java:1462)
For some reason it seems to take ages to determine the size of the log file via an operating system call. Are you using a virus scanner that may interfere?
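If you want to check this outside of KNIME, a rough probe like the following could help. It simply times repeated File.length() calls, which is the call at the top of the stack trace above. This is only a sketch: the class name and the number of calls are made up, and it is not what KNIME does internally beyond that one call. Pass the path of the log file as the first argument and compare the result with the virus scanner switched on and off.

```java
import java.io.File;
import java.util.Locale;

public class FileLengthTimer {

    /** Returns the average time in nanoseconds of one File.length() call. */
    public static double averageNanosPerLengthCall(File file, int calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += file.length(); // same call as in LogfileAppender.subAppend in the trace
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            System.out.println(sink); // only here so the loop result is used
        }
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        // Without an argument the current directory is probed, just to show the output format
        File target = new File(args.length > 0 ? args[0] : ".");
        int calls = 10_000; // assumed sample size
        double nanos = averageNanosPerLengthCall(target, calls);
        System.out.printf(Locale.ROOT, "%s: %.1f microseconds per File.length() call%n",
                target.getAbsolutePath(), nanos / 1000.0);
    }
}
```

A single call should normally be very cheap, so a clearly higher figure with the scanner active would point in the same direction as the stack traces.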
On my private machines I could increase the WF speed by deactivating virus scans (I'm using MS Security Essentials) - either deactivating 'real-time protection' or excluding KNIME.EXE or the knime-workspace from scanning worked.
However, on my company machine - where I first noticed this behavior - I cannot change any of these settings. And on this machine KNIME 2.12 is much faster than KNIME 3.1! (see posts above). Assuming that the virus scan is interfering - what is different in 3.1 compared to 2.12?