KNIME Parallel Node Performance Issues

I am using a workflow to process all trips on our local bus transit system, which comes to several million rows of data.  Processing it as a single table is obviously too much, so I filter it and then use loops to break it into smaller chunks.  To use multiple cores on my computer, I run several of these loops simultaneously, but I am hitting a mysterious performance bottleneck.  When I spin up each loop after the first, the entire system becomes much less responsive.  The individual loops each appear to slow down as though they were competing for a resource, but I cannot identify what that resource is.  The problem shows up most prominently in Recursive Loops, but also occurs in Group Loops (with sorted data) and in Counting Loops (with a rule filter added to make them function like a Group Loop).  I could very much use some advice on where/what the bottleneck might be.

My system:  Windows 7 Home Premium (64-bit), Phenom II, 4 cores @ 3.2 GHz, 12 GB of RAM.   KNIME's temp folder is on an SSD, but due to limited space, the Windows swapfile "pagefile.sys" is on a slower HDD.  For portability, the workspace is saved to a fast USB 3 stick (FlashBench showed it averaging ~100 MB/sec read and write).

What I am seeing:  CPU usage bounces between ~20% and ~40%, only about 5-6 GB of RAM is in use, and I am not seeing much I/O to any of the drives (USB, SSD, or HDD).

What I have tried: 

  • Updated KNIME and Java to the latest versions, with no apparent change.
  • Set -Xmx10g in knime.ini, with no apparent change.
  • Removed all "Community Nodes" and "KNIME Labs" nodes (the workflow now only contains nodes from "IO", "Manipulation", and "Workflow Control").
  • Ran the same workflow on other computers (a 4-threaded i7 system with 16 GB of RAM and a 12-threaded Xeon system with 32 GB of RAM).  It ran only slightly faster on these machines.
  • Converted several of the nodes to use "Simple Streaming".  This slightly improved the speed, but also caused those nodes to occasionally fail due to "Input errors".
  • Turned off my anti-virus program, with little impact.
  • In Preferences, set "Log File Log Level" to "ERROR" and "Maximum working threads for all nodes" to 100.
  • Used Counting Loops with a filter instead of Group Loops.
  • Rebuilt the raw data (in case the data itself had become corrupt), to no effect.
  • Set nodes to "Keep all in memory".  This seemed to have no effect on performance other than increasing RAM usage.
  • Rebuilt the workflow from scratch (in case there was a corrupt setting or something), with no effect.
  • Used filters to reduce the number of rows by about two thirds, with minimal impact.
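For reference, a couple of the items above correspond to knime.ini entries. A minimal sketch of the relevant lines, assuming a 10 GB heap and a temp directory on the SSD (the path is a placeholder for your setup; note that knime.ini takes one bare JVM argument per line and does not support comments):

```
-Xmx10g
-Djava.io.tmpdir=C:/knime_temp
```

(-Djava.io.tmpdir is the standard JVM temp-directory property; KNIME also exposes a temp-folder setting in its own preferences, which takes precedence if set.)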

Hi Notabrick,

One more thing you could try is to increase the number of cells held in memory; see our FAQ:
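A minimal sketch of that setting as a knime.ini entry (the property name comes from the KNIME FAQ; the value of one million cells is only an example to tune against your heap, and the line must sit on its own in the file):

```
-Dorg.knime.container.cellsinmemory=1000000
```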