GroupBy hanging at 60% [SOLVED]

On KNIME 2.10.1 or 2.10.2 the GroupBy node races to 60% completion and then crawls extremely slowly to the finish.  This happens with both large and small datasets, but not every time.  A look at my server's resource monitor shows large fluctuations in memory usage and knime.exe alternating between "Running" and "Not Responding".  I work with an 8-core Xeon processor and 32 GB of RAM, so this seems like very strange behavior for a dataset of 7 columns and 6000 rows.

-Brian Muchmore

I assume you have changed the knime.ini file to use some of this 32 GB of RAM? By default it is set to 1 GB.
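
For anyone unsure what their current limit is, here is a minimal sketch that pulls the JVM heap setting out of the text of a knime.ini file. It assumes the limit is given as a standard `-Xmx` line on its own, as is usual for Eclipse-style ini files; `read_xmx` and the sample contents are made up for illustration.

```python
import re

def read_xmx(ini_text):
    """Return the JVM max-heap value (e.g. '1024m') found in ini_text, or None."""
    for line in ini_text.splitlines():
        # Match a line that consists only of -Xmx followed by a size.
        match = re.fullmatch(r"-Xmx(\d+[kKmMgG]?)", line.strip())
        if match:
            return match.group(1)
    return None

# Illustrative contents only; a real knime.ini has more lines than this.
sample = "-vmargs\n-Xmx1024m\n"
print(read_xmx(sample))
```

To check a real installation, read the knime.ini from the KNIME installation folder and pass its text to the function.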

Simon.

Yes, I give KNIME access to most of my RAM.  

This is a weird occurrence because nobody else seems to have my problem, and I haven't noticed it before, although I utilize GroupBy a lot more now than I have previously.

I have never had problems and use it all the time, often on a much larger number of rows. 

Is there some setting you have ticked which may not be a common setting? Or do you have a specific column data type that may be causing the problem, i.e. one of the molecule data types, images, or some other less frequently used column type?

Simon.

Nope, the data is quite bland, as are the GroupBy operations.  I wrote this to see if anyone would come out of the woodwork who was experiencing the same thing I was.  My gut feeling is that something on my system is causing this problem, but I thought it could be a hidden bug as well.

Sometimes I have had KNIME workflows hang like this. Often, if you close the workflow and restart KNIME, it clears some memory problem and then it works better.

Also worth checking what your anti-virus software is doing - if tables are being written to disk, even temporarily, then sometimes this can cause sudden slow-downs.

Steve

Maybe check disk space in your temp directory too, and clear it out if required?  KNIME can sure eat it up!
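
A quick way to check, as a rough sketch: this reports the free space on the volume holding the system temp directory. It assumes KNIME is using the system default temp location; if a different temp directory is configured in the KNIME preferences, pass that path instead. `free_gib` is just a helper name for this example.

```python
import shutil
import tempfile

def free_gib(path):
    """Free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 2**30

# Assumption: KNIME writes its temporary tables to the system temp directory.
temp_dir = tempfile.gettempdir()
print(f"{temp_dir}: {free_gib(temp_dir):.1f} GiB free")
```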

Alastair

Thank you sincerely for the suggestions, although none of them fix it.  No matter what I try, knime.exe freezes up and then unfreezes repeatedly when I run the GroupBy node.  Oh well, it still runs, just slowly.

Could you share the workflow to help diagnose the problem?

Simon.

Simon, your original advice stuck with me, and you were right: I did something to the node I should not have.  What I was doing was setting the "Maximum unique values per group" to a billion, since I kept hitting the default limit of 10,000.  Suffice it to say, that was a mistake, and I kept propagating that mistake by copying the same GroupBy node over and over.
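
To illustrate what a cap like this is for, here is a toy model in Python. It is not KNIME's actual implementation, only a sketch of the general idea, and `unique_per_group` is a made-up name: unique values are collected per group, and a group is abandoned once it exceeds the cap, so the memory held per group stays bounded. With the cap set to a billion, nothing is ever abandoned and every distinct value stays in memory.

```python
def unique_per_group(rows, max_unique):
    """Collect unique values per group, giving up on groups that exceed the cap."""
    seen = {}           # group key -> set of values seen so far
    overflowed = set()  # groups that went over max_unique
    for key, value in rows:
        if key in overflowed:
            continue
        values = seen.setdefault(key, set())
        values.add(value)
        if len(values) > max_unique:
            overflowed.add(key)
            del seen[key]  # release the values held for this group
    return seen, overflowed

rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 1)]
kept, skipped = unique_per_group(rows, max_unique=2)
print(kept)     # {'b': {1}}
print(skipped)  # {'a'}
```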

Thank you to everybody for their responses.

-Brian
