Expression node randomly fails due to memory limitations?

Hi all!
I am using an expression node to check whether a string (from one column) is contained in another string in another column.
In concrete terms: I want to check if a variable name is present in a chunk of JavaScript code (the longest of which has 24000 characters). The expression is:

empty_to_missing(if(contains($["CleanCode"], string($["AttributeID (List*)"])), $["AttributeID (List*)"], ""))
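For readers less familiar with the expression syntax, the per-row logic above can be sketched in Python roughly as follows. This is only an illustration: the function name `attribute_if_present` is made up here, `None` stands in for a KNIME missing value, the result is returned as a string rather than the original column value, and missing input cells are not handled.

```python
from typing import Optional


def attribute_if_present(clean_code: str, attribute_id: object) -> Optional[str]:
    """Return the attribute ID (as a string) if it occurs in clean_code, else None.

    Hypothetical helper mirroring the expression:
    empty_to_missing(if(contains(code, string(attr)), attr, ""))
    """
    needle = str(attribute_id)  # plays the role of string(...)
    if not needle:
        # An empty result becomes a missing value (empty_to_missing).
        return None
    # Plain substring test, like contains(...); no regular expressions involved.
    return needle if needle in clean_code else None
```

For example, `attribute_if_present("var fooBar = 1;", "fooBar")` gives `"fooBar"`, while a row whose code does not mention the name gives `None`.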

Because the table is the cross product of the two inputs, it has 4.6 million rows.

Now the problem is that the node sometimes runs through and sometimes crashes with an error (“Execute failed: Problem occurred when writing column data.”) at a random percentage, even though the source data hasn’t changed.

The log file shows some "Failed to asynchronously serialize object data." and "Unable to allocate buffer of size 134217728 due to memory limit. Current allocation: ..." messages.

I am running this on a Mac Studio M2 Ultra with 64 GB of RAM and ‘-Xmx48g’ set in the knime.ini.

Thanks for any hint. I also tried to achieve this with regexMatch in a String Manipulation node, but that wasn’t reliable either.

Hi,

I know it’s not a solution to the actual problem, but have you tried to process the table in chunks using the “chunk loop” construct similar to this one here?
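To illustrate the idea outside of KNIME, here is a minimal Python sketch of chunk-wise processing. It is not how the Chunk Loop nodes are implemented; the row layout `(CleanCode, AttributeID)`, the function names and the default chunk size are assumptions made up for this example. The point is only that each step holds at most `chunk_size` rows at a time.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical row layout: (CleanCode, AttributeID)
Row = Tuple[str, str]


def chunked(rows: Iterable[Row], chunk_size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most chunk_size rows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def match_in_chunks(rows: Iterable[Row], chunk_size: int = 100_000) -> Iterator[Optional[str]]:
    """Run the substring check chunk by chunk; None models a missing value."""
    for chunk in chunked(rows, chunk_size):
        # Only this chunk is materialized at once, which bounds peak memory.
        for code, attr in chunk:
            yield attr if attr and attr in code else None
```

A smaller chunk size lowers the peak memory per iteration at the cost of more loop overhead, so the value would need some experimentation.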

I’m wondering how huge your JavaScript is. Maybe it would be an idea to split that up too.

Andi


Hi @roberting,

The Expression node relies on KNIME’s Columnar Backend, which allocates data off-heap instead of on the Java Heap (where Xmx applies). Because the off-heap region is auto-configured based on total system memory and your configured Xmx, a large Xmx can shrink the off-heap allocation. Increase the off-heap memory limit as described in KNIME’s documentation, and consider lowering Xmx if needed. We know this is not ideal and are working on improving it. For more details, see this forum post.
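As a rough back-of-the-envelope check, the numbers from this thread can be put into a small Python sketch. It is deliberately simplified: the exact formula KNIME uses to auto-configure the off-heap limit is not given here, so the function below only computes an upper bound on what is left outside the Java heap, which the operating system and other processes share as well. The function name is made up, and "64GB" is treated as 64 GiB.

```python
def memory_left_outside_heap(total_ram_gib: float, xmx_gib: float) -> float:
    """Upper bound (GiB) on memory available outside the Java heap.

    Simplifying assumption: everything not reserved by Xmx could, at best,
    go to off-heap data. The real limit is lower, since the OS and other
    processes need memory too.
    """
    if xmx_gib > total_ram_gib:
        raise ValueError("Xmx cannot exceed total RAM")
    return total_ram_gib - xmx_gib


# Values taken from the original post: 64 GB machine, Xmx48g.
remaining_gib = memory_left_outside_heap(64, 48)  # 16

# The buffer from the log message, converted from bytes to MiB.
failed_buffer_mib = 134_217_728 / 1024**2  # 128.0
```

So with Xmx48g, at most about 16 GiB is left for everything outside the heap, and the failing request was for a single 128 MiB buffer. Lowering Xmx increases the first number.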
