Loop End nodes very slow

Dear Knimers,

is it just me or are Loop End nodes nowadays kind of… slow with long tables?

This is the case both with the default table backend and the columnar table backend.

KNIME 4.7.2 on Ubuntu 22.04, 64 GB RAM, -Xmx16384m

Best
Aswin

KNIME_project5.knwf (13.0 KB)

I should add that the Loop End node takes 20 seconds every loop iteration, not just the last one.

Hi, Aswin,

It seems a bit slow on my Mac as well.
2.9 GHz 8-core Intel Core i7, -Xmx28048m

Another thing I noticed: there is a bug. When I set Counting Loop Start to 2 and set Loop End to “Unique row IDs by appending a suffix”, the Loop End node throws an error…

Btw, can you share your font config? I found your node comment font to be very good-looking. The font settings in KNIME are too scary, too many configurations…

Best,
HaveF

I can confirm it is indeed slow in KNIME 5 (around 10 seconds for me).

I’m on KNIME 4.7.2 on Windows 10, Intel® Core™ i9-8950HK processor with 6 cores, 16 GB of RAM, -Xmx12126m

Didn’t notice it to be particularly slow:

[screenshot not preserved]

I made a mistake: it seems that the Loop End duration DOES depend on the table backend (KNIME 4.7.2). Like the others here, I measured it with the Timer Info node:

Columnar table backend:
[image: Timer Info node results, not preserved]

Default backend:
[image: Timer Info node results, not preserved]

Much faster with the default backend than with the new columnar backend.

So, even though the text in the preferences states that it is “a columnar representation, which gives noticeable speedups over the default format”, this is definitely not always the case. I hope that this can be optimized/improved.

Now I am wondering: I am not yet using KNIME 5, but do users of KNIME 5 still have the option to use the old default backend?

By the way, @HaveF yes I noticed that bug too!!

Best
Aswin

Besides the slowness, I have also found some bugs related to loop nodes (including some other loop nodes) and ArrowColumnStore. Sometimes errors occur when using a loop. I can’t share the workflow, and it didn’t happen very often, so I ignored it until now. This seems like a reasonable time to bring it up. Below is what I found in the log file, in the hope that it provides some clues:

2023-05-06 12:43:41,948 : ERROR : KNIME-Worker-94-Loop End 7:79:27 :  : LocalNodeExecutionJob : Loop End : 7:79:27 : Caught "IllegalStateException": Memory was leaked by query. Memory leaked: (1552)
Allocator(ArrowColumnStore) 0/11576/112640/9223372036854775807 (res/actual/peak/limit)

java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (1552)
Allocator(ArrowColumnStore) 0/11576/112640/9223372036854775807 (res/actual/peak/limit)

	at org.apache.arrow.memory.BaseAllocator.close(BaseAllocator.java:437)
	at org.knime.core.columnar.arrow.AbstractArrowBatchReadable.close(AbstractArrowBatchReadable.java:100)
	at org.knime.core.columnar.arrow.ArrowBatchStore.close(ArrowBatchStore.java:113)
	at org.knime.core.columnar.data.dictencoding.DictEncodedBatchWritableReadable.close(DictEncodedBatchWritableReadable.java:108)
	at org.knime.core.columnar.cache.object.ObjectCache.close(ObjectCache.java:199)
	at org.knime.core.data.columnar.table.WrappedBatchStore.close(WrappedBatchStore.java:222)
	at org.knime.core.data.columnar.table.DefaultColumnarBatchStore.close(DefaultColumnarBatchStore.java:349)
	at org.knime.core.data.columnar.table.ColumnarRowReadTable.close(ColumnarRowReadTable.java:201)
	at org.knime.core.data.columnar.table.AbstractColumnarContainerTable.clear(AbstractColumnarContainerTable.java:198)
	at org.knime.core.data.columnar.table.UnsavedColumnarContainerTable.clear(UnsavedColumnarContainerTable.java:133)
	at org.knime.core.node.BufferedDataTable.clearSingle(BufferedDataTable.java:972)
	at org.knime.core.node.Node.disposeTables(Node.java:1727)
	at org.knime.core.node.Node.cleanOutPorts(Node.java:1691)
	at org.knime.core.node.workflow.NativeNodeContainer.cleanOutPorts(NativeNodeContainer.java:624)
	at org.knime.core.node.workflow.WorkflowManager.restartLoop(WorkflowManager.java:3633)
	at org.knime.core.node.workflow.WorkflowManager.doAfterExecution(WorkflowManager.java:3500)
	at org.knime.core.node.workflow.NodeContainer.notifyParentExecuteFinished(NodeContainer.java:688)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:238)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
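
In case it helps with triage, here is a small sketch for pulling the allocator numbers out of log lines like the one above. It is plain Python, not anything from KNIME or Arrow; the function name `parse_allocator_line` is hypothetical, and the only field meanings it relies on are the labels printed in the message itself (res/actual/peak/limit). The one fact it adds is that 9223372036854775807 equals 2^63 - 1, the largest signed 64-bit value, so I assume the limit is effectively "no limit".

```python
import re

# Matches e.g. "Allocator(ArrowColumnStore) 0/11576/112640/9223372036854775807"
ALLOCATOR_RE = re.compile(
    r"Allocator\((?P<name>[^)]+)\)\s+"
    r"(?P<res>\d+)/(?P<actual>\d+)/(?P<peak>\d+)/(?P<limit>\d+)"
)

LONG_MAX = 2**63 - 1  # largest signed 64-bit integer


def parse_allocator_line(line):
    """Return the allocator name and its four counters, or None if absent."""
    match = ALLOCATOR_RE.search(line)
    if match is None:
        return None
    stats = {key: int(match.group(key)) for key in ("res", "actual", "peak", "limit")}
    stats["name"] = match.group("name")
    # Assumption: a limit of Long.MAX_VALUE means no practical cap was set.
    stats["limit_is_unbounded"] = stats["limit"] == LONG_MAX
    return stats


if __name__ == "__main__":
    line = ("Allocator(ArrowColumnStore) "
            "0/11576/112640/9223372036854775807 (res/actual/peak/limit)")
    print(parse_allocator_line(line))
```

Running this over a whole log file would make it easy to see whether the "actual" value grows from one failure to the next.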

Best,
HaveF

Since nobody from KNIME has answered yet, it looks like this information is getting lost in the noise. Ping @AlexanderFillbrunn: please help us add a ticket to your internal system :grin:

Here is the KNIME guy to answer this :slight_smile:
Sorry for the delay.
To answer the initial question: Yes, Loop End nodes are currently slow and there is a technical reason for that.
The good news is that this will change with 5.1.
What definitely should not happen are the exceptions that @HaveF observed.
I understand that you can’t share your data, but can you perhaps share a rough outline, such as the type of loop used, the number of iterations, the data types, and the table dimensions?

Cheers,
Adrian

Dear @nemad

I think this bug below is the same as or related to the one mentioned by @HaveF :

  • Load the workflow from my original post
  • Change the “Number of loops” in the “Counting Loop Start” node config to 2
  • Run it once
  • After the second iteration it complains about duplicate RowIDs, which is normal.
  • Change the “Row key policy” in “Loop End” node to “Generate new rowIDs”
  • Run the workflow again
  • On my system “Loop End” then gives the error: “Execute failed: container delegate has already been closed”.
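
To make the RowID part of these steps concrete, here is a minimal sketch in plain Python (not KNIME code) of why collecting two iterations unchanged produces duplicate RowIDs, and how the other two policies avoid that. The function `collect_rows`, the policy names, and the `#<iteration>` suffix format are my own assumptions for illustration; it does not reproduce the “container delegate has already been closed” error, which looks like a separate problem.

```python
def collect_rows(iterations, policy):
    """Concatenate per-iteration rows the way a loop end would.

    iterations: list of lists of (row_id, value) tuples, one list per iteration.
    policy: "unmodified", "append_suffix", or "generate_new" (hypothetical names).
    """
    collected = []
    seen_ids = set()
    for iteration_index, rows in enumerate(iterations):
        for row_id, value in rows:
            if policy == "unmodified":
                new_id = row_id
            elif policy == "append_suffix":
                # Assumed suffix format: original ID plus "#<iteration>".
                new_id = f"{row_id}#{iteration_index}"
            elif policy == "generate_new":
                # Fresh running counter, independent of the incoming IDs.
                new_id = f"Row{len(collected)}"
            else:
                raise ValueError(f"Unknown policy: {policy}")
            if new_id in seen_ids:
                raise ValueError(f"Duplicate RowID: {new_id}")
            seen_ids.add(new_id)
            collected.append((new_id, value))
    return collected


if __name__ == "__main__":
    # Two iterations that both emit Row0 and Row1, as with "Number of loops" = 2.
    two_iterations = [[("Row0", "a"), ("Row1", "b")], [("Row0", "c"), ("Row1", "d")]]
    print(collect_rows(two_iterations, "append_suffix"))
    print(collect_rows(two_iterations, "generate_new"))
    try:
        collect_rows(two_iterations, "unmodified")
    except ValueError as err:
        print(err)
```

With the IDs left unmodified, the second iteration fails on its first row, which matches the duplicate RowID complaint in the fourth step.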

Best
Aswin

Hi, @nemad :partying_face: :partying_face: :partying_face:
Apart from the slowness problem @Aswin mentioned, I actually reported two bugs. The first one is related to row IDs. The second one is related to memory leaks.

The first bug is easy to see. For the second one, I was lucky enough to construct a minimal workflow.
The bug doesn’t appear every time, but it should show up fairly easily. I hope it is already fixed in 5.1 :joy:

loop_bug.knwf (23.6 KB)

And I have a third bug to report… :joy: :joy: :joy:
If I have several workflows open and some of them are unsaved, then when I export one workflow I am prompted to save the other workflows as well… Of course it’s not a big deal, but it’s always kind of weird.

Best,
HaveF

Thank you @Aswin and @HaveF,
I opened an issue for the RowID bug and tried out the workflow provided by @HaveF in the 5.1 nightly, but could not reproduce the issue after multiple attempts. There has been a bit of work on that code base, so it might just be that it is already fixed, but I’ll keep an eye out for the bug you described.
Regarding the export, I agree that it is annoying, but I haven’t heard back yet as to why it’s necessary.

Cheers,
Adrian

Dear @HaveF

I also did not get any memory leak errors after running the loop 5 times. Perhaps the problem is macOS-specific?

By the way, my font config is the default font config on Ubuntu Linux. The font is literally called “Ubuntu” (size 11 for node labels).

Best,
Aswin

@Aswin Thanks :partying_face: :partying_face: :partying_face:

Best,
HaveF