When using ExcelReader in a loop, an IllegalStateException occurs at random times.

Hello community!

I’m using v5.2.3 on Windows 10.

As the title says, when I read Excel files with the Excel Reader node in a loop, an “IllegalStateException” error sometimes occurs and the process stops.

If I select the node in the failed state and click “Execute”, it ends normally.

What do you think is causing this?

*This error occurs when I open a workflow created in v4.7.3 in v5.2.3. It does not occur in v4.7.3.
*I extracted the v5.2.3 zip file, opened the same workspace and ran it, and got an error.
*The same problem occurs even if I drag in and configure a new Excel Reader node.

This is the log when the problem occurred.


2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-64-Table Row to Variable Loop Start 3:31 : : NodeContainer : Table Row to Variable Loop Start : 3:31 : Extract 3:96 has new state: EXECUTING
2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : WorkflowManager : Excel Reader : 3:96:1 : Excel Reader 3:96:1 doBeforePreExecution
2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : NodeContainer : Excel Reader : 3:96:1 : Excel Reader 3:96:1 has new state: PREEXECUTE
2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : WorkflowManager : Excel Reader : 3:96:1 : Excel Reader 3:96:1 doBeforeExecution
2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : NodeContainer : Excel Reader : 3:96:1 : Excel Reader 3:96:1 has new state: EXECUTING
2024-06-10 14:14:38,854 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : LocalNodeExecutionJob : Excel Reader : 3:96:1 : Excel Reader 3:96:1 Start execute
2024-06-10 14:14:38,947 : DEBUG : KNIME-Excel-Parser-7 : : ExcelParserRunnable : Excel Reader : 3:96:1 : Excel sheet parsing started
2024-06-10 14:14:39,031 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : ExcelRead : Excel Reader : 3:96:1 : Canceled parser thread
2024-06-10 14:14:39,033 : DEBUG : KNIME-Excel-Parser-7 : : ExcelParserRunnable : Excel Reader : 3:96:1 : Excel sheet parsing problem
java.nio.channels.ClosedByInterruptException
at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.endBlocking(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.readInternal(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.read(Unknown Source)
at org.apache.commons.compress.archivers.zip.ZipFile$BoundedFileChannelInputStream.read(ZipFile.java:1487)
at org.apache.commons.compress.utils.BoundedArchiveInputStream.read(BoundedArchiveInputStream.java:82)
at java.base/java.io.BufferedInputStream.fill(Unknown Source)
at java.base/java.io.BufferedInputStream.read1(Unknown Source)
at java.base/java.io.BufferedInputStream.read(Unknown Source)
at java.base/java.io.SequenceInputStream.read(Unknown Source)
at java.base/java.util.zip.InflaterInputStream.fill(Unknown Source)
at org.apache.commons.compress.archivers.zip.InflaterInputStreamWithStatistics.fill(InflaterInputStreamWithStatistics.java:52)
at java.base/java.util.zip.InflaterInputStream.read(Unknown Source)
at org.apache.commons.compress.archivers.zip.InflaterInputStreamWithStatistics.read(InflaterInputStreamWithStatistics.java:67)
at java.base/java.io.FilterInputStream.read(Unknown Source)
at org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.read(ZipArchiveThresholdInputStream.java:80)
at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:102)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.streamed.xlsx.XLSXRead$XLSXParserRunnable.parse(XLSXRead.java:170)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.ExcelParserRunnable.run(ExcelParserRunnable.java:133)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.knime.core.util.ThreadUtils$3.runWithContext(ThreadUtils.java:525)
at org.knime.core.util.ThreadUtils$ThreadWithContext.run(ThreadUtils.java:340)
2024-06-10 14:14:39,033 : DEBUG : KNIME-Excel-Parser-7 : : ExcelParserRunnable : Excel Reader : 3:96:1 : Closing Excel sheet parser resources
2024-06-10 14:14:39,033 : DEBUG : KNIME-Excel-Parser-7 : : ExcelParserRunnable : Excel Reader : 3:96:1 : Excel sheet parser got interrupted while trying to indicate that parsing stopped
java.lang.InterruptedException
at java.base/java.util.concurrent.locks.ReentrantLock$Sync.lockInterruptibly(Unknown Source)
at java.base/java.util.concurrent.locks.ReentrantLock.lockInterruptibly(Unknown Source)
at java.base/java.util.concurrent.ArrayBlockingQueue.put(Unknown Source)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.ExcelRead.addToQueue(ExcelRead.java:376)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.ExcelParserRunnable.run(ExcelParserRunnable.java:159)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.knime.core.util.ThreadUtils$3.runWithContext(ThreadUtils.java:525)
at org.knime.core.util.ThreadUtils$ThreadWithContext.run(ThreadUtils.java:340)
2024-06-10 14:14:39,035 : DEBUG : KNIME-Worker-65-Excel Reader 3:96:1 : : Node : Excel Reader : 3:96:1 : reset
2024-06-10 14:14:39,035 : ERROR : KNIME-Worker-65-Excel Reader 3:96:1 : : Node : Excel Reader : 3:96:1 : Execute failed: (“IllegalStateException”): null
java.lang.IllegalStateException
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.ExcelRead.close(ExcelRead.java:365)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.ExcelTableReader.readSpec(ExcelTableReader.java:133)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.ExcelTableReader.readSpec(ExcelTableReader.java:1)
at org.knime.filehandling.core.node.table.reader.DefaultMultiTableReadFactory.readIndividualSpecs(DefaultMultiTableReadFactory.java:146)
at org.knime.filehandling.core.node.table.reader.DefaultMultiTableReadFactory.create(DefaultMultiTableReadFactory.java:136)
at org.knime.filehandling.core.node.table.reader.MultiTableReader.createMultiRead(MultiTableReader.java:129)
at org.knime.filehandling.core.node.table.reader.MultiTableReader.getMultiRead(MultiTableReader.java:201)
at org.knime.filehandling.core.node.table.reader.MultiTableReader.readTable(MultiTableReader.java:166)
at org.knime.filehandling.core.node.table.reader.TableReaderNodeModel.execute(TableReaderNodeModel.java:160)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.ExcelTableReaderNodeModel.execute(ExcelTableReaderNodeModel.java:106)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:588)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1297)
at org.knime.core.node.Node.execute(Node.java:1059)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:595)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.nio.channels.ClosedByInterruptException
at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.endBlocking(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.readInternal(Unknown Source)
at java.base/sun.nio.ch.FileChannelImpl.read(Unknown Source)
at org.apache.commons.compress.archivers.zip.ZipFile$BoundedFileChannelInputStream.read(ZipFile.java:1487)
at org.apache.commons.compress.utils.BoundedArchiveInputStream.read(BoundedArchiveInputStream.java:82)
at java.base/java.io.BufferedInputStream.fill(Unknown Source)
at java.base/java.io.BufferedInputStream.read1(Unknown Source)
at java.base/java.io.BufferedInputStream.read(Unknown Source)
at java.base/java.io.SequenceInputStream.read(Unknown Source)
at java.base/java.util.zip.InflaterInputStream.fill(Unknown Source)
at org.apache.commons.compress.archivers.zip.InflaterInputStreamWithStatistics.fill(InflaterInputStreamWithStatistics.java:52)
at java.base/java.util.zip.InflaterInputStream.read(Unknown Source)
at org.apache.commons.compress.archivers.zip.InflaterInputStreamWithStatistics.read(InflaterInputStreamWithStatistics.java:67)
at java.base/java.io.FilterInputStream.read(Unknown Source)
at org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.read(ZipArchiveThresholdInputStream.java:80)
at org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:102)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanLiteral(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanAttribute(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.streamed.xlsx.XLSXRead$XLSXParserRunnable.parse(XLSXRead.java:170)
at org.knime.ext.poi3.node.io.filehandling.excel.reader.read.ExcelParserRunnable.run(ExcelParserRunnable.java:133)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.knime.core.util.ThreadUtils$3.runWithContext(ThreadUtils.java:525)
at org.knime.core.util.ThreadUtils$ThreadWithContext.run(ThreadUtils.java:340)

Hi @locodust6,

thanks for reporting the problem (sorry for the delay, I just saw this).

I am not sure why the exception breaks the execution. If the node is in normal operation (like you said, you can execute it again just fine) and not canceled, the Excel parser (the runnable) should finish successfully. For some reason, it gets interrupted (which means canceled in this instance), which triggers the channel exception, which is apparently not handled correctly. Maybe there is a race condition between the parser and helper (that produces table output) when shutting down everything normally…
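For anyone following along, the "interrupt triggers the channel exception" step can be seen outside KNIME. This is only a minimal sketch, not KNIME code: the class name `InterruptedChannelDemo` and the temp file are made up for illustration. It relies on the documented JDK behaviour that a thread whose interrupt flag is set when it reads from a `FileChannel` gets a `ClosedByInterruptException` and the channel is closed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptedChannelDemo {

    /**
     * Reads from a FileChannel while the current thread's interrupt flag is set.
     * Returns true if the read raised ClosedByInterruptException and left the channel closed.
     */
    static boolean readWhileInterrupted() throws IOException {
        // Hypothetical scratch file, only used for this demonstration.
        Path file = Files.createTempFile("interrupt-demo", ".bin");
        try {
            Files.write(file, new byte[] {1, 2, 3, 4});
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Simulate a cancellation that arrives just before the next read.
                Thread.currentThread().interrupt();
                try {
                    channel.read(ByteBuffer.allocate(4));
                    return false; // not expected: the read went through
                } catch (ClosedByInterruptException e) {
                    // The interrupt closed the channel underneath the reader.
                    return !channel.isOpen();
                } finally {
                    // Clear the interrupt flag so the cleanup below runs normally.
                    Thread.interrupted();
                }
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ClosedByInterruptException raised: " + readWhileInterrupted());
    }
}
```

In the log above the same exception appears deep inside the zip/XML reading stack, which fits a parser thread being interrupted while it was still reading the .xlsx file.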

The problem is that I cannot reproduce it in 5.2.5. I tried:

  • reading a large file (150000 rows) in a counting loop with 100 iterations
  • reading a tiny file (3 rows) in a loop with 10000 iterations

From the log (thanks for attaching it!) I can see that you likely provide different input files via a flow variable. I tried to replicate that as well, just reading the same input file repeatedly, but this fails for neither the large nor the small test file I have…

Might you be able to supply a small example workflow with some example Excel files that reproduces the problem?

Hi @locodust6, whilst I don’t know the cause of the problem, in cases where strange things happen on reading files, I would generally ask the following:

are the files that are being read:
(1) On the local drive of the machine where the workflow is running
(2) On a local network share
(3) Via a url (i.e. a web location)
(4) In a “cloud folder” on the local drive (such as OneDrive, Dropbox, etc.) where the file is being synchronised with a cloud service
(5) From SharePoint
(6) Other?

For anything but option (1), there is always the possibility of some kind of latency occurring which causes problems and might only be apparent during some form of repetitive processing. As mentioned, this may not be the cause but it’s always worth exploring with a view to ruling this out.

(There is a difference in the underlying java software used for reading Excel in KNIME 4.7 and KNIME 5.x which has been apparent from another topic)

Another thought, what happens if you put a Wait node with a time delay of maybe 10 seconds (quite a long delay, just for testing!) immediately after the Excel Reader? Does it then work?

Hi @hotzm,

Thank you for your reply.

I am attaching a sample workflow file and an Excel file that demonstrates the issue.
I placed an Excel file in the “C:\work\knime_read_test” folder and executed it.

ExcelReader_IllegalStateException.knwf (31.3 KB)
24_04_01.xlsx (76.7 KB)

Thank you for your support.

Hi @takbb,

Thank you for your reply.

are the files that are being read:
(2) On a local network share

The file is located on a file server on a local network share.
The same issue occurred whether I connected via wired LAN or wireless LAN.
It also happened when I copied it to a local folder on my PC.

Another thought, what happens if you put a Wait node with a time delay of maybe 10 seconds (quite a long delay, just for testing!) immediately after the Excel Reader? Does it then work?

I only checked it on a local network share, but it was somewhat more stable than before I added the Wait node.
However, an error still occurred midway through.
*At this time, 74 files were being targeted and the error occurred on the 49th file.

Thank you for your support.

Thanks for attaching the workflow. Unfortunately, it does not happen when I execute it on your file :frowning: .
You could try the following (just guessing here, sorry…):

  • Choose “By position” 0 if the sheet is always the first sheet.
  • Copy the selected files into a temporary folder, and then use the Files in Folder option of the Excel Reader to read all files in the temp folder together.

I’ll keep trying to reproduce on my end and look through the code to see if I spot something obvious.

Thank you for checking.
And, sorry for the late confirmation.

What… No matter how many times you try, there’s no error?

  • Choose “By position” 0 if the sheet is always the first sheet.
  • Copy the selected files into a temporary folder, and then use the Files in Folder option of the Excel Reader to read all files in the temp folder together.

In both cases the same error occurred.

For some reason, if I delete all the contents of the cells in columns K, U, and AE, the error doesn’t seem to occur.

Oh no, I never sent my second comment… here’s what I wanted to reply shortly after :crying_cat_face: :

It finally reproduced. The good news is that I think I know what is going on (buggy cancellation handling in table spec guessing), the bad news is that I don’t know if we have a reliable workaround…
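To illustrate what "buggy cancellation handling" can look like in general, here is a rough sketch. It is not the KNIME implementation and not the actual fix; `CancelAwareReadSketch` and all of its members are hypothetical names. It only mirrors the shape visible in the log: a parser thread feeding rows into an `ArrayBlockingQueue`, a consumer that cancels the parser once it has seen enough rows, and a `close()` that wraps the parser's exception in an `IllegalStateException`. The assumption shown is that a strict `close()` reports the deliberate cancellation as an error, while a tolerant one ignores it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

public class CancelAwareReadSketch {

    private final BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1);
    private final AtomicReference<Throwable> parserFailure = new AtomicReference<>();
    private final Thread parser;
    private volatile boolean canceled;

    CancelAwareReadSketch() {
        // Hypothetical parser: produces row numbers until it is interrupted.
        parser = new Thread(() -> {
            try {
                for (int row = 0; ; row++) {
                    queue.put(row); // blocks once the consumer stops taking rows
                }
            } catch (InterruptedException e) {
                parserFailure.set(e); // the interrupt is recorded like any other failure
            }
        }, "sketch-parser");
        parser.start(); // started in the constructor only to keep the sketch short
    }

    int nextRow() throws InterruptedException {
        return queue.take();
    }

    /** The consumer has seen enough rows and stops the parser on purpose. */
    void cancel() {
        canceled = true;
        parser.interrupt();
    }

    /** Strict mode rethrows every parser failure; tolerant mode ignores our own cancellation. */
    void close(boolean tolerateCancellation) throws InterruptedException {
        parser.join();
        Throwable failure = parserFailure.get();
        if (failure != null && !(tolerateCancellation && canceled)) {
            throw new IllegalStateException(failure);
        }
    }

    /** Reads three rows, cancels, closes; returns true if close() threw. */
    static boolean closeFails(boolean tolerateCancellation) throws InterruptedException {
        CancelAwareReadSketch read = new CancelAwareReadSketch();
        for (int i = 0; i < 3; i++) {
            read.nextRow();
        }
        read.cancel();
        try {
            read.close(tolerateCancellation);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("strict close fails:   " + closeFails(false));
        System.out.println("tolerant close fails: " + closeFails(true));
    }
}
```

The point of the sketch is the `canceled` flag: an exception that is merely the consequence of stopping the parser on purpose is a different situation from a real read failure, and `close()` needs some way to tell the two apart.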

Could you try to run the Excel node once with the normal “Limit scanned rows” option (I think you had it set to 177 rows)? After running it once, please open the dialog and change it to 1. I know that sounds super weird, but for me it does not throw the exception in this case even if I loop 10,000 times. Before doing that, it happened reliably with your workflow & file.

In the meantime, I have opened AP-22737 to fix the bug.

I felt some relief knowing that I wasn’t the only one with this problem.

It’s good that the cause has been identified.
Hopefully we can find a way around it.

Wow, that’s so strange.

Thank you for submitting the ticket.
I understand that it is also being considered on GitHub.

Please continue to support us.

Hi @locodust6,

the fix is live on the current nightly build, which you can download here – to be released in the upcoming KNIME Analytics Platform 5.3. It will also be included in the next bugfix release for KNIME Analytics Platform 5.2.

Thanks again for bringing it to our attention and providing a case with which we could reliably reproduce the bug!

Hi @hotzm,

I actually had an issue with this, and when I tried it with the current nightly build, it worked without any problems!
Thank you so much!!

I’m looking forward to the release.
Thank you for the quick resolution and courteous response.

