Binary Object causes Knime instability

mwiegand · August 3, 2023, 2:38pm

Hi,

I apologize in advance, but this is going to be a rather lengthy bug report.

While in a meeting with @DanielBog (wasn’t it you?) and Peter Schramm (I cannot find his nick), I accidentally stumbled across another bug we chucked away as a glitch (explained further below). At least we concluded it was not worth sharing.

However, I now face a much more severe issue where Knime becomes totally unresponsive, requiring me to force quit it. Previewing data also seems impossible (again, read on). Heap space is still changing, and memory and CPU utilization in the task manager fluctuate too, but I only hear the chime after trying to configure the RegEx Extractor from Selenium (@qqilihq FYI).

I am posting this in this forum as the aforementioned “chucked away” issue was with the XPath node, where the UI became unresponsive for about 30 seconds upon first opening the node configuration. Upon second opening, it was fast as usual.

I presumably tracked it down to the presence of a binary object column. Removing an image column didn’t make a difference, and trying to preview the data never finishes either. Closing the preview window doesn’t work, so I am again required to force quit Knime.

Note: I have extended the example workflow from another bug about the MD5 calculation in the String Manipulation node.

My Knime config I cannot share as it is >5MB and ZIP / RAR etc are not allowed for upload …

Best
Mike

mwiegand · August 4, 2023, 9:45am

Update: This is not directly related to the presence of a Binary Object column. I encountered the same issue, albeit not so drastic that I had to kill Knime, when a column / cell had a lot of characters.

For example, when I extract the HTML page source and try to use the RegEx node, the UI becomes unresponsive for quite some time.

Best
Mike

mwiegand · August 5, 2023, 11:38am

Update: Even with a fresh install (using the ZIP package for parallel testing) with only the absolutely necessary extensions, the performance is problematic, to say the least.

The input table has only three tiny images (52 kb each) but disk IO, even though all was cached, is skyrocketing:

Previewing the table with the three images takes around 30 seconds.

ScottF · August 8, 2023, 9:23pm

Hi @mwiegand -

Thanks again for your meticulous testing work. Let me see if I can get the attention of someone on the dev team.

In the meantime, would you mind posting the workflow associated with your latest post - the one with the three small images? That seems like an interesting place to start reproducing.

carstenhaubold · August 9, 2023, 9:58am

Hi @mwiegand,
Thank you so much for the thorough testing! Just a quick detail question for now: were you using the columnar or row-based (previously called “default”) backend in your tests?
Best,
Carsten

wiswedel · August 9, 2023, 11:40am

I tried both 4.7.6 and 5.1.0 with the attached workflow. I can confirm that the binary objects converted to string(!) will cause trouble when rendering them in a table view. It would render funky bytes such as

��JFIF,,��ExifMM*��1+ 2 2; FG

… just a whole lot of it, which takes time.

As a user I would ask: Why do you do those things converting JPEG streams to plain string? (@mwiegand)

I will file a ticket to limit the number of characters shown by default to prevent this in the output view. That also affects 3rd party extensions (@qqilihq, maybe something to consider for the dialog of the Regex Extractor).

mwiegand · August 9, 2023, 2:41pm

Hi @wiswedel, hi all,

I think the funky characters are kind of normal as these, as far as I recall, are some sort of control characters rather than real text characters. When these strings are displayed, they’ll get interpreted no matter what. Though, I tried copying and pasting the string into Sublime, which can / should be able to cope with that, and I still see them.

Anyways, about the purpose: I am working on a simple approach to extract the XML metadata. It is easily extractable when the binary object is converted to a string. The purpose is to create a workflow to identify duplicated images.
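For illustration only, here is a minimal Python sketch of that idea which pulls the XMP packet straight out of the raw bytes, so the whole image never has to become a string. The function name `extract_xmp` is hypothetical, and the sketch assumes the XMP block is stored uncompressed, in one piece, and wrapped in the usual `<x:xmpmeta>` element; it will not find compressed or split (extended) XMP.

```python
from typing import Optional

# Byte markers of the standard XMP wrapper element.
XMP_START = b"<x:xmpmeta"
XMP_END = b"</x:xmpmeta>"


def extract_xmp(data: bytes) -> Optional[str]:
    """Return the first XMP packet found in `data`, or None if there is none.

    Only the slice between the two markers is decoded, so the binary
    image payload is never turned into a (huge) string.
    """
    start = data.find(XMP_START)
    if start == -1:
        return None
    end = data.find(XMP_END, start)
    if end == -1:
        return None
    packet = data[start:end + len(XMP_END)]
    # XMP is normally UTF-8; replace undecodable bytes instead of failing.
    return packet.decode("utf-8", errors="replace")
```

The resulting string is small (typically a few kilobytes), so rendering it in a table view or feeding it to an XPath step should not cause the kind of load a multi-megabyte string does.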

About your idea to limit the number of characters displayed: maybe lazy loading when the cell size is increased would provide a better experience. It might also still allow copying the entire cell content, compared to a capped string. It also seems the characters are cut off like an excerpt, but based on Knime’s behavior they actually might not be?

What makes me curious, though, is the contradictory behavior of Knime. Whilst it seems only a fixed number of cells is held in memory, the heavy disk throughput suggests the entire data is being read.

@carstenhaubold coming to your question, I am using the default setting, which means I am using the default row-based backend.

@ScottF glad to help. I have added the corresponding images to the originally posted workflow about the MD5 issue and uploaded it to the HUB. I noticed that the UI, upon trying to configure the Row Filter, became totally blocked again, but with only little to no CPU and zero disk I/O. I waited for several minutes but the config dialogue didn’t want to open. The heap space indicator, a CPU utilization of around five to six percent and a disk I/O of 0.1 % did indicate Knime was working. In the end I had to force quit Knime again.

I can consistently reproduce the instability for the Row Filter “3_IMG_0941”. I first attached it to the Table Creator to configure it, then re-wired and executed it. Upon trying to configure it, Knime again “collapsed”.

Upon saving and opening the workflow I observed yet another situation I had not experienced before. Knime did some extensive disk I/O, whereas before, even when opening the same workflow right after app launch, the workflow opened rather quickly. The usual loading window with the progress bar did not appear.

Hope that helps!

Cheers
Mike

wiswedel · August 9, 2023, 4:41pm

Yes, it does help (a bit). Especially because I learned something about PNG file encoding and XML content contained in it (can’t really help with it – it sounds as if that should be its own custom node).

Trying to split concerns…

1. Table cells showing only part of the content

Lazy loading new content would be an option but is a long stretch in terms of effort. Clipping the content seems most sensible for now - it does copy the entire content of the cell when you copy (Ctrl-C). I’d like to note that this limitation has been in KNIME since the very beginning. It always tried to render the entire content - interesting that it’s only coming up now as an issue. (We intend to make the change in 5.2.0 - it should be part of the nightly build tomorrow already.)
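To make the clipping idea concrete, here is a rough Python sketch. It is not the actual KNIME implementation; the names `clip_for_display` and `Cell`, as well as the limit of 1000 characters, are made up for illustration. The point is only that the renderer sees a bounded string while copying still returns the full value.

```python
from dataclasses import dataclass


def clip_for_display(text: str, max_chars: int = 1000) -> str:
    """Return at most `max_chars` characters, with an ellipsis if clipped."""
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "…"


@dataclass(frozen=True)
class Cell:
    """Hypothetical table cell holding the complete string value."""
    value: str

    def render(self, max_chars: int = 1000) -> str:
        # What the table view paints: bounded, so huge strings stay cheap.
        return clip_for_display(self.value, max_chars)

    def copy_text(self) -> str:
        # What Ctrl-C puts on the clipboard: the complete content.
        return self.value
```

With this split, a cell containing millions of characters costs the view only `max_chars` characters to lay out, independent of the real cell size.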

2. The unexpected I/O

That’s still a mystery. I could not reproduce this with the attached workflow. It just runs fine … until you open an output view or configure Palladian’s Regex Extractor.
In order to diagnose further, could you extract a call stack and attach it here? (I learn something about PNG, you learn something about diagnosing JVM processes - let me know if you need help!)
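One possible way to get such a call stack is a thread dump via the JDK's `jstack` tool. The small Python wrapper below is only a sketch: the function names are hypothetical, and it assumes a JDK `jstack` binary is on the PATH and that you have looked up the process ID of the running KNIME JVM yourself (for example in the task manager).

```python
import subprocess
from typing import List


def thread_dump_command(pid: int) -> List[str]:
    """Build the jstack command line for the JVM with the given process ID."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -l additionally prints information about locks held by each thread.
    return ["jstack", "-l", str(pid)]


def capture_thread_dump(pid: int, timeout_s: float = 30.0) -> str:
    """Run jstack against the process and return the dump as text."""
    result = subprocess.run(
        thread_dump_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError if jstack exits with an error
    )
    return result.stdout
```

Taking two or three dumps a few seconds apart while the UI is frozen usually shows whether the same thread is stuck in the same place each time.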

Thanks!
– Bernd

mwiegand · August 10, 2023, 9:20am

I have found something of interest. The total amount of data of all images is 3.25 GB.

However, after saving a Knime table file, even after filtering out all thumbnails, which reduces the amount to just 1.3 GB, the actual table file is a whopping 23 GB in size! Removing the image column reduces the size to a more reasonable 2.7 GB, which is easily explained by the presence of the binary object and its string representation.

Therefore, I assume Knime converts the images to another format while reading and writing them; the only file format that could inflate the size to that extent would be TIFF. Or something else is happening here.

This kind of also explains the heavy IOPS but does not explain that throughput when only a few images are present in the table (not quite certain if I mentioned that here as I posted quite a few things in the recent past).

PS: I removed the image column and saved the data in a Knime table, which I loaded for testing purposes in the workflow for another post. Upon scrolling to the columns containing the binary object as well as its string representation, the preview still freezes. So the instability issue can be described as:

Unreasonably large table sizes when images are present in a column
Data preview freezes in presence of binary object as a string representation. The presence of just the binary object is totally fine, though. As follows the screenshot displaying the instability issue.

Best
Mike

carstenhaubold · August 11, 2023, 8:10am

Data preview freezes in presence of binary object as a string representation. The presence of just the binary object is totally fine, though. As follows the screenshot displaying the instability issue.

Just to confirm here as well as in “Columnar Backend: Data Preview takes much more time to display or never finishes” (#6 by carstenhaubold): the fix by @wiswedel to restrict the string length for rendering does mitigate this problem. You can try it out in the current KNIME nightly build.

mwiegand · August 11, 2023, 10:05am

Thanks Carsten, but unfortunately I’ve got more to add as Knime becomes almost unusable.

I extracted the XML string from just ten Binary Objects (16 MB in total), converted it to XML and extracted the XMP metadata (EXIF / IPTC data). I saved each result as a Knime table file. I reported that Knime seems to seriously inflate the data size when images are present. I noticed this also happens for String data.

Adding to this I continue to face more inconsistencies like:

Node configuration (XPath with classic UI) not opening while a data preview is being rendered, although the Column Filter, which already got the new web-based UI, can be opened
Data preview displays old data (I filtered for one row with only the XML result)

Overall, I feel Knime 5.1 provides a really inconsistent experience. Whilst it contains some quality-of-life updates, the mixed experience with the many surprises and regressions (at least for me) makes this update feel wonky and, sorry to be frank about this, by far the worst in years.

Maybe there is too much going on, with the modern UI beta sneaking into the classic UI and, in parallel, the columnar backend beta. I know that, for example, the columnar backend should not interfere when disabled, but the fact that the code is still present can cause unintended issues. Like modern UI designs sneaking into the classic UI.

Back when I managed a global infrastructure we kept the code base and versions clean when upgrading from PHP 5.6 to 7. For many reasons like stability, comparability and manageability and more, I’d personally vote to keep the code base clean by properly splitting these branches.

Summarizing the past few weeks, I have never spent so much time (thankfully I am on vacation atm) trying to support Knime and regain operability. Currently, I am constantly running into dead ends with Knime, with new issues popping up the more I investigate and try to find workarounds that allow me to continue working.

Bear with me if that sounds like an accusation or an unnecessary complaint, but I really just want to help you, paying something back for what Knime allowed me to accomplish in the past years.

Cheers
Mike