Failure to read large datasets

Hi,

I often run into the same error message when dealing with large datasets:

ERROR     String Manipulation     Execute failed: Exception while accessing file: "C:\Users\tdsuser\KNIME_Workflows\Wartmann2\Iterate List of Files (#4)\Loop End (#3)\port_1\data.zip": invalid entry size (expected 2251885780355920 but got 6282670088 bytes)

I see this problem (using KNIME 2.7.2) on a Win7 system, but not on a Mac. Do you have any suggestions?

Thanks,

Marc

Hi Marc, that sounds like a lot of data, big data. Just to get a feeling for the amount of data in all the files you are concatenating in the Loop End, can you give us a number? It sounds like an integer overflow, because KNIME's DataTables are currently restricted to Integer.MAX_VALUE. Are you running the win32 or the win64 version of KNIME? I assume the zip library is system dependent; that is, Mac and Windows might behave completely differently.
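To make the overflow suspicion concrete, here is a minimal sketch in plain Java. It only checks the "got" byte count from the error message against two limits: Integer.MAX_VALUE and the largest size a classic (non-ZIP64) zip entry can record. The class and method names are made up for illustration and are not KNIME API; whether KNIME's zip handling actually goes through a 32-bit value here is an assumption, not something confirmed in this thread.

```java
public class SizeCheck {
    // Largest size a classic (non-ZIP64) zip entry can record: 2^32 - 1 bytes.
    static final long ZIP32_MAX = 0xFFFFFFFFL;

    // True if the value survives a round trip through a 32-bit int.
    static boolean fitsInInt(long n) {
        return n >= Integer.MIN_VALUE && n <= Integer.MAX_VALUE;
    }

    // True if an entry of this size needs the ZIP64 extension.
    static boolean needsZip64(long sizeBytes) {
        return sizeBytes > ZIP32_MAX;
    }

    public static void main(String[] args) {
        long actual = 6282670088L; // "got" value from the error message above

        System.out.println("fits in int:  " + fitsInInt(actual));
        System.out.println("needs ZIP64:  " + needsZip64(actual));
        // A narrowing cast does not fail, it silently wraps the value.
        System.out.println("cast to int:  " + (int) actual);
    }
}
```

If the size exceeds both limits, any code path that stores it in a 32-bit field would report a different number than the one actually written, which would be consistent with an "invalid entry size" mismatch.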

Hi Gabriel, it's a 12 million line file. My biggest has been 750 million lines, so yes, I do have to handle large amounts of data. I run the win64 version. I was considering the Pervasive partner solution for our cluster. Does it run without a graphical interface? Sorry, this is not the right place to ask that question.

Using Pervasive as an alternative executor would be an option. They provide an add-on to KNIME for big data processing (basically an update site, which of course can also be used when running KNIME in headless mode); further details are available here. However, I would still like to understand the problem from your original post, where you seem to get an exception when processing several million rows.

Yes, from time to time I get the error about the number of bytes in large files. It happens especially when the data was loaded a while ago and I only work on it a few days later. It does not seem to happen on "fresh" data.

Thanks a lot for following up. If this happens again, can you please check the knime.log located in your workspace/.metadata/knime to see if you get a detailed stack trace? Cheers, Thomas
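If the log is large, a small helper can pull out the relevant part. The following is only a sketch: it assumes the log is a plain text file, that the stack trace follows directly after the line containing the message, and that 30 lines of context are enough. The class name, method name, and search string are chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LogScan {
    // Returns every line containing `needle`, each followed by up to `after`
    // subsequent lines (where a stack trace would normally appear).
    // Matches that are close together may yield repeated lines.
    static List<String> linesAfterMatch(List<String> lines, String needle, int after) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(needle)) {
                int end = Math.min(lines.size(), i + 1 + after);
                out.addAll(lines.subList(i, end));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java LogScan <path to knime.log>");
            return;
        }
        // ISO-8859-1 decodes any byte sequence, so odd characters do not abort the read.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (String line : linesAfterMatch(lines, "invalid entry size", 30)) {
            System.out.println(line);
        }
    }
}
```

Run it with the path to the log file as the only argument and post the output here.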

Hi Thomas,

Will do, but it might take some time until that happens again.

Cheers, Marc