Reading large XML files

Hi All,

I am running into an issue with KNIME (version 4.3.3 on Linux) crashing when trying to read a large XML file, even with quite a bit of RAM allocated (50 GB). It's a hard crash, so there are no logs or messages that I can find, but it may be easy to reproduce with the file linked below.

I tried both the XML Reader and the Line Reader (the parsing is quite simple), and both showed the same behavior. In the end I was able to parse the file in Python using xml.etree.ElementTree without issue and export it as CSV, but I am passing along the observation anyway in case there is a bug that can be squashed here.
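For anyone hitting the same problem, here is a minimal sketch of the Python route, not the exact script I ran. It assumes the file is a flat list of record elements under one root and streams them with `ET.iterparse` so the whole document never has to sit in memory. The function name `xml_records_to_csv`, the record tag `metabolite`, and the field names `accession` and `name` are assumptions for illustration; check them against the actual file.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def xml_records_to_csv(xml_source, csv_sink, record_tag, fields):
    """Stream <record_tag> elements and write one CSV row per record.

    xml_source: path or binary file object; csv_sink: text file object.
    Only direct children of each record are read; if a child tag repeats,
    the last occurrence wins. Returns the number of records written.
    """
    writer = csv.writer(csv_sink)
    writer.writerow(fields)
    count = 0
    context = ET.iterparse(xml_source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event != "end" or local_name(elem.tag) != record_tag:
            continue
        values = {local_name(child.tag): (child.text or "").strip() for child in elem}
        writer.writerow([values.get(field, "") for field in fields])
        count += 1
        root.clear()  # drop records already written so memory stays flat
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real file (hypothetical content).
    sample = (
        b'<hmdb xmlns="http://www.hmdb.ca">'
        b"<metabolite><accession>A1</accession><name>foo</name></metabolite>"
        b"</hmdb>"
    )
    out = io.StringIO()
    xml_records_to_csv(io.BytesIO(sample), out, "metabolite", ["accession", "name"])
    print(out.getvalue())
```

For the real file, pass the path to the unzipped XML as `xml_source` and a file opened with `open(path, "w", newline="")` as `csv_sink`.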

Cheers!

Aaron

https://hmdb.ca/system/downloads/current/hmdb_metabolites.zip


Hi Aaron,
Thanks for reporting the problem. We will look into it when we rewrite the XML Reader to be compatible with the new file handling framework. Did you try splitting up the document using the XPath query option? That way the XML file is split into smaller chunks that can be cached to disk earlier. For an example, see this workflow:

Bye
Tobias

