Reading large XML Files

Hi All,

I am running into an issue with KNIME (Linux, 4.3.3) crashing when trying to read a large XML file, even with quite a bit of RAM allocated (50 GB). It's a hard crash, so there are no logs or messages that I can find, but it may be easy to reproduce with the file linked below.

I tried both the XML Reader and the Line Reader (the parsing is quite simple), and both showed the same behavior. In the end I was able to parse the file in Python using xml.etree.ElementTree without issue and export it as CSV, but I am passing along the observation anyway in case there is a bug that can be squashed here.
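
For readers who want to try the same Python workaround, here is a minimal sketch of a streaming XML-to-CSV conversion with xml.etree.ElementTree. It is not the script actually used for this file. The function name xml_to_csv, the record tag, and the field names are hypothetical placeholders, and it assumes the records are direct children of the root element with one child element per field.

```python
import csv
import xml.etree.ElementTree as ET


def xml_to_csv(source, dest, record_tag, fields):
    """Write one CSV row per <record_tag> element; return the row count.

    source: path or binary file object containing the XML.
    dest:   text file object opened for writing (use newline="" for real files).
    Assumes (hypothetically) that each record is a direct child of the root
    and that every field is a child element of the record.
    """
    writer = csv.writer(dest)
    writer.writerow(fields)

    # iterparse reads the file incrementally instead of building the whole tree.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element

    count = 0
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # Missing fields are written as empty strings.
            writer.writerow([elem.findtext(name, default="") for name in fields])
            count += 1
            # Detach already-processed records so memory use stays bounded.
            root.clear()
    return count
```

Called as xml_to_csv("big.xml", open("out.csv", "w", newline=""), "row", ["id", "name"]), with the file names and tags swapped for the real ones, this keeps only the current record in memory rather than the full document.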



Hi Aaron,
Thanks for reporting the problem. We will look into it when we rewrite the XML Reader to be compatible with the new file handling framework. Did you try splitting up the document using the XPath query option? That way the XML file is split into smaller chunks that can be cached to disk earlier. For an example, see this workflow:



