I have encountered some problems with the XPath node. I am working with the NCBI Gene Reports for genome-wide analysis and data mining. When I query certain attributes or values, the node returns missing values where there should not be any. Even worse, the missing values occur randomly: executing the same node 100 times gives 100 different results.
This issue can be reproduced with all versions of KNIME (at least 2.9.4 to 2.11.0). I have created an example workflow with two identically configured, already executed nodes which, of course, have two different output tables. Unfortunately, the workflow exceeds the upload limit of this forum by more than 350 MB.
As a workaround (with KNIME 2.11), you can try converting the XML to JSON (XML to JSON), querying it with JSONPath, and, if you need the results as XML, converting back with JSON to XML. JSONPath is not a one-to-one match for XPath, but it has quite similar features.
No, I am not a regular user of the XML nodes. I just thought this might be a workaround: JSON and XML both represent trees, so the nodes probably work similarly, but since they are different implementations, it might work with JSON.
To solve the mystery: the XML documents contained references to an external DTD. Occasionally this DTD could not be downloaded properly, which led to parsing errors and ultimately to the missing values. We will try to add a cache for external DTDs.
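
For anyone who wants to check their own documents in the meantime, here is a minimal Python sketch of the idea. It is not how KNIME implements (or will implement) the cache. The names `DtdCache`, `download` and `external_dtd_of` are hypothetical, the retry count is an arbitrary assumption, and the cache only lives in memory. It reads the system ID of the external DTD from a document and fetches each DTD at most once, retrying when a download fails.

```python
from typing import Callable, Dict, Optional
from urllib.request import urlopen
from xml.dom import minidom


def download(url: str) -> bytes:
    """Fetch a DTD over the network (the step that can fail intermittently)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


class DtdCache:
    """Hypothetical in-memory cache of external DTDs, keyed by system ID."""

    def __init__(self, fetch: Callable[[str], bytes] = download, retries: int = 3):
        self._fetch = fetch          # injectable so the cache can be tested offline
        self._retries = retries      # assumed value, tune as needed
        self._store: Dict[str, bytes] = {}

    def get(self, system_id: str) -> bytes:
        # Serve from the cache if this DTD was already downloaded successfully.
        if system_id in self._store:
            return self._store[system_id]
        last_error = None
        for _ in range(self._retries):
            try:
                data = self._fetch(system_id)
            except OSError as error:  # urllib's URLError is a subclass of OSError
                last_error = error
                continue
            self._store[system_id] = data
            return data
        raise RuntimeError(f"could not fetch DTD {system_id}") from last_error


def external_dtd_of(xml_text: str) -> Optional[str]:
    """Return the system ID of the external DTD a document references, or None."""
    # minidom records the DOCTYPE but does not download the external DTD itself.
    doctype = minidom.parseString(xml_text).doctype
    return doctype.systemId if doctype is not None else None
```

Calling `external_dtd_of` on one of the affected documents shows whether it depends on an external DTD at all. A parser-specific entity resolver could then serve the bytes from `DtdCache.get` instead of going to the network on every parse.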