I’m trying to use XPath to extract sections from HTML snippets like the one shown below:
<?xml version='1.0' encoding='UTF-8'?>
<div class="large-12 columns" xmlns="http://www.w3.org/1999/xhtml">
<h3>Overview</h3>
<p>Siemens Totally Integrated Administrator (TIA) fails to properly set the module search path to be used by a privileged Node.js component, which can allow an unprivileged Windows user to run arbitrary code with SYSTEM privileges. The PCS neo administration console is reported to be affected as well.</p>
<h3>Description</h3>
<p>Siemens TIA runs a privileged Node.js component. The Node.js server fails to properly set the module search path. Because of this, Node.js will look for modules in the <code>C:\node_modules\</code> directory when the server is started. Because unprivileged Windows users can create subdirectories off of the system root, a user can create this directory and place a specially-crafted <code>.js</code> file in it to achieve arbitrary code execution with SYSTEM privileges when the server starts.</p>
<h3>Impact</h3>
<p>By placing a specially-crafted JS file in the <code>C:\node_modules\</code> directory, an unprivileged user may be able to execute arbitrary code with SYSTEM privileges on a Windows system with the vulnerable Siemens TIA or PCS neo administration console software installed.</p>
<h3>Solution</h3>
<h4>Apply an update</h4>
<p>This issue is addressed in TIA Administrator <a href="https://support.industry.siemens.com/cs/ww/en/view/114358/">V1.0 SP2 Upd2</a>. PCS neo administration console users should apply the mitigations described in <a href="https://support.industry.siemens.com/cs/ww/en/view/109771524">Industrial Security in SIMATIC PCS neo</a>.</p>
<p>For more details see Siemens Security Advisory <a href="https://cert-portal.siemens.com/productcert/pdf/ssa-428051.pdf">SSA-428051</a>.</p>
<h3>Acknowledgements</h3>
<p>This vulnerability was reported by Will Dormann of the CERT/CC.</p>
<p>This document was written by Will Dormann.</p>
</div>
I want to extract all the content between consecutive h3 headings, for example the ‘Overview’ section, which sits between the h3 elements ‘Overview’ and ‘Description’.
The query:
//*[preceding-sibling::h3[. = 'Overview'] and following-sibling::h3[. = 'Description']]
works in some online XPath testers but not in the KNIME XPath node.
What do I need to change to make it work in the KNIME XPath node?
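One possible cause I have not ruled out is the default namespace on the root element (xmlns="http://www.w3.org/1999/xhtml"). In a namespace-aware XPath 1.0 evaluator, an unprefixed name such as h3 only matches elements in no namespace, so the query would select nothing. The sketch below reproduces that effect outside KNIME with Python's standard library. The snippet is a shortened stand-in for the one above, and the prefix "x" and the helper section() are names I made up for illustration. ElementTree does not support the sibling axes, so the helper walks the children instead of running the original query.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Shortened stand-in for the snippet above (same root element and namespace).
SNIPPET = """<div class="large-12 columns" xmlns="http://www.w3.org/1999/xhtml">
<h3>Overview</h3>
<p>Overview text.</p>
<h3>Description</h3>
<p>Description text.</p>
</div>"""

root = ET.fromstring(SNIPPET)

# An unprefixed name does not match elements in the XHTML namespace.
unqualified = root.findall("h3")

# Binding a prefix ("x" is an arbitrary, hypothetical choice) to the namespace matches.
qualified = root.findall("x:h3", {"x": XHTML_NS})


def section(parent, start, end):
    """Return the children between the h3 titled `start` and the h3 titled `end`."""
    h3 = f"{{{XHTML_NS}}}h3"
    inside, found = False, []
    for child in parent:
        if child.tag == h3 and child.text == start:
            inside = True
        elif child.tag == h3 and child.text == end:
            break
        elif inside:
            found.append(child)
    return found


overview = section(root, "Overview", "Description")
print(len(unqualified), len(qualified), [el.text for el in overview])
```

If this is the problem, the XPath equivalent would be //*[preceding-sibling::x:h3[. = 'Overview'] and following-sibling::x:h3[. = 'Description']], assuming the XPath node lets me bind a prefix such as x to the XHTML namespace. I have not confirmed that this is how the node behaves.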
P.S. I am using KNIME 3.7.2.