Convert XML into columnar data

I want to take an XML file and convert it into columnar data. My data looks like the following:

<xml>
    <Response>
        <ID>10001</ID>
        <Name>John Doe</Name>
        <Course01>BW101</Course01>
    </Response>
</xml>

There will be multiple Responses and up to 150 nodes within the Response tag. I have the XML Reader node correctly set up to read the Response node into rows, but how do I get the subnodes of Response converted into columns? I do not want to set up an XPath node for each column as my XML files have different formats for different purposes.

Thanks for the help!

Hi,

the XPath node can have multiple output columns. So you need only one XPath node.

Best, Iris

How do I specify the XPath so it generates the column names automatically?

David

Normally you can use the name() function to get the element's name. In your case that would look like this:

name(/xml/Response/ID)

Unfortunately the XPATH node needs a relative query for the column name which, for some reason, cannot be combined with the name() function. I hope the KNIME guys can shed some light on this issue.

Best,
Marc

The decision for a relative path for the name query was made on purpose. The standard use case is that you have the column name in some attribute or child element of the current element in which case you simply specify "/@name" (for example).

If Java supported XPath2 you could simply use "/name()" but with XPath 1 unfortunately this isn't possible.

As a workaround you can process your XML file with the attached XSLT which adds "name" attributes for every element inside a "Response". Then you can use "@name" as the column name query in the XPath node.

Hi @thor,

the XSLT was a very good starting point but the attribute referencing throws an error: “Execute failed: XPath expression cannot be compiled.”

By fetching the name in the regular request it works but not with the relative name reference above.

Any ideas about my mistake? I tried a couple for combinations but facing the issue each time …

Thanks a lot
Mike

This looks like a bug which is caused by the @ characters in the XPath value query. I have no workaround at the moment until this bug will be fixed.

I haven’t tested it myself, but a possible solution might be to convert the xml file to a json file (XML to Json Node) and then use the Json to Table Node. If the structure of the xml is simple and the file size is not to big, this might possibly be a solution.