Extract node names as column headers using the XPath node?

I've used the Palladian nodes to retrieve data from a web server and parse the returned HTML into an XML column. All good.

Using the XPath node, I can extract the values from each node into separate columns in a table so long as the column header is specified in the New Column Name field (that is, a fixed value for all columns, followed by a number: COLUMNnnn).

But I can't work out how to extract the node names as column names.

My XPath Value query is //*
XPath query for column name: /name()
XPath data type: String cell
Multiple tag options: Multiple columns

The summary section then shows: Column Name = Value of //*/name()

But executing the (KNIME) node fails with:

Error Xpath 2:23 Execute failed: Xpath expression cannot be compiled  

Instead of /name(), I have also tried /local-name(), /node-name().

Any pointers?

(the other)
Simon

Hi Simon the other,

I'm no expert on the XPath nodes, but I assume the problem is as follows (quoted from the node documentation): "The node supports XPath 1.0."

However, queries such as "//*/name()", where a function call appears as the last step of a path, are only supported in XPath 2.0.

Admittedly, I have no clue if there's any workaround.

Kind regards,
Philipp

Thanks Philipp. I figured that might be the case, but I couldn't work out how to construct a 1.0 compliant version from within the XPath node.
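
For reference, here is a minimal sketch of the goal itself (element names as headers, element text as values), done outside the XPath node in plain Python with the standard library. The sample XML and the function name names_and_values are made up for illustration, and this is not how the KNIME node works internally; it only shows what the "//*" query plus a name lookup would produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for one cell of the parsed XML column.
SAMPLE_XML = "<record><title>KNIME</title><year>2016</year><tag>xpath</tag></record>"


def names_and_values(xml_text):
    """Return (element name, text) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    # root.iter() visits the root and all descendants, like //* does.
    for element in root.iter():
        # Drop a leading "{namespace}" part so only the local name remains.
        local_name = element.tag.rsplit("}", 1)[-1]
        text = (element.text or "").strip()
        pairs.append((local_name, text))
    return pairs


pairs = names_and_values(SAMPLE_XML)
header = [name for name, _ in pairs]   # would become the column names
row = [value for _, value in pairs]    # would become the cell values
print(header)
print(row)
```

Container elements such as the root have no text of their own, so their value comes out as an empty string.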

Perhaps I should look at the Selenium nodes and see if it's easier to parse the output from those nodes into separate columns?

(the other)
Simon

Hey Simon,

With the Selenium Nodes, you can check the option "Append additional WebElement information", which will give you a dedicated column with the extracted node names. You can use the same XPath query //* to get all elements on the page. However, the elements will be extracted row-wise (i.e. one row is created for each element found). I guess it should be easy to transform that into a column-based table.
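
The row-to-column step could look roughly like the sketch below, written in plain Python rather than with KNIME nodes. It assumes the row-wise output has been reduced to (node name, text) pairs; the helper rows_to_single_row and the sample data are hypothetical. Repeated node names get a numeric suffix, since a table cannot have two columns with the same name.

```python
from collections import Counter


def rows_to_single_row(rows):
    """Pivot (name, value) rows into one dict keyed by unique column names."""
    seen = Counter()
    table_row = {}
    for name, value in rows:
        seen[name] += 1
        # The first occurrence keeps its name; later ones become name_2, name_3, ...
        column = name if seen[name] == 1 else f"{name}_{seen[name]}"
        table_row[column] = value
    return table_row


# Hypothetical row-wise output: one (node name, text) pair per element found.
element_rows = [("h1", "Title"), ("p", "First"), ("p", "Second")]
wide = rows_to_single_row(element_rows)
print(wide)
```

The suffix scheme is just one possible choice; an element that is genuinely named like "p_2" would collide with it, so a real workflow may want a different separator.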

node description

Hope I understood your issue correctly. In case you have any improvement suggestions, feel free to get in touch (forum or email).

Best,
Philipp
