The WebPage Retriever node throws the following error: Execute failed: NAMESPACE_ERR: An attempt is made to create or change an object in a way which is incorrect with regard to namespaces.
Is there a way to tell the WebPage Retriever node not to validate the XML, or at least not to fail on invalid HTML (XML)?
If you uncheck the “Output as XML” option, the node will execute successfully and the output will be a string. You can then use the Strings to Binary Objects node and pass its output to the HTML Parser (the output column name in the WebPage Retriever node must be changed). This solves the problem. Of course, this workaround is not really needed here, since the HTTP Retriever node handles the page without any problems.
Yes, we have some updates! We found the issue and fixed it; the fix will be available with the next release (4.1.1). If you don’t want to wait until then, you can download our nightly build, in which it’s already available (https://www.knime.com/form/nightly-build). If you do so, we are happy to receive your feedback if you encounter any other issues!
Btw, the webpage you posted above is working now. It seems they fixed the issue on their side (unbound namespace prefixes).
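For anyone wondering what an "unbound namespace prefix" looks like, here is a minimal sketch in plain Python (not KNIME code, and not how the node is implemented). The markup snippet and the helper names `strict_parse_error` and `lenient_tags` are made up for illustration. It shows that a strict, namespace-aware XML parser rejects a tag such as `fb:like` when the `fb` prefix is never declared, while a lenient HTML parser just reads it as a tag name.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical page fragment: the "fb" prefix is used but never declared.
UNBOUND = '<html><body><fb:like href="x"></fb:like></body></html>'

# Same fragment with the prefix declared (the namespace URI is a placeholder).
BOUND = (
    '<html xmlns:fb="http://example.com/fb">'
    '<body><fb:like href="x"></fb:like></body></html>'
)


def strict_parse_error(markup):
    """Return the strict XML parser's error message, or None if parsing succeeds."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


class TagCollector(HTMLParser):
    """Lenient HTML parser that only records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def lenient_tags(markup):
    """Parse markup as HTML and return the list of start-tag names."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags


if __name__ == "__main__":
    print("strict, unbound prefix:", strict_parse_error(UNBOUND))
    print("strict, bound prefix:  ", strict_parse_error(BOUND))
    print("lenient, unbound prefix:", lenient_tags(UNBOUND))
```

This mirrors the workaround described earlier in the thread: retrieving the page as a plain string and handing it to a lenient HTML parser sidesteps the namespace check that a strict XML conversion performs.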