XML as string

Hello,
I am a beginner with KNIME and I have a problem with XML.
My workflow includes a GET Request node, which I would like to use to fetch XML documents. My problem is that the documents I fetch arrive as strings rather than in XML format. This causes problems for the next steps in my workflow (particularly the XPath node).

I have tried unsuccessfully to change the format using the Column To XML and String To XML nodes. If I use the Column To XML node, I obtain this output.

Thanks for your ideas and advice
Thillia

Hi @Thillia_01 and welcome to the Community.

Based on your screenshot, it looks like the site you are sending a GET request to is returning an HTML page.

Could you share the URL, or is it private? Or better yet, can you share your workflow?

Hi @Thillia_01,

Try using Strings to Binary Objects and then HTML Parser. This will give you the page in XML.

But before applying this, if you are requesting a webpage, why don’t you use Webpage Retriever or HTTP Retriever + HTML Parser?

My current version of this workflow

solution_of_isomers_2.knwf (14.2 KB)

Thank you for your advice. I will try it too.

Hi @Thillia_01, thank you for the workflow. As I suspected, the URL you are submitting the GET request to is just an HTML page. There is no XML string on that page.

You can open that URL directly in your web browser, and that is basically what KNIME is getting too. You can even compare what KNIME retrieved with the web page's HTML source: they are the same.

There’s simply no XML data on that page.
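If it helps to check this outside KNIME, here is a minimal Python sketch that classifies a response body as XML or HTML. The function name `classify_body` and the sample strings are made up for illustration; the check itself only uses the standard library's XML parser and is a rough heuristic, not a full content-type detection.

```python
import xml.etree.ElementTree as ET


def classify_body(body: str) -> str:
    """Return 'xml', 'html', or 'other' for a response body (rough heuristic)."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not well-formed XML. Typical HTML pages land here because of
        # unclosed tags such as <meta> or <br>.
        return "html" if "<html" in body.lower() else "other"
    # Well-formed, but it may still be XHTML: look at the root element,
    # ignoring a namespace prefix like {http://www.w3.org/1999/xhtml}.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    return "html" if local_name == "html" else "xml"


# Hypothetical sample bodies, not real server responses.
html_page = (
    "<!DOCTYPE html><html><head><meta charset='utf-8'></head>"
    "<body><p>hi<br></body></html>"
)
xml_doc = "<molecule><molecule_chembl_id>CHEMBL294506</molecule_chembl_id></molecule>"

print(classify_body(html_page))
print(classify_body(xml_doc))
```

A body that comes back as "html" will not work with an XPath step that expects XML data, which matches what you are seeing in the workflow.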

You might need to access the XML data via the page’s web services:

@Thillia_01 ,

The content you are looking for is event-based, so you cannot get it using the basic retriever nodes.
In addition to the website’s APIs suggested by @bruno29a, you can use the Selenium nodes to scrape the page(s). You can request a free trial license for the Selenium nodes on their website.
Here is an example workflow:
solution_of_isomers_selenium.knwf (71.1 KB)

I think you can just call the site’s API URL with a GET request.

They seem to list all of their APIs here:
https://www.ebi.ac.uk/chembl/api/data/docs

You can even test the APIs on that page. For example, I did this:

I did a lookup with the ID that you have, @Thillia_01, and you can see the resource_url on the right after the lookup, which is “https://www.ebi.ac.uk/chembl/api/data/molecule/CHEMBL294506”.

This URL returns XML data, so if you use it in your KNIME workflow, you can retrieve the XML data.

I replaced the URL in your String Manipulation node, and I am able to get the XML.

Here’s the workflow:
solution_of_isomers_2_bruno.knwf (33.7 KB)
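For anyone who wants to try the same request outside KNIME, here is a rough Python equivalent of GET Request followed by XPath. The URL pattern is the resource_url from the lookup above. The `Accept` header, the helper names, and the sample document are my assumptions for illustration; the sample is a simplified stand-in, not the real API response.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Base of the resource_url shown in the lookup above.
API_BASE = "https://www.ebi.ac.uk/chembl/api/data/molecule/"


def molecule_url(chembl_id: str) -> str:
    """Build the API URL for one molecule ID."""
    return API_BASE + chembl_id


def fetch_xml(url: str) -> str:
    """Send a GET request and return the body as text.

    Assumption: asking for application/xml is accepted by the server.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/xml"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def find_text(xml_text: str, path: str):
    """Return the text of the first element matching a simple path, or None."""
    root = ET.fromstring(xml_text)
    return root.findtext(path)


# Simplified, hypothetical sample standing in for the real response.
SAMPLE = "<molecule><molecule_chembl_id>CHEMBL294506</molecule_chembl_id></molecule>"

print(molecule_url("CHEMBL294506"))
print(find_text(SAMPLE, "molecule_chembl_id"))
```

To run it against the live service, call `fetch_xml(molecule_url("CHEMBL294506"))` and pass the result to `find_text`. Note that `findtext` only supports a limited subset of XPath, so more complex queries would need a different library.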

EDIT: FYI, you don’t even need to convert to XML (String to XML or Column to XML). The retrieved data is already in XML.

This is the result of the GET request node:

As you can see, the XML column is already of type XML.
