How to convert String to XML?

Hello everyone!
Today I've been following this workflow,
and I found that the result of the XPath node looks like the picture below:

Then I followed the same steps, using the XPath node to scrape a website that I chose.
But the results I get are not the same as in the workflow above: in my case, the output of the XPath node for the Item column is of type String, not XML.


This happens even though all my configurations are the same as in the example workflow above.


Can anyone help me?

Thank you,
Best regards
Veni

Hi @veniapputrii,

In the XPath query settings, you can change the return type to “Node cell”, and then you will get your desired output.
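
To illustrate the difference between a string result and a node result outside of KNIME, here is a minimal Python sketch. It is only an analogy: the sample XML is made up, and Python's `xml.etree.ElementTree` stands in for the XPath node. Asking for text gives you plain strings, while keeping the matched element gives you something that can still be treated as XML.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed, only for illustration (not the real data from this thread).
SAMPLE = """<rss><channel>
<item><title>First</title><description>Text one</description></item>
<item><title>Second</title><description>Text two</description></item>
</channel></rss>"""

root = ET.fromstring(SAMPLE)
items = root.findall("./channel/item")

# String-like result: only the text content of the matched element is kept.
titles_as_text = [item.findtext("title") for item in items]

# Node-like result: the matched element itself is kept, so it can still be
# serialized or queried further as XML.
items_as_xml = [ET.tostring(item, encoding="unicode").strip() for item in items]

print(titles_as_text)
print(items_as_xml[0])
```

In the second case each value is still a full `<item>…</item>` fragment, which is roughly what the XML-typed column gives you in the workflow.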


Thank you, @armingrudd


I had missed this configuration setting.


Hi, @armingrudd
I still have one question about XPath.
I want to fetch the topic using the path: /item/description/
But I want to remove the picture from the description. Do you know a path expression that can handle this?

Is it possible for you to share the XML? Where is the topic?
Ideally, share the XML and the value that you want.

If that is not possible, I think you can parse the XML with XPath first and then clean the result using a String Manipulation node.


Hello, @armingrudd

Below is the link to the XML file that I used:
https://www.suara.com/rss/news

Inside the <item>/<description>/… part,
there is an “img” tag.

I only want to take the article text that is written in the <description> tag. However, in some articles the <description> also contains tags that load images, which I don’t need.

Can I remove the “img” part?

I’ve tried using the String Manipulation node with the “replace” function, but it doesn’t work.

Try this:
strip(regexReplace($description$, "<img\\s.*?/>", ""))

Here, $description$ is the output column of the XPath query …/item/description
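
For anyone who wants to try the pattern outside of KNIME, here is a rough Python sketch of the same idea. The sample description string and the helper name `strip_img_tags` are made up for illustration. Python's `re` module is not identical to the Java regex engine, but this particular pattern should behave the same way in both.

```python
import re


def strip_img_tags(description: str) -> str:
    """Remove self-closing <img ... /> tags, then trim surrounding whitespace.

    Hypothetical helper mirroring strip(regexReplace(..., "<img\\s.*?/>", "")).
    """
    # .*? is non-greedy, so each match stops at the first "/>" after "<img ".
    return re.sub(r"<img\s.*?/>", "", description).strip()


# Made-up description value, only for illustration.
sample = '<img src="https://example.com/a.jpg" width="100" /> Article text here. '
cleaned = strip_img_tags(sample)
print(cleaned)
```

Note that the pattern only targets tags written as `<img ... />` on a single line. An image tag closed with a plain `>` would not be handled cleanly, so the pattern may need adjusting for other feeds.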


Solved!!
Thank you so much, @armingrudd :grinning:
Where can I learn more about these functions and how to use them?

Here is a short blog post which you may find helpful.
