Today I have been following this workflow,
and the XPath node there gives the result shown in the picture below:
I then followed the same steps, using the XPath node to scrape a website that I chose myself.
But the results I get are not the same as in the workflow above: the XPath node returns the Item column as String data, not XML.
This happens even though all my configurations are the same as in the example workflow.
Can anyone help me?
In the XPath query settings, you can change the return type to “Node cell”, and then you will get your desired output.
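To make the difference between the two return types concrete, here is a rough analogy in plain Python rather than KNIME itself. The sample feed and the helper names `as_string` and `as_node` are made up for illustration; the point is only that the same matched element can be returned either as flattened text or as an XML fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal RSS-like document (not the feed from this thread).
rss = """<rss><channel>
<item><title>First post</title><link>http://example.com/1</link></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

root = ET.fromstring(rss)
items = root.findall("./channel/item")

def as_string(node):
    # Analogous to a string return type: only the text content survives.
    return "".join(node.itertext())

def as_node(node):
    # Analogous to a node return type: the element keeps its XML structure.
    return ET.tostring(node, encoding="unicode").strip()

print(as_string(items[0]))
print(as_node(items[0]))
```

With the string form the child tags are gone, so a later XPath step has nothing to query; with the node form the fragment can still be parsed further.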
I had missed that configuration setting.
I still have one question about XPath.
I want to fetch the topic using the path /item/description/
But I want to remove a picture from the description. Do you know of a path expression that can handle this?
Is it possible for you to share the XML? Where is the topic?
Ideally, share the XML together with the value that you want.
If that is not possible, I think you can clean the result with a String Manipulation node after parsing the XML with XPath.
Below is a link to the XML file that I used:
The relevant part is in the tags < item > / < description > / …
There is an “img” tag there.
I only want to take the article text that is written in the < description > tag. However, some articles have tags in the < description > that load images, which I don’t need.
Can I remove the “img” part?
I’ve tried using the String Manipulation node with the “replace” function, but it doesn’t work.
strip(regexReplace($description$, "<img\\s.*?/>", ""))
Here $description$ is the output column of the XPath query …/item/description
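For anyone who wants to try the pattern outside KNIME, the same idea can be sketched in Python. This is only an illustration under assumptions: the sample description text and the function names are hypothetical, and the second, more tolerant pattern is an addition that is not part of the expression above.

```python
import re

# Same pattern as in the expression above: an <img ... /> tag that is
# self-closing and sits on a single line.
IMG_SELF_CLOSING = re.compile(r"<img\s.*?/>")

# Assumed variant: also matches <img ...> without the trailing slash.
IMG_ANY = re.compile(r"<img\b[^>]*>")

def strip_images(description: str) -> str:
    # Remove the image tags, then trim surrounding whitespace (like strip()).
    return IMG_SELF_CLOSING.sub("", description).strip()

def strip_images_tolerant(description: str) -> str:
    return IMG_ANY.sub("", description).strip()

# Hypothetical description value, as it might come out of the string column.
sample = '<img src="http://example.com/a.jpg" alt="a" /> Some article text.'
print(strip_images(sample))
```

Note that the first pattern leaves a tag such as `<img src="x">` untouched because it requires the closing `/>`; the tolerant variant is one way to cover that case if the feed uses both forms.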
Thank you so much, @armingrudd.
Where can I learn more about these functions and their uses?
Here is a short blog post which you may find helpful.