Xpath Query After Strong tags and Between Two BR tags

Hi,

I want to retrieve data from a web page, but I am stuck on the XPath query I need to get the contents. Below is the XML code:

I tried this query, but it returns nothing: //dns:html/dns:body/dns:div[5]/dns:div[2]/dns:div[1]/dns:article/dns:div[3]/dns:div[1]/dns:br

The result I am hoping for looks like this:

image

Thank you so much for your help in advance,
Nanda

Hi @Nanda_Rukmana,

I would suggest you grab the whole section and then work on it with the String Manipulation node.

If you provide me with the link to the webpage I can help further. If the webpage is not accessible by others, just extract the whole section containing whatever exists in the image and then provide me with the string and I will send you the expression to clean it.
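
For readers following along outside KNIME, here is a minimal Python sketch of why a query that ends in `dns:br` comes back empty and why grabbing the whole section works. The markup in `SAMPLE` is a hypothetical stand-in (the real page source is not shown in this thread), and `dns` is assumed to be mapped to the XHTML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment; the real page markup is not shown in this thread.
SAMPLE = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    "<strong>Jakarta</strong> - First paragraph.<br/><br/>"
    "Second paragraph.<br/><br/>Third paragraph.</div>"
)

# Assumption: the "dns" prefix stands for the XHTML default namespace.
NS = {"dns": "http://www.w3.org/1999/xhtml"}

root = ET.fromstring(SAMPLE)

# Selecting the br elements themselves: they are empty, so there is no text inside them.
br_texts = [br.text for br in root.findall("dns:br", NS)]

# The paragraph text sits after each br (its "tail"), not inside it.
after_br = [
    br.tail.strip()
    for br in root.findall("dns:br", NS)
    if br.tail and br.tail.strip()
]

# Grabbing the whole section keeps every piece of text for later cleaning.
section_text = "".join(root.itertext())

print(br_texts)
print(after_br)
print(section_text)
```

The whole-section string still needs cleaning (line breaks, unwanted fragments), which is the part the String Manipulation node handles in the workflow.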

:blush:

Hi @armingrudd,

I tried to grab the whole section too, but only got this:

The link to the webpage is as follows:

Thank you,
Nanda

Here you are:

extract_text.knwf (108.4 KB)

:blush:

Hi Mr. @armingrudd,

It worked perfectly. Is there any site or reference where I can study text cleansing like you did? I studied your code, but I don’t understand the pattern.

Thanks a lot for your help.

Thanks for your kindness, @Nanda_Rukmana, but I just realized that the resulting text is not exactly what you asked for.

Please wait a few minutes and I will get back to you with a proper solution!

:blush:

Here is the proper solution:

extract_text.knwf (108.6 KB)

I changed the output type of the XPath node to Node so that I could remove the content of the table tags (e.g. Baca juga: Demi Tahajud Reuni 212, Dadang dari Bekasi Bawa 2 Anak Kecilnya) in the String Manipulation node with some regex.
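
For anyone who wants to see the pattern spelled out, here is a rough Python sketch of the same idea: keep the section as markup, drop the table elements, then turn the `<br>` tags into line breaks and strip what is left. This is not the exact expression used in the workflow (that lives in the attached .knwf file); the `NODE` string and the `clean_section` helper are hypothetical and only illustrate the approach.

```python
import re
from html import unescape

# Hypothetical serialized node; the real markup is not shown in this thread.
NODE = (
    "<div><strong>Jakarta</strong> - First paragraph.<br/><br/>"
    "<table><tr><td>Baca juga: related article link</td></tr></table>"
    "Second paragraph.<br/><br/>Third paragraph.</div>"
)


def clean_section(markup: str) -> str:
    """Illustrative helper: remove tables, keep paragraph breaks, strip tags."""
    # 1. Drop every table element together with its content (non-greedy match).
    no_tables = re.sub(
        r"<table\b.*?</table>", "", markup, flags=re.DOTALL | re.IGNORECASE
    )
    # 2. Collapse each run of <br> tags into a single line break.
    with_breaks = re.sub(r"(?:<br\s*/?>\s*)+", "\n", no_tables, flags=re.IGNORECASE)
    # 3. Strip the remaining tags and decode HTML entities.
    text = re.sub(r"<[^>]+>", "", with_breaks)
    return unescape(text).strip()


cleaned = clean_section(NODE)
print(cleaned)
```

Regex is fine for a small, predictable fragment like this; it would not cope with nested tables, where a real HTML parser is the safer choice.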

Now the output is exactly what you want!

:blush:
