HTML Parser Problem

Hi everyone,
I am trying to parse and extract data from the link below, but I have not been able to. My workflow is very basic: Table Creator -> HTML Parser -> XPath.
The link below does not work; I get just “?” in the HTML Parser:
https://www.endeksa.com/tr/analiz/izmir/bornova/mevlana/endeks/kiralik/konut

When I change the link to "www.endeksa.com/tr/analiz/izmir/bornova/mevlana/endeks/kiralik/konut" (without the https:// prefix) in the Table Creator, I get the XML below:

<?xml version="1.0" encoding="UTF-8"?> www.zhttps://www.endeksa.com/tr/analiz/izmir/bornova/mevlana/endeks/kiralik/konut

But I need the full XML of the page in order to extract the data I want.
Could you please help me?
Pic1


Please share your workflow here by attaching it to the thread. Thank you!

Endeksa1.knwf (9.7 KB)

Thank you for your interest!

Thank you!

Please use an HTTP Retriever before the HTML Parser, then it will work:

Example workflow here:


Thank you very much for your help. I imported your .knwf and it worked. However, there are still some differences between the webpage elements (in the browser's developer tools) and the XPath XML-cell preview (in KNIME), as shown in the attachment.


So I cannot add the XPath query I need in KNIME to get the data.

How can I deal with this problem?

Best regards

Hi Ham,

Good to hear that it worked (somewhat :blush: ) - the discrepancy between the DOM view in the browser’s dev tools and the XPath node comes from the fact that the content is generated via JavaScript, whereas the HTML Parser only gives you a static view (similar to running curl against the page, or browsing with JavaScript disabled).
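To illustrate the static-view point outside KNIME, here is a minimal Python sketch. The HTML snippet, the `price` id, and the helper names are made up for illustration (they are not taken from the endeksa.com page); the point is only that a parser which does not execute scripts sees the element as empty, while a browser would show the value.

```python
from html.parser import HTMLParser

# Hypothetical page: the <div> is empty in the served HTML and is only
# filled in once a browser executes the script.
STATIC_HTML = """
<html><body>
  <div id="price"></div>
  <script>
    document.getElementById("price").textContent = "12.5";
  </script>
</body></html>
"""


class DivTextCollector(HTMLParser):
    """Collects the text found inside <div id="price">."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "price":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside = False

    def handle_data(self, data):
        # Script bodies also arrive here, but only after the div has closed.
        if self.inside:
            self.text.append(data)


def static_div_text(html):
    """Return the div's text as a non-JavaScript parser sees it."""
    parser = DivTextCollector()
    parser.feed(html)
    return "".join(parser.text).strip()


if __name__ == "__main__":
    # The script is never run, so the div's text is empty.
    print(repr(static_div_text(STATIC_HTML)))
```

A browser-driven approach executes the script first, which is why the dev tools show content that the static parse does not.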

To get access to the mentioned content, you can have a look at the Selenium Nodes, which address exactly this kind of web page and web app:

See also:

For any questions don’t hesitate to get back - this forum is actually about Palladian and Selenium :slight_smile:

–Philipp

Thank you for your help! Now I have a new problem :slight_smile:

I created a workflow to get data from different URLs that are listed in a Table Creator (please see Pic1).

I managed to get the data I wanted from the URL (“Table Creator” row1).
However, I need a working loop in order to get data from the second URL.
The second URL is in the “Table Creator” row1.

But my loop keeps going back to row0. How can I deal with this problem?
In summary, I want to get the data by processing the rows in the Table Creator one by one.

I attached my workflow
Pic1
Enkesa_Selenium_3.knwf (38.0 KB)

Could you please help me?
Best regards
Hamza Akin

Hi Hamza,

for this, you’ll want to use the looping nodes, which allow you to process your input step by step. I’d go for this combination:

With this, your input is converted to flow variables. Use the flow variable as input to the Navigate node to set your navigation URL. You can terminate the loop using the following node and collect your results:
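In plain code, the loop pattern described above amounts to roughly the following. This is only a sketch of the idea, not of the KNIME nodes themselves: `scrape_all` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for whatever the navigate-and-extract steps do for one URL.

```python
def scrape_all(urls, fetch):
    """Process the input rows one by one and collect one result per row.

    `urls` plays the role of the Table Creator rows; inside the loop, `url`
    plays the role of the flow variable handed to the navigation step.
    """
    results = []
    for row_index, url in enumerate(urls):
        value = fetch(url)  # navigate to this row's URL and extract the data
        results.append({"row": row_index, "url": url, "value": value})
    return results  # the loop end gathers all iterations into one table


def fake_fetch(url):
    """Hypothetical stand-in for the real page visit: returns the URL length."""
    return len(url)


if __name__ == "__main__":
    rows = ["https://example.com/a", "https://example.com/bb"]
    for result in scrape_all(rows, fake_fetch):
        print(result)
```

The important part is that the URL changes on every iteration; if the navigation step keeps using a fixed URL instead of the per-iteration variable, every pass visits the first row again.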

Hope this helps!

Best,
Philipp


Thank you very much dear @qqilihq!

It worked!

Best Regards from Istanbul :slight_smile:
