HTML parser: some elements missing

Hello,

I’m trying to get the results of a table in an HTML web page. I used the Palladian HTML Retriever and Parser. I have the problem that some elements in the parsed HTML source are missing. In the picture you can see the difference between the parsed HTML in KNIME and the original one from Google Chrome.

Do you know if there is a solution or an alternative way to get this table?

Thank you,
Roberto

Here is the workflow too: KNIME_html.knwf (12.0 KB)

Hi,
can you try with the Webpage Retriever node instead? I have used that one successfully before.
Kind regards
Alexander


The mentioned parts are generated through JavaScript, so you will not get them with the HTML Parser (or any other static parser). For alternative approaches, I suggest checking the following link (even though the website is probably not AngularJS-based, the same principle applies):

– Philipp
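
To illustrate why a static parser misses such elements, here is a minimal sketch in Python using only the standard library. The page content, the `results` table id, and the `TagCounter` class are made up for this example and are not taken from the actual website. The served HTML contains an empty table, and the rows only exist as text inside a script that a browser would have to execute.

```python
from html.parser import HTMLParser

# Hypothetical page (an assumption for illustration): the table body is
# empty in the served HTML and would only be filled once a browser runs
# the script.
PAGE = """
<html><body>
<table id="results"><tbody></tbody></table>
<script>
  var body = document.querySelector("#results tbody");
  body.innerHTML = "<tr><td>42</td></tr>";
</script>
</body></html>
"""


class TagCounter(HTMLParser):
    """Counts <table> and <tr> start tags that appear as real markup."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
        elif tag == "tr":
            self.rows += 1


counter = TagCounter()
counter.feed(PAGE)

# The table element itself is found, but no rows are: the parser treats
# the script body as plain text and never executes it.
print("tables:", counter.tables)
print("rows:", counter.rows)
```

The parser sees the table but none of its rows, which is the same effect as in the parsed source in KNIME. A browser-based approach such as Selenium runs the script first and then reads the resulting page.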


Hi Philipp,

thank you for the suggestion. Maybe it works, but I have a problem with the Chromedriver. I can’t change the path on our enterprise computer, so it doesn’t work.

Maybe you have a suggestion where I can copy the chromedriver, without changing the path, so that it works with Selenium?

thank you,
Roberto

Hi Roberto,

it shouldn’t be necessary to change the path to the chromedriver, just leave it empty and the plugin will use the included binaries which are shipped with the installation (to clarify, the Chromedriver is not Chrome itself). Just make sure that Chrome is installed on the system.

If this is not possible, you can install the optional “Selenium Nodes for KNIME: Chromium” (needs to be explicitly selected during install). This will install an embedded Chromium instance which you can select in the “WebDriver Factory” node.

In case you’re still facing obstacles, let me know!
Philipp

Hi Philipp,

many thanks, with Chromium it works! Thank you.

P.S.: The standard Chrome installation didn’t work because of our enterprise restriction. Error: “Loading of unpacked Extensions is disabled by the administrator”.

This may help someone with the same “enterprise” problems.

Roberto


Great to hear it works with the included Chromium. Thanks for the feedback!
