Hi dear KNIME users :)

I am trying to visit HTML links and parse the data they return.

To do that, I created an Excel source that contains 10,000 URLs, followed by Http Retriever node -> Html Parser node -> XPath.

Here is my question: when I run the workflow, my result set changes every time. Some websites return what I want, but others return statusCode=500. When I visit the same URLs in my browser, I do get the results.

However, it takes more than 2 seconds to get the results. I think this is the problem. How can I add a delay between the URL requests for this node?

I tried changing and increasing the Http Retriever node's parameters, but that did not solve my problem.

Any ideas?

Thanks for your interest :)

Hi bugra10ur,

First, you should determine why you're getting those HTTP status 500 results in some cases (having a look at the error message will help). If it is indeed quota-related, adding a delay will likely help.

However, note that the timeout settings in the HttpRetriever have nothing to do with delays. The timeouts are used to cancel requests whose response or network latency is too high.

To implement a delay between your requests, make use of the loop nodes. Split up your URL input into single iterations using the Chunk Loop Start node. Within your loop, connect your existing HttpRetriever/HtmlParser/etc. flow, but prepend it with a node that waits between iterations. Several nodes for waiting are available now; as a shameless plug, I highly recommend the Wait node from the Selenium Nodes.
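
In case it helps to see the logic of that loop outside of KNIME, here is a minimal Python sketch of "one URL per iteration, wait, then fetch". This is not what the nodes run internally; the function name fetch_with_delay, the fetch callable, and the 2-second default are all assumptions made up for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def fetch_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], Tuple[int, str]],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, int, str]]:
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is a hypothetical callable returning (status_code, body);
    plug in whatever HTTP client you actually use.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        status, body = fetch(url)
        results.append((url, status, body))
    return results
```

The wait sits in front of the request, just like the wait node sits in front of the HttpRetriever inside the loop. Note that with 10,000 URLs and a 2-second delay the waiting alone adds up to roughly five and a half hours, so pick the delay as small as the server tolerates.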

You could also build a flow which simply retries requests in case of error results. I did this once successfully, but I can't find the specific workflow now to provide you with an example, sorry.
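
Since I can't share the workflow itself, here is at least the retry idea as a rough Python sketch. It is not the workflow I built back then; the name fetch_with_retry, the fetch callable, and the defaults (3 attempts, 2 seconds between them) are assumptions for illustration only. It treats any 5xx status as "try again".

```python
import time
from typing import Callable, Tuple


def fetch_with_retry(
    url: str,
    fetch: Callable[[str], Tuple[int, str]],
    max_attempts: int = 3,
    wait_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str, int]:
    """Call fetch(url) until the status is below 500 or attempts run out.

    `fetch` is a hypothetical callable returning (status_code, body).
    Returns (status_code, body, attempts_used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        if status < 500:
            # Success or a client error: retrying would not change anything.
            return status, body, attempt
        if attempt < max_attempts:
            # Server error: wait a bit before the next attempt.
            sleep(wait_seconds)
    # Every attempt failed; hand back the last error result.
    return status, body, max_attempts
```

In a workflow, the same idea would mean splitting the results by status code after the first pass and feeding only the failed URLs back through the retriever.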

Thanks :) Your idea will help me.