Universal Cookie Table for HttpRetriever Node

Is it possible to pass a set of universal cookies (cookies in a table not tied to a domain) to an HttpRetriever node?

In my crawling process, I first get a set of seed cookies from www.mydata.com. I then pass these seed cookies to a second HttpRetriever, which retrieves the data I need.

When I pass the cookies as a string in the "Cookie" column of the top port of the HttpRetriever and add the "Cookie" column to the list of Headers to send, everything works great.
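For anyone trying the same workaround, here is a rough Python sketch of how a set of cookies collapses into the single string that goes in the "Cookie" column. The function name and the sample cookie names and values are made up for illustration; this is not the node's own code.

```python
def build_cookie_header(cookies):
    """Join (name, value) pairs into one Cookie request-header value."""
    # Pairs are separated by "; ", as in an ordinary Cookie request header.
    return "; ".join(f"{name}={value}" for name, value in cookies)


# Hypothetical seed cookies collected from the first request.
seed_cookies = [("sessionid", "abc123"), ("lang", "en")]
header_value = build_cookie_header(seed_cookies)
print(header_value)
```

Because the string carries no domain information at all, it is sent with whatever URL the request goes to.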

But when I pass the cookies as a table to the bottom port (the Cookies port) of the HttpRetriever, everything stops working. Or at least it stopped working the last time I tried it; I've used the bottom Cookies port a lot in the past and never had a problem.

The problem turned out to be that the seed cookies were associated with the Domain "mydata.com" (I've also seen ".mydata.com", which confused me), but the URL I was crawling was on subdomain.mydata.com.
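For comparison, here is a simplified Python sketch of the domain-matching rule from RFC 6265, under which a cookie whose Domain attribute is "mydata.com" (with or without the leading dot) would also match subdomain.mydata.com. Whether the HttpRetriever or the library behind it applies this rule to the Cookies table is an assumption I can't confirm; the behaviour I saw suggests the Domain was compared more strictly. The function name is hypothetical.

```python
def domain_matches(host, cookie_domain):
    """Simplified RFC 6265 domain match (ignores the IP-address exception)."""
    host = host.lower()
    domain = cookie_domain.lower()
    # A single leading dot in the Domain attribute is ignored.
    if domain.startswith("."):
        domain = domain[1:]
    if not domain:
        return False
    # Match on identical host, or on a suffix that starts at a label boundary.
    return host == domain or host.endswith("." + domain)


print(domain_matches("subdomain.mydata.com", "mydata.com"))
print(domain_matches("subdomain.mydata.com", ".mydata.com"))
print(domain_matches("notmydata.com", "mydata.com"))
```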

The solution was to edit the Cookies table and change the Domain column to "subdomain.mydata.com". But a set of universal cookies that are always passed to the URL (just like the put-all-the-cookies-in-a-long-string approach above) would be helpful. Perhaps this could be done by passing an empty Domain, or through an advanced "pass all cookies" option, or perhaps there is already a way that I'm not yet aware of?
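The Domain edit can be sketched in Python as follows, with the table modelled as a list of dictionaries. Only the "Domain" column name comes from my workflow; the "Name" and "Value" columns, the sample rows, and the function name are assumptions for illustration.

```python
def retarget_cookies(rows, host):
    """Return copies of the cookie rows with the Domain column set to host."""
    # Copy each row so the original seed-cookie table is left untouched.
    return [{**row, "Domain": host} for row in rows]


# Hypothetical cookie table as it came back from the seed request.
cookie_table = [
    {"Name": "sessionid", "Value": "abc123", "Domain": "mydata.com"},
    {"Name": "lang", "Value": "en", "Domain": ".mydata.com"},
]
fixed_table = retarget_cookies(cookie_table, "subdomain.mydata.com")
for row in fixed_table:
    print(row["Name"], row["Domain"])
```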

Not a big issue but it might save somebody a few hours of head scratching.

Hi Edlueze,

I understand your issue and I'll see if there is an easy way to implement this. However, as we're partly relying on a third-party library, I'm not yet sure whether this is possible in general.

I'll keep you posted.

Philipp
