Using a cookie retrieved with the HTTP Retriever node doesn't work in a workflow, but copying and pasting the exact same cookie from the web console does.

I’ve made a small demonstration workflow which shows that when the cookies are retrieved dynamically, the flow is not successful in requesting the webpage, and responds with status code 500. However, when the cookie is hardcoded (copy & pasted from the web console) it works perfectly and receives status code 200 in response.

The cookies are identical so I don’t understand how/why the HTTP retriever node can differentiate between the two, because I certainly can’t.

I’m wondering: is this a limitation of the HTTP Retriever node? Or some kind of magic cookie?

Faulty_Cookie_Example.knwf (15.4 KB)

Not a magic cookie :slight_smile: But when executing the first node, the console says:

WARN  HTTP Retriever       3:932      Invalid cookie header: "Set-Cookie: bm_sz=FFDB98FE7B4026F62FB7D54059BB1C85~YAAQ9MTdWOoSQrN0AQAAFHosxgk0V5LGBBCw1jXht7l6VImJbVOZgurF0EPxuvYWfbKeVQsQ+zHgPI2/kb24FTYEJgMKpVAoPiiwwnB5eBtuCZckRhj6TcPCn1JWS7QCA9O4+ZxTIfsrUxWtILrtmuoMyOwcwH1hMcWKpCBUYMQXwq7qmlu93cofNcp9DBjrc6C+RPvHUQ==; Domain=.danmurphys.com.au; Path=/; Expires=Fri, 25 Sep 2020 20:50:36 GMT; Max-Age=14399; HttpOnly". Invalid 'expires' attribute: Fri, 25 Sep 2020 20:50:36 GMT
WARN  HTTP Retriever       3:932      Invalid cookie header: "Set-Cookie: _abck=9BE50A311AF93904A43FC2E6968FC6A1~-1~YAAQ9MTdWOsSQrN0AQAAFXosxgSP4rer1IN6dbrVszeNJ1Y0HgVpW3Yba7PBaDlzlz8DbVXt/oByu6cWtLF/xG5xQV5j5biEtihMXnJ0AzpYzUErjda+EZ5ViOThZ4oATPPXki5YRRpWZa1JR74yfSo5Lweq3xK8WL07L9IG/QRvP7jXbCQyiYDavx60R/NbioSD2obhCh7IUaabtzplQPJbHkp1cyJExJKiNJqiWrk+pNUNot3ynai4f2g0Nzl3uGHQjDImIAcdwzLeKqcXG6lHS5VWtqH7lFVmVnL5UOT9d5jTpnul4YtdiswxNmrGeQ==~-1~-1~-1; Domain=.danmurphys.com.au; Path=/; Expires=Sat, 25 Sep 2021 16:50:37 GMT; Max-Age=31536000; Secure". Invalid 'expires' attribute: Sat, 25 Sep 2021 16:50:37 GMT

Admittedly, I’m not 100% sure why we fail to parse this one. Most probably the date format does not follow some RFC spec? In the long run, we should definitely make the parsing more lenient, I think. I’ll keep a note, but cannot promise a quick turnaround.

In the meantime, a workaround would be to parse this data manually by taking it from the HTTP headers. You can get them with the HTTP Result Data Extractor node.
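A minimal Python sketch of that manual parsing, assuming the raw `Set-Cookie` header values have already been extracted as strings. The function name `cookie_header_from_set_cookie` and the shortened cookie values are hypothetical and only for illustration; this is not how the Palladian nodes do it internally.

```python
def cookie_header_from_set_cookie(set_cookie_values):
    """Build a 'Cookie' request header value from raw 'Set-Cookie' header values.

    Only the leading name=value pair of each header is kept; attributes such as
    Domain, Path, Expires and Max-Age are ignored, so an unparseable 'Expires'
    attribute cannot cause the cookie to be dropped.
    """
    pairs = []
    for raw in set_cookie_values:
        value = raw.strip()
        # Tolerate input that still carries the header name, as in the log output.
        if value.lower().startswith("set-cookie:"):
            value = value[len("set-cookie:"):].strip()
        # The cookie itself is everything before the first ';'.
        first_segment = value.split(";", 1)[0].strip()
        # Split on the first '=' only, because values may contain '=' padding.
        name, separator, cookie_value = first_segment.partition("=")
        if separator and name.strip():
            pairs.append(f"{name.strip()}={cookie_value.strip()}")
    return "; ".join(pairs)


if __name__ == "__main__":
    # Shortened, made-up values in the same shape as the headers in the log.
    headers = [
        "Set-Cookie: bm_sz=ABC~DEF==; Domain=.example.com; Path=/; "
        "Expires=Fri, 25 Sep 2020 20:50:36 GMT; Max-Age=14399; HttpOnly",
        "_abck=123~-1~XYZ==~-1~-1~-1; Domain=.example.com; Path=/; Secure",
    ]
    print(cookie_header_from_set_cookie(headers))
```

The resulting string can then be sent as the `Cookie` header of the follow-up request.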

Hope this helps for now.

–Philipp

[edit] The format looks fine according to the RFC. I’m currently not sure why this happens.
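As a quick sanity check on the date format, the two `Expires` values from the log can be fed to Python's standard-library date parser. This is only a sketch: it shows that one general-purpose parser accepts these strings and says nothing about the parser used inside the HTTP Retriever node. The helper name `parse_expires` is hypothetical.

```python
from email.utils import parsedate_to_datetime


def parse_expires(value):
    """Parse a cookie 'Expires' value with the standard-library date parser."""
    return parsedate_to_datetime(value)


if __name__ == "__main__":
    # The two values reported as invalid in the console output above.
    for expires in (
        "Fri, 25 Sep 2020 20:50:36 GMT",
        "Sat, 25 Sep 2021 16:50:37 GMT",
    ):
        print(expires, "->", parse_expires(expires).isoformat())
```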

I tried getting the cookie with Python; however, the issue persists. Very strange.

Python workflow enclosed.

Faulty_Cookie_Example.knwf (18.4 KB)

@Nancyjay I was able to fix this in the Palladian library. If you’d like to give a pre-release version a test drive over the next few days, please get in touch with me at mail@palladian.ws

Thanks!
Philipp

May I ask what changes you have made to the Palladian nodes? We use them quite frequently, so it would be useful to know in order to prepare for such changes. :slight_smile:

Also thank you for your timely response and action.

Regarding changes: You can find the official change log on the NodePit page – consider this the official channel regarding all Palladian updates:

Besides that, I’ve always tried to give a quick wrap-up here in the forum, e.g. here:

https://forum.knime.com/c/community-extensions/palladian-selenium/34

– Philipp

[edit] Sorry, here’s the proper link:

Same issue here. I updated to 2.3 on KNIME 4.1, but the issue persists.

It’s not yet fixed (except in the mentioned pre-release versions).

–P

OK, because of your references to the changelog I assumed it was already there. I made a workaround with a Java snippet, but it would be great to have these nodes working again. Based on your GitLab code, you are going to switch to the HttpClientBuilder, right?