Thanks for that explanation @qqilihq! That clarifies which kinds of situations I should be more careful in.
I’m now getting a new error on the HTML parser: “Execute failed: (“StackOverflowError”): null”. Do you have any idea how I can fix that, or where I can find out what the different error codes mean? Also, I’m not sure about thread etiquette: perhaps I should start a new thread for this, since it’s fairly unrelated to my original question.
I’ve isolated the issue to four URLs in my current list. They all point to PDF documents, but none of them has “pdf” in the URL. They do all have “View” or “Preview” in the URL, so I could filter on that, but I worry I would also exclude valid pages that way. Do you know of a more elegant way to exclude these kinds of results in the future, before I pass them to the HTML parser?
http://www.pilotpointlibrary.org/DocumentCenter/View/2281/2017-2018-CAFR
https://neptunebeachfl.civicclerk.com/web/UserControls/DocPreview.aspx?p=1&aoid=33
http://www.garlandtx.gov/DocumentCenter/View/5526/0724-Fiirefighter-Recruit?bidId=
http://bonnieandclydedays.org/AgendaCenter/ViewFile/Agenda/_04082019-764
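
One idea I’ve been considering, in case it helps frame the question: instead of guessing from the URL, check what the server says the resource is. Below is a rough Python sketch, outside of KNIME. The function names `looks_like_pdf` and `is_pdf_url` are my own placeholders, not anything from KNIME or the HTML parser node. It assumes the server sends a usable Content-Type header, and falls back to checking whether the body starts with the `%PDF-` signature. I have not verified what these four servers actually return, so treat it as a sketch only.

```python
from urllib.request import Request, urlopen


def looks_like_pdf(content_type, first_bytes=b""):
    """Return True if the Content-Type value or the leading bytes indicate a PDF."""
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    # PDF files begin with the signature "%PDF-".
    return media_type == "application/pdf" or first_bytes.startswith(b"%PDF-")


def is_pdf_url(url, timeout=10):
    """Request the URL, read only the header and the first 5 bytes, and classify it."""
    # Assumption: some servers reject requests without a User-Agent header.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        first_bytes = response.read(5)
    return looks_like_pdf(content_type, first_bytes)
```

The idea would be to call `is_pdf_url(url)` for each row and only send the rows where it returns `False` on to the HTML parser. Is there a way to do something equivalent with existing nodes?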