While doing some research, I happened to notice that the URL Domain Extract node struggles to extract domains in the following cases:
- domain.co.ke or domain.com.mm
PS: Domains consisting of IPs are kind of an edge case, which I have not included, but they apparently do not work either.
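For anyone reading along, here is a minimal sketch of why such cases are tricky: suffixes like co.ke or com.mm span two labels, so simply taking the last two labels of the host gives the wrong result. This is not how the Palladian node is implemented. The function name `registrable_domain` and the tiny `SUFFIXES` set are assumptions made up for illustration; a real implementation would load the full Public Suffix List.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical, hand-picked subset of public suffixes for illustration only;
# a real implementation would load the full Public Suffix List.
SUFFIXES = {"com", "ke", "co.ke", "mm", "com.mm", "al", "edu.al", "um"}


def registrable_domain(url):
    """Return the registrable domain of url, the IP literal, or None."""
    # Prepend "//" to scheme-less input so urlsplit treats it as a host.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.hostname  # lowercased host without port or credentials
    if not host:
        return None
    try:
        # IP hosts have no public suffix; return them unchanged.
        return str(ipaddress.ip_address(host))
    except ValueError:
        pass
    labels = host.split(".")
    # Try the longest candidate suffix first so "co.ke" wins over "ke".
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            return ".".join(labels[i - 1:])
    return None
```

Trying the longest candidate suffix first is the important design choice here: with a shortest-first match, domain.co.ke would be reduced to co.ke.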
Thanks for bringing that to my attention. I’m aware of some other issues with the node already, which we’ll address together with this one in an upcoming update.
This is addressed in the Palladian Nodes v2.6 update. More details here.
I apologize for my delayed response. My family got sick and I had some urgent work items to address. Comparing the results, they have significantly improved, but there also seem to be some minor regressions.
Simple Test shows fixes
Minor regressions in this CSV (gzip in my Google Drive due to upload restrictions).
Thanks for the feedback. I’ll have a look at the regressions after holiday.
In the meantime, get well!
Hope you’ve recovered well!
Again, thanks for the detailed list. I’m currently trying to figure out if there are any issues to fix. Could you clarify the following cases:
- There are missing values in the last three columns, e.g. for HTTPS://SUB-DOMAIN.DOMAIN.UM. Does this mean they are not extracted correctly for you using the updated version?
- There are rows which have missing values for the first three columns (e.g. https://domain.edu.al). Does this mean that it was not extracted properly before, but now works with 2.6?
Generally, the first three columns in the CSV resemble results pre-2.6, and the last three the results of 2.6 – is this correct?