Extract domains from URLs

Hi @kaimbaum and welcome to the KNIME Community.

That’s a 7 year old thread :slight_smile:

First of all, there is no such thing as the “short domain” :slight_smile: What you are looking for is what’s commonly referred to as the domain name or simply domain, though this is also not entirely accurate (hence why I said “commonly referred to” - the proper term is the “root domain” or “apex domain”).

www.hello.com is a subdomain of hello.com for example, and asewa.asdhiasd.to is a subdomain of asdhiasd.to.

Looking at what @iCFO proposed, the idea is good, but unfortunately not all domains follow this structure, so it will not work with all domain names. UK URLs, for example, have domains that end with .co.uk or .gov.uk (e.g. yahoo.co.uk). Canadian URLs also have this kind of domain, for example .qc.ca (e.g. etatcivil.gouv.qc.ca).

So, the proposed solution would not work.
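
To make the problem concrete, here is a rough Python sketch (outside of KNIME, just for illustration). It compares the "keep the last two labels" idea with a version that checks a list of multi-label suffixes first. The names `naive_domain`, `root_domain` and `MULTI_LABEL_SUFFIXES` are made up for this example, and the suffix set is a tiny hand-picked sample; a real solution would rely on the full Public Suffix List. I am not claiming this is how the Palladian node works internally.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-picked sample of multi-label suffixes.
# A real implementation would load the full Public Suffix List instead.
MULTI_LABEL_SUFFIXES = {"co.uk", "gov.uk", "qc.ca"}


def _host(url):
    # urlsplit only recognises the host when a "//" is present,
    # so add one for bare inputs such as "www.hello.com".
    return urlsplit(url if "//" in url else "//" + url).hostname or ""


def naive_domain(url):
    # "Keep the last two labels" approach: breaks on suffixes like .co.uk.
    return ".".join(_host(url).split(".")[-2:])


def root_domain(url, suffixes=MULTI_LABEL_SUFFIXES):
    # Keep one extra label when the last two labels form a known suffix.
    labels = _host(url).split(".")
    keep = 3 if ".".join(labels[-2:]) in suffixes else 2
    return ".".join(labels[-keep:])


for url in ["www.hello.com", "https://www.yahoo.co.uk/news", "etatcivil.gouv.qc.ca"]:
    print(url, "|", naive_domain(url), "|", root_domain(url))
```

For www.hello.com both functions agree, but for the .co.uk and .qc.ca examples the naive version returns only the suffix itself, which is why a suffix list is needed.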

Palladian offers a node called URL Domain Extractor that does exactly what you are looking for:

This is an example of the node in action:
image

Palladian is a very useful extension and offers a lot of tools. I would strongly recommend getting it. However, sometimes there are company policies that do not allow “external” extensions to be installed, and if that is the case for you and you can’t get the Palladian extension, you can check what @takbb did there without it:
