Remove everything after Nth character

I have an Excel file with a column of full URLs, and I’m trying to extract the root domain from each one. The URLs are roughly in this format:
http://www.domain.com/directory/page

I need to get it down to:
http://www.domain.com

The URLs are a mix of http and https. The length of the URL and the number of forward slashes also vary, so the most reliable way seems to be to cut off everything at the third forward slash. I’ve been trying to work with the Expressions Editor in the Column Expressions node, but I can’t find any function that gives me the location of the Nth appearance of a character. What would be the best way to remove everything from the third forward slash to the end of the string?

Hi @Data1981

My choice would be the Cell Splitter node, splitting the URL on the forward slash. Optionally, you can limit the number of output columns by setting Set Array Size to 4.

Regards, Hans
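
For readers who want to see the split-on-slash idea outside KNIME, here is a minimal Python sketch. It is not the Cell Splitter node itself; the function name root_by_split is hypothetical, and it assumes every URL starts with a scheme followed by "//".

```python
def root_by_split(url: str) -> str:
    """Keep scheme and host by splitting on '/' (illustrative helper, not a KNIME API)."""
    # Split at most 3 times: ["http:", "", "www.domain.com", "directory/page"]
    parts = url.split("/", 3)
    # Rejoin the first three pieces: scheme, the empty piece between "//", and the host
    return "/".join(parts[:3])


print(root_by_split("http://www.domain.com/directory/page"))
```

The fourth piece holds everything after the third slash, which is why limiting the split to four outputs is enough.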

Hi, @Data1981

My choice is the String Manipulation node with the formula:
substr($url$, 0, indexOf($url$, "/", 8))

Regards
Hugo
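
The same idea can be sketched in Python, as a rough analogue rather than the KNIME expression itself. Starting the search at offset 8 skips the two slashes of "http://" or "https://", so the first slash found is the third one overall. The function name root_by_index is hypothetical, and the fallback for URLs with no third slash is an assumption of this sketch.

```python
def root_by_index(url: str) -> str:
    """Cut the URL at the first '/' found at or after offset 8 (illustrative helper)."""
    # Offset 8 is past "http://" (7 characters) and "https://" (8 characters)
    cut = url.find("/", 8)
    # str.find returns -1 when nothing is found; in that case keep the URL unchanged
    return url if cut == -1 else url[:cut]


print(root_by_index("https://www.domain.com/directory/page"))
```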

Hi @Data1981 , there are multiple ways to do this.

In addition to the above methods, you can also use the “String to URI” node with the “Extract URI Info” node to achieve what you are trying to do.
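
As a comparison outside KNIME, parsing the URL into its components and rebuilding only the scheme and host looks roughly like this in Python. This uses the standard library's urllib.parse and is only an analogue of the two nodes, not their implementation; root_by_parse is a hypothetical name.

```python
from urllib.parse import urlsplit


def root_by_parse(url: str) -> str:
    """Rebuild scheme://host from a parsed URL (illustrative helper)."""
    parts = urlsplit(url)
    # netloc is the host part, including a port or user info if the URL has one
    return f"{parts.scheme}://{parts.netloc}"


print(root_by_parse("http://www.domain.com/directory/page"))
```

Parsing has the advantage of not depending on slash positions at all.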

Thanks everyone. I wound up going with regex after several tries:

regexReplace(column("news_attachment_name"), "^(([^/]*/){3}).*", "$1")
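
For anyone testing the pattern outside KNIME, here is a small Python sketch of the same regex. Note that capturing three "text plus slash" groups keeps the third slash in the result; the second function is a hypothetical variant that drops it. Both function names are made up for illustration.

```python
import re


def root_with_slash(url: str) -> str:
    """Keep everything up to and including the third '/' (same pattern as above)."""
    return re.sub(r"^(([^/]*/){3}).*", r"\1", url)


def root_without_slash(url: str) -> str:
    """Variant: keep two 'text plus slash' groups and the host, dropping the third '/'."""
    return re.sub(r"^((?:[^/]*/){2}[^/]*).*", r"\1", url)


url = "http://www.domain.com/directory/page"
print(root_with_slash(url))
print(root_without_slash(url))
```

A URL with fewer than three slashes does not match the first pattern and is returned unchanged.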
