Extract a regex match from a URL

So I have a list of URLs, with varying lengths and properties, from which I need to extract the Discussion ID.

https://www.domain.es/forum-radom-subforum-name/discussion-name-75427-post2571462.html#post2571462

I’ve come up with this regex, \d++(?=-), which matches the data I want to extract (75427).

Now, how do I do this? I can easily do it with Power Query and Column From Examples, but I can’t find a way to do it in KNIME. I tried the Rule Engine node, but I don’t get any results.

Hi @iagovar,

Regarding your example, you could use this expression in the String Manipulation node:

regexReplace($column1$, ".*-(\\d+)-.*", "$1")

Where “column1” is the name of the column containing the URLs.
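To see what that pattern does outside KNIME, here is a minimal Java sketch. It assumes regexReplace behaves like Java's String.replaceAll, which I believe is the case but have not verified against the node's source; the class and method names are made up for illustration.

```java
public class DiscussionIdReplace {

    // Hypothetical helper: mimics the regexReplace call from the post above,
    // under the assumption that it works like String.replaceAll.
    static String extractId(String url) {
        // The greedy ".*" backtracks until it finds "-digits-", the digits are
        // captured as group 1, and the whole URL is replaced by that group.
        return url.replaceAll(".*-(\\d+)-.*", "$1");
    }

    public static void main(String[] args) {
        String url = "https://www.domain.es/forum-radom-subforum-name/"
                + "discussion-name-75427-post2571462.html#post2571462";
        System.out.println(extractId(url)); // prints 75427
    }
}
```

Note that when a URL contains no "-digits-" part, nothing matches and the original string comes back unchanged, so it may be worth checking for that case downstream.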

:blush:

Thanks, it worked.

If I wanted to use \d++(?=-) instead, how should I encapsulate it?

Not sure what exactly you are talking about but I am sure you can find your answers about regex here:

https://www.regular-expressions.info/
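If the question is how to use \d++(?=-) as written, one possible reading is sketched below in plain Java. The pattern on its own only finds the digits, so it suits a "find" style call; for a replace-style call it would need to be wrapped in a capture group with the rest of the string consumed around it. The class and method names are hypothetical, and the replace variant again assumes regexReplace works like String.replaceAll.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DiscussionIdLookahead {

    // The original pattern: a possessive run of digits followed by a hyphen.
    private static final Pattern ID = Pattern.compile("\\d++(?=-)");

    // Find-style use: returns the first run of digits that precedes a hyphen.
    static Optional<String> findId(String url) {
        Matcher m = ID.matcher(url);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    // Replace-style use: the same pattern wrapped in a capture group, with
    // ".*?" and ".*" consuming everything before and after the match.
    static String replaceWithId(String url) {
        return url.replaceAll(".*?(\\d++)(?=-).*", "$1");
    }

    public static void main(String[] args) {
        String url = "https://www.domain.es/forum-radom-subforum-name/"
                + "discussion-name-75427-post2571462.html#post2571462";
        System.out.println(findId(url).orElse("no match")); // prints 75427
        System.out.println(replaceWithId(url));             // prints 75427
    }
}
```

Both variants pick the first digits followed by a hyphen, so a forum name containing something like "2020-" would be returned instead of the Discussion ID.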

:blush:
