Extract 1 or 2 digits

Hi,

I want to extract information about minute durations (just the numbers) which is stored in a column with some unnecessary text, characters and other stuff like “25 ABC <> 35D ABC - 10 min”; and the unit of measurement varies from “10Min” to “4 Min”, “27min” etc.

I tried to use regex split with .(\d\d)(?:min|Min| min| Min). which returns all the 2 digit numbers in front of min/Min/ min/ Min (great success) but obviously not the 1 digit numbers.

However, when I use .(\d+)(?:min|Min| min| Min). or .(\d{1,2})(?:min|Min| min| Min). oder .([0-9]+)(?:min|Min| min| Min). it only gives me the 2nd digit (0 instead of 20) and so on.

HOW can I get all the 1 digit numbers AND 2 digit numbers with this split? I googled all the expressions for digit expressions but none of them seems to work so there must be a mistake somwhere else I guess.

Thank you in advance

1 Like

Based on what you wrote I assume the problem is the leading . .Not sure if you need it, but if so replace it by [^\d]* (not a digit). Hope the syntax is correct.

Best
Mark

5 Likes

Hello @peppapiglet,

this happens cause regex is by default greedy. To make it lazy you can use *? quantifier after leading dot.

This one should work in String Manipulation node:

regexReplace( $column1$, "(?i).*?(\\d{1,2})(?:\\s?min).*", "$1" )

And this one in Regex Split node (there you have case case insensitive check box and don’t need escaping):

.*?(\d{1,2})(?:\s?min).*

The only difference is when there is no match. String Manipulation will return string as is and Regex Split will return missing value.

Br,
Ivan

7 Likes

That solved it for me!

Thank you both so much!

Best

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.