Hi @Heinz, I’m not convinced that the VBA code you have given would return null for “def d5 abc”, since the trim of the two chars to the left of abc would give “5”, and there is nothing in that vba code to check if the 5 is preceded by a non-numeric.In your examples you say it should return “missing” rather than 5.
I would also never have guessed your expected outcomes for the “12” values based on your earlier example, but I can try to adapt for the required outcomes ![:wink: :wink:](https://forum-cdn.knime.com/images/emoji/twitter/wink.png?v=12)
If you are ok doing this with some code you can achieve it with String Manipulation as follows:
toInt(
regexMatcher(strip(substr($column value$, indexOf($column value$,"abc")-3,
indexOf($column value$,"abc")>2?3:indexOf($column value$,"abc"))),
".?\\d{2}|\\d{1}"
)=="True"
?strip(substr($column value$, indexOf($column value$,"abc")-2,
indexOf($column value$,"abc")>1?2:indexOf($column value$,"abc"))
)
:toNull("")
This applies all the rules that I’ve so far ascertained from your examples. It looks at the three characters prior to “abc” to determine that they are only either whitespace or numeric, and then if they are, it returns the trimmed two characters to the left of “abc”
If you want to try the new Expressions node (KNIME 5.3 onwards), the following expression is the equivalent of the above String Manipulation. You can see significant similarities in the structurem although the keywords have changed, and the other important difference is that the index of characters in a string start at 1 for Expressions, whereas they start at 0 for String Manipulation, so you will see there is a small adjustment:
parse_int(
if(regex_match(strip(substring($["column value"], find($["column value"],"abc") - 3,
if(find($["column value"],"abc")>3, 3, find($["column value"],"abc")-1) ) ),
".?\\d{2}|\\d{1}"
)
,strip(substring($["column value"], find($["column value"],"abc") -2,
if(find($["column value"],"abc")>2, 2, 1) )
)
,empty_to_missing("")
)
)
For a “no-code” version, an alternative is to use StringSplitter (Regex) with the following pattern:
.*?(\d{2})abc.*|.*?\d(\d)\sabc.*|.*\b(\d)\s?abc.*
This then returns three columns (one for each captured group of digits, according to the pattern found).
The three columns can then be merged using a Column Aggregator or a pair of Column Merger nodes.
In the attached workflow, I have created a flow variable called “pattern” in the Variable Creator, and this is then included within the code and the regex using the variable value rather than the “abc” literal, to make it more flexible.
Find numeric in regex.knwf (88.8 KB)