Use of regexMatcher function in different nodes

I’m using the Column Expressions node and the String Manipulation node to get true or false depending on whether the text in a column matches a regular expression. However, the Column Expressions node gives this error: “An error occurred during the execution of the script: Unclosed group near index 20”. The String Manipulation node, on the other hand, raises no error but does not return the expected result. The regular expression works well here: https://regex101.com/r/1444Hh/1
I know that the regular expression must match the whole line, but I don’t understand what I’m doing wrong.
I give an example in the attached workflow.
regex_match.knwf (17.4 KB)

Hi @isra4884

The issue here is that, to use a regex properly in KNIME, you need to escape special characters. The quickest way to generate a compatible regex string is the Code Generation function on Regex101.

Click on Code Generation, select Java, and copy the content of final String regex.

In your case, the original pattern (^G:.+)(\\)(.+)(.pdf$) is transformed into (^G:.+)(\\\\)(.+)(.pdf$), where each backslash is escaped once more. This is needed because the pattern is written as a string literal, in which the backslash is itself an escape character. If you then copy the result into either the Column Expressions or the String Manipulation node, you get valid results.
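To see why the extra backslashes matter, here is a minimal plain-Java sketch of the same effect. It is not KNIME code: the class name, the helper methods and the sample paths are made up for illustration, and it only assumes that the node hands the string literal to a Java-style regex engine in a similar way.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexEscapeDemo {

    // Pattern pasted unchanged from regex101 into a string literal.
    // The literal "\\" collapses to ONE backslash, so the regex engine
    // sees "(\)" : an opening group followed by an escaped ")".
    static final String UNESCAPED = "(^G:.+)(\\)(.+)(.pdf$)";

    // Pattern as produced by regex101's Java code generator:
    // four backslashes in the literal give two in the regex,
    // which match one literal backslash in the input.
    static final String ESCAPED = "(^G:.+)(\\\\)(.+)(.pdf$)";

    // True if the pattern is syntactically valid.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false; // e.g. "Unclosed group"
        }
    }

    // True only if the whole input matches, not just a part of it.
    static boolean matchesWholeString(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Hypothetical sample value, i.e. G:\reports\summary.pdf
        String path = "G:\\reports\\summary.pdf";

        System.out.println(compiles(UNESCAPED));               // false
        System.out.println(compiles(ESCAPED));                 // true
        System.out.println(matchesWholeString(ESCAPED, path)); // true
    }
}
```

The unescaped variant fails to compile with an unclosed-group error, which is the same kind of message the Column Expressions node reported. Note that the dot in .pdf is left unescaped as in the original pattern, so it matches any character there.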

Hope this helps!


It works!!! thanks for detailed explanation!


If you use regex heavily, you can also set Java as your default code generator in the Regex101 settings.

It would be awesome if there was a shortcut to just copy the Final String Regex from the Generated Code right in the expression build / test area…
