I am stuck on using regular expressions in the String Manipulation node. This will be an example of text analysis for new KNIME users. My constraint is that I need to use a standard KNIME node rather than a language snippet, a Palladian node, or something similar.
The main idea is to detect “password,” “passcode,” or a similar term such as “passwd,” “passwrod,” or “pass code,” while excluding terms such as “passkey,” “pass ” (with a trailing space), or “unsurpassed.” Using the Java 8 flavor on https://regex101.com, I constructed this regex, which does exactly what I need there:
(\b)pass[ ]?[word]{2,4}(\b)|(\b)pass[ ]{0,1}[code]{2,4}(\b)
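To make the intended behavior concrete outside KNIME, here is a minimal sketch in plain Java that runs the same pattern against the terms listed above. The class name PassTermCheck and the method containsPassTerm are made up for illustration, and I am assuming that find() (a search anywhere in the text) is the closest equivalent to how regex101 reports matches. This is not the String Manipulation node's expression language, so it only shows what I expect the pattern to do.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of KNIME.
final class PassTermCheck {
    // Same pattern as above. The backslashes are doubled only because
    // this is a Java string literal.
    static final Pattern PASS_TERM = Pattern.compile(
        "(\\b)pass[ ]?[word]{2,4}(\\b)|(\\b)pass[ ]{0,1}[code]{2,4}(\\b)");

    // True if the pattern occurs anywhere in the text (substring search).
    static boolean containsPassTerm(String text) {
        return PASS_TERM.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "password", "passcode", "passwd", "passwrod", "pass code", // should match
            "passkey", "pass ", "unsurpassed"                          // should not match
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + containsPassTerm(s));
        }
    }
}
```

With this sketch, the first five samples report true and the last three report false, which is the behavior I am trying to reproduce in the node.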
In the String Manipulation node in KNIME, my current configuration uses this expression:
regexMatcher($Full Description$,
".*\bpass[ ]?[word]{2,4}(\b)|(\b)pass[ ]{0,1}[code]{2,4}\b.*")
I have tried several variations, with and without the leading and trailing .* and with and without parentheses around \b. My sample data set contains plenty of “password” and “passcode” strings, but every row comes back as False in the output table. What am I missing?
Thank you for your help!