As I am not yet very experienced with KNIME, I have a problem with the regexMatcher function in the String Manipulation node that I cannot solve. I would like to match two columns in two tables that look like this:
Table 1 contains names and institutions ((faked) company, city and country); Table 2 contains some (faked) city names. I would like to pick out the correct city names from the institutions.
In my workflow, I use a Cross Joiner to combine the values of the two columns. After that, I use a String Manipulation node with the regexMatcher function, and then a Row Filter to select the “True” values.
In lines 1, 3 and 6, regexMatcher has matched only part of a word, so the result is wrong. How can I avoid this? How can I set up regexMatcher so that it matches only whole words and not parts of them? I have experimented a bit with the \b operator, but as I am really no RegEx specialist, the results were even worse. Please, can you help me? Thank you in advance.
The word boundaries are, as you said, “\b” in regex, but in the String Manipulation node you will hit problems if you don’t “escape” the “\” in the string, so for each \b, you need to place an extra “\” in front:
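As a rough sketch of that escaping (not the exact expression from this thread): String Manipulation expressions follow Java string rules, so the same idea can be tried out in plain Java. The class name, method name, and sample strings below are made up for illustration, and the KNIME expression shown in the comment assumes hypothetical column names `Institution` and `City`.

```java
import java.util.regex.Pattern;

public class WholeWordMatch {

    // Returns true only if `city` occurs in `institution` as a whole word.
    // Assumed KNIME equivalent (hypothetical column names):
    //   regexMatcher($Institution$, join(".*\\b", $City$, "\\b.*"))
    public static boolean containsWholeWord(String institution, String city) {
        // "\\b" in the source code is the two-character regex token \b (word boundary).
        // Pattern.quote stops characters in the city name from being read as regex syntax.
        String regex = ".*\\b" + Pattern.quote(city) + "\\b.*";
        // matches() must cover the whole string, hence the leading and trailing ".*".
        return institution.matches(regex);
    }

    public static void main(String[] args) {
        // City appears as its own word.
        System.out.println(containsWholeWord("Acme Ltd, Bonn, Germany", "Bonn"));
        // City is only the start of a longer word.
        System.out.println(containsWholeWord("Acme Ltd, Bonnfeld, Germany", "Bonn"));
    }
}
```

The first call prints `true` and the second `false`, because the trailing `\b` cannot match between the "n" and "f" of the longer word. With a single backslash, `"\b"` would be read as a backspace character instead of a word boundary, which is why the match fails without the extra “\”.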
btw, when posting code, to ensure that the forum doesn’t convert the double quotes into “smart quotes” (and to have it display every \ correctly), highlight the code and click the “preformatted text” button on the forum message toolbar
i.e.
Dear takbb,
Thank you very much for your support, this solved my issue. And thank you also for the tip about displaying code, I will keep it in mind.