Looking to complete many regexReplaces in one column via one node

Hi, I am still learning KNIME and don’t know if what I want to do is possible.

I want to use the regex_replace function because there are many variations of the string(s) I am looking to find, but I want a different replacement value for each regex_replace. Each regex is unique, so no value should be able to match more than one expression.

Examples: regex_replace($["Call Number Class"],'(0m!\\b4[0-4]\\.?\\d?\\b|0m"\\b21[7-9]\\.?\\d?\\b|0m"\\b22[0-3]\\.?\\d?\\b|0m#1013|0m#1113)', "violin")

regex_replace($["Call Number Class"],'(0m!\\b12[5-9]\\.?\\d?\\b|0m"\\b27[6-7]\\.?\\d?\\b)', "guitar")

Currently, I have tried the Expression node with about 15 expressions in the same node. The first expression creates a new column for the new value (while retaining the data that was not replaced). The subsequent expressions in this node were set to replace this new column, with the intention that each regex_replace inherits the changes made by the previous one(s). I have confirmed that each regex works on its own, but stringing this many together in one node doesn't seem to work, as the results of the earlier regex_replaces are not retained. I have also tried two separate Expression nodes, where the first creates the new column and the second holds the remaining 14 regex_replaces and is set to replace that column. This is not working as anticipated either, as even the changes from the first Expression node were not carried forward.

From what I can tell, the Expression node might allow this type of processing in one node, whereas the String Manipulation and String Replacer nodes seem to allow only one regex per node.

Do I need to use multiple (i.e. 15) nodes to achieve this goal and if so which node would be best suited? Or, is there something else I can use in the Expression node to make this find and replace work? Am I missing something obvious (as a non-programmer)? I have tried searching for an existing example but my search results don’t seem to answer this question.

Thank you for your help!!

Becky

I'm not too deep into all the regex ins and outs, but here are a few options:

  1. Nesting is something I wouldn't do, just because it gets ugly quickly.
  2. Best performance will be if you merge all those regexes into one and use capture groups. You may not be able to do the replace in one step, but regex matching followed by a (dictionary) replace should work.
  3. Cleanest, but worst performance, will be having your regexes and your replacement values in a table and then looping over it, where each iteration generates a new column (column name = currentIteration number). Afterwards, you use the Column Aggregator. You can also build in a safety check by not only aggregating the columns but also counting the matches, to make sure you do not have two regexes matching the same value.
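
For what it's worth, options 2 and 3 can be sketched in a few lines of Python (for example inside a Python Script node). This is only a rough sketch under assumptions: the two patterns are copied from the question, with the doubled backslashes reduced to single ones because Python raw strings are used; the names RULES, replace_with_check and replace_combined are made up for illustration; and the sample inputs in the usage note are hypothetical values, not real call numbers.

```python
import re

# (pattern, replacement) table. Only the two rules from the question are
# listed here; the remaining ones would be added in the same way.
RULES = [
    (r'(0m!\b4[0-4]\.?\d?\b|0m"\b21[7-9]\.?\d?\b|0m"\b22[0-3]\.?\d?\b|0m#1013|0m#1113)',
     "violin"),
    (r'(0m!\b12[5-9]\.?\d?\b|0m"\b27[6-7]\.?\d?\b)',
     "guitar"),
]

COMPILED = [(re.compile(pattern), label) for pattern, label in RULES]


def replace_with_check(value):
    """Option 3: try every rule and fail loudly if more than one matches."""
    hits = [(rx, label) for rx, label in COMPILED if rx.search(value)]
    if len(hits) > 1:
        labels = [label for _, label in hits]
        raise ValueError(f"{value!r} matches several rules: {labels}")
    if not hits:
        return value  # nothing matched, keep the original value
    rx, label = hits[0]
    return rx.sub(label, value)


# Option 2: one combined regex. Each rule becomes a named group (r0, r1, ...)
# and a dictionary maps the group name back to its replacement value.
COMBINED = re.compile(
    "|".join(f"(?P<r{i}>{pattern})" for i, (pattern, _) in enumerate(RULES))
)
LABELS = {f"r{i}": label for i, (_, label) in enumerate(RULES)}


def replace_combined(value):
    """Replace each match with the label of whichever named group matched."""
    def pick_label(match):
        for name, text in match.groupdict().items():
            if text is not None:
                return LABELS[name]
        return match.group(0)  # not reached: every alternative is a named group

    return COMBINED.sub(pick_label, value)
```

With these hypothetical inputs, `replace_with_check('0m#1013')` gives `'violin'` and `replace_combined('0m!125')` gives `'guitar'`, while a value that matches no rule comes back unchanged. The combined version scans each value once but cannot tell you when two rules overlap, which is why the table-driven version keeps the match count as a safety check.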