Do you want to replace the complete contents of a cell with another string, or only parts of the string? If it is the former, the Dictionary Replacer may be more helpful than the String Replacer.
This is exactly my problem: I'd like to replace some 10-15 different expressions within a cell (not substitute the whole cell, e.g. with the Cell Replacer node).
In my case I have a column with multi-word expressions and would like to do the above transformation (to get rid of GmbH / AG / & Co. / & Co. KG [...] and the manifold misspelled variants of those ;).
To apply multiple string replacements you can use the R-Snippet node.
Here is an example of the R-Code:
# Run once if stringr is not installed yet:
# install.packages("stringr")
library("stringr")
knime.out <- knime.in
knime.out$"Column1" <- stringr::str_replace_all(knime.out$"Column1", "pattern to replace_1", "replacement_1")
# Read from knime.out (not knime.in) so the previous replacement is kept
knime.out$"Column1" <- stringr::str_replace_all(knime.out$"Column1", "pattern to replace_2", "replacement_2")
…and so on.
If you would rather keep the original column and append the replaced string as a new column:
knime.out$"Column2" <- stringr::str_replace_all(knime.in$"Column1", "pattern to replace", "replacement")
I think this is a feasible solution for a manageable number of string replacements. For a large number it would be smarter to build a dictionary and apply it by looping over the dataset. Unfortunately you can’t use the KNIME node “String Replace (Dictionary)” for this, because it doesn’t support replacing substrings, so you have to use a workaround like the R-Snippet.
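The dictionary idea could be sketched roughly like this. It is written in plain Java rather than R so that it runs outside KNIME; the class name, the method names and the dictionary entries are made up for illustration and are not part of any KNIME node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DictionaryReplacer {

    // Applies every dictionary entry in turn; each step works on the
    // result of the previous one, so replacements accumulate.
    static String applyAll(String text, Map<String, String> dictionary) {
        String result = text;
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            // String.replace treats the key as literal text, not as a regex
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    // Hypothetical example dictionary. LinkedHashMap keeps insertion order,
    // so longer expressions can be listed before their shorter parts.
    static Map<String, String> exampleDictionary() {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put(" GmbH & Co. KG", "");
        dictionary.put(" & Co. KG", "");
        dictionary.put(" GmbH", "");
        dictionary.put(" AG", "");
        return dictionary;
    }

    public static void main(String[] args) {
        System.out.println(applyAll("Muster GmbH & Co. KG", exampleDictionary()));
        System.out.println(applyAll("Beispiel AG", exampleDictionary()));
    }
}
```

The order of the entries matters: if " GmbH" were applied before " GmbH & Co. KG", the longer expression would no longer be found.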
An additional experience:
If you want to replace substrings that contain several dots, like "banana..." for instance, be aware that stringr::str_replace_all uses RegEx, and a dot in RegEx is a wildcard that matches any character. To switch off the wildcard function, put each dot in square brackets, so "banana..." is matched with the pattern "banana[.][.][.]".
I personally prefer the Java Snippet (simple) node for this task.
It’s the most intuitive implementation I have seen so far.
The code used in the Java node is easier to understand than in the R node, AND you don’t need to install an addon (and a local R installation) in order to use it.
The node content looks a little bit like this:
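A minimal sketch of such a chain of replacements, written as a standalone Java class so it can be run outside KNIME. The class name, the method name and the listed expressions are assumptions for illustration; in the Java Snippet (simple) node the same chain would be applied to the column placeholder (assumed here to be something like $Column1$) and returned.

```java
public class LegalFormStripper {

    // Removes a few legal-form suffixes from a company name.
    // Longer expressions come first so they are not cut apart
    // by the shorter ones that follow.
    static String strip(String name) {
        return name
            .replace(" GmbH & Co. KG", "")
            .replace(" & Co. KG", "")
            .replace(" & Co.", "")
            .replace(" GmbH", "")
            .replace(" AG", "")
            .trim();
    }

    public static void main(String[] args) {
        System.out.println(strip("Muster GmbH & Co. KG"));
        System.out.println(strip("Beispiel AG"));
    }
}
```

Note that replace() matches the text anywhere in the string, so a short expression like " AG" would also be removed from the middle of a longer word.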
Be careful with characters that need to be escaped with a \ for the replace() function to work.
There is a list of all such characters somewhere in the middle of the following page: Escaping characters in Java | CodeGym
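To illustrate the escaping point, here is a small sketch comparing the standard Java methods String.replace (literal text) and String.replaceAll (regular expression), plus Pattern.quote for treating a pattern literally. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class EscapingDemo {

    // replace() takes the search text literally: the dots are just dots.
    static String literal(String s) {
        return s.replace("...", "");
    }

    // replaceAll() takes a regex: each unescaped dot matches ANY character.
    static String regexUnescaped(String s) {
        return s.replaceAll("...", "");
    }

    // Pattern.quote() makes replaceAll() treat the pattern as literal text.
    static String regexQuoted(String s) {
        return s.replaceAll(Pattern.quote("..."), "");
    }

    // Inside a Java string literal a backslash is written as "\\".
    static String backslashToSlash(String s) {
        return s.replace("\\", "/");
    }

    public static void main(String[] args) {
        System.out.println(literal("banana..."));         // banana
        System.out.println(regexUnescaped("banana..."));  // empty string
        System.out.println(regexQuoted("banana..."));     // banana
        System.out.println(backslashToSlash("C:\\temp")); // C:/temp
    }
}
```

With the unescaped regex, every group of three characters is removed, which is why nothing of "banana..." is left.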