Hi @WadeLo , @lelloba and @ScottF
Wow, this is a weird one…!
All the more so because using regexReplace in both Column Expressions and String Manipulations is what I use as my go-to method of putting newlines into text!
When done as a literal (i.e. typed directly in the node config) , the following works fine:
regexReplace(column("text"),"(.*)\\s(.*)","$1\r\n$2")
I don’t normally bother with the carriage return, just the line feed, but I know some apps like the “windows” CR included.
regexReplace(column("text"),"(.*)\\s(.*)","$1\n$2")
But try as I might, I couldn’t coerce it to this by populating the table:
In the end, I decided that the only way I could find to handle it was to wrap it in an additional regexReplace:
if(regexMatcher(column("text"),column("matchRegex")))
regexReplace(
regexReplace(column("text"), column("findRegex"),column("replaceRegex") )
,"\\\\r\\\\n","\r\n")
with the original regex table as this:
The result was this:
You’ll note the “inconsistency” of the double escaping required here. Not easy to predict exactly what escapes will be required!
If all those backslashes are a bit much, an alternative is to hack it slightly and have the RegexReplace string in the table replacing with different characters, e.g. ¬¬
and then use a wrapper regexReplace to replace "¬¬"
with "\r\n"
like this:
if(regexMatcher(column("text"),column("matchRegex")))
regexReplace(
regexReplace(column("text"), column("findRegex"),column("replaceRegex") )
,"¬¬","\r\n")
btw, in the uploaded workflow, the original Column Expression code uses the match()
function but that is replaced here by regexMatcher()
. The match()
function isn’t listed as an available function, so I think regexMatcher()
should be the correct function to use… But actually the condition isn’t required here, since regexReplace only performs a replacement if the expression matches, so the following would work equally well:
regexReplace(
regexReplace(column("text"), column("findRegex"),column("replaceRegex") )
,"\\\\r\\\\n","\r\n")
or (from the alternative above)
regexReplace(
regexReplace(column("text"), column("findRegex"),column("replaceRegex") )
,"¬¬","\r\n")
I’d say that somewhere in the translation of the column value into the function, some additional string handling is occurring (aka getting in the way!). It is certainly counter-intuitive.
My theory… based on the way \r and \n require only single backslashes when typed in the Column Expressions node, whereas other regex items require a second escape is that KNIME handles the typed entry of \r and \n as physical carriage return and linefeed as if they were being entered directly as those control characters. So when typed (or at least when the literal is parsed) in the Column Expressions (or String Manipulation) node, they are treated there and then as if they are actually CR and LF, whereas when they are pulled in from a data feed, they are not interpreted in the same way, and are just seen as the literal characters of backslash and r and n. So on that basis I think there is probably no way to pull those characters in from the data. The reason why the wrapping regexReplace works, is because it is converting a text string of literal back slashes and r and n into a typed (and therefore interpreted/parsed ) CR and LF. Just a theory…