Regex-like syntax for STRING MANIPULATION node?

Hi,

Is there a regex-like syntax that can be incorporated into the STRING MANIPULATION node? I am only interested in string position at this point, although any other information would be helpful.

For example, when standardizing data I need to change or remove abbreviations. Ideally I would use replace($column$, "CO", "COMPANY"), but this replaces "CO" wherever it occurs. I would rather use a pattern anchored to the end of the string, more along the lines of ( CO)$

Is this possible?  Should I be using a different node?  Thanks in advance,

Mark

Hello Mark,

You may want to look into the Java node and use $column$.replace(" CO", " COMPANY");

HTH,

Fred

Fred,

Will that focus the replace on discrete strings containing only the characters "CO"?  The problem I am having is defining the end of the string element as the end of the string.  Otherwise, I end up replacing strings I do not intend to replace.

Thanks for your help thus far. I will go try it and see what I get! Otherwise, I have a partial workaround, but it is not that elegant.

Thanks again,

Mark

You may have a look at the documentation of the String class for details.
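
To make the end-of-string question concrete, here is a minimal sketch in plain Java, assuming the goal is to expand " CO" only when it is the last thing in the value. String.replace() treats its first argument as literal text and replaces every occurrence, while String.replaceAll() takes a regular expression, so the $ anchor can be used there. The class and method names are made up for illustration; inside a KNIME Java node you would only need the replaceAll call applied to your column.

```java
public class AbbreviationReplace {

    // Expands a trailing " CO" to " COMPANY".
    // The $ anchor restricts the match to the end of the input.
    static String expandTrailingCo(String value) {
        return value.replaceAll(" CO$", " COMPANY");
    }

    public static void main(String[] args) {
        System.out.println(expandTrailingCo("ACME CO"));            // ACME COMPANY
        System.out.println(expandTrailingCo("COLORADO MINING CO")); // COLORADO MINING COMPANY
        System.out.println(expandTrailingCo("CO OP BANK"));         // CO OP BANK (unchanged)
        System.out.println(expandTrailingCo("COSTCO"));             // COSTCO (unchanged)
    }
}
```

Keeping the leading space in both the pattern and the replacement is what stops values such as "COSTCO" from being touched.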

Hello,

I know the post is quite old.

But how can I replace all punctuation characters (the regex class is \W)?

I created a Java Snippet node with the following code in the "code area"

c_ABSTRACT.replaceAll("\W", "");

I always get the error

"Invalid escape sequence (valid ones are  \b  \t  \n  \f  \r  \"  \'  \\ )"

Why is this happening? The \W character class is supported by Java.

My goal is to remove punctuation from a dataset with 4 columns, where each column contains a text (not just a string).

I'm just starting to use KNIME.

Thanks

In a Java string literal the backslash is itself an escape character, so you have to double it for the regex engine to see \W:

c_ABSTRACT.replaceAll("\\W", "");
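
A small sketch of what that call does, in plain Java outside KNIME. Two things may be worth knowing: \W matches every character that is not a letter, digit, or underscore, so spaces are removed as well, and replaceAll returns a new string rather than changing the original, so the result has to be assigned somewhere. If only punctuation should go, \p{Punct} is one alternative. The class name, method names, and sample text are invented for illustration.

```java
public class PunctuationStrip {

    // "\\W" in Java source is the regex \W: anything that is not
    // a letter, digit, or underscore (this includes whitespace).
    static String stripNonWord(String text) {
        return text.replaceAll("\\W", "");
    }

    // "\\p{Punct}" matches ASCII punctuation only, so whitespace is kept.
    static String stripPunctuation(String text) {
        return text.replaceAll("\\p{Punct}", "");
    }

    public static void main(String[] args) {
        String sample = "Hello, world! It's 2-for-1.";
        System.out.println(stripNonWord(sample));     // HelloworldIts2for1
        System.out.println(stripPunctuation(sample)); // Hello world Its 2for1
    }
}
```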