How to get a text value from a string

Hi All,

Good day! I’m new to KNIME and would like some help. I want to extract a certain text value from a string. See the example table below. I want the text between the two dashes, and if the string doesn’t contain two dashes, the result can just be blank.

Thank you in advance

[image: example table]

Hi @fair_man24 , welcome to the KNIME community!

Which two dashes do you mean, as your examples have three dashes? Perhaps you can take one example row and show what value (or values) you would want returned for it. Thanks.

Hi @takbb,

Thank you for checking out my query. Please find the table below. The results I want are the substrings highlighted in yellow.

[image: table with the wanted substrings highlighted in yellow]

Hi @fair_man24 , you should be able to do this with a String Manipulation node, using regular expressions:

Give it the following:

regexMatcher($SPACE CHECK$, ".*\-.*?\-(.*?)\-.*").equals("True")
?regexReplace($SPACE CHECK$,".*\-.*?\-(.*?)\-.*" ,"$1" ):""

This checks whether SPACE CHECK contains at least three hyphens (in your examples, one at the beginning and then two later). If it does, it returns the portion between the second and third hyphen. If not, it returns an empty string.

I’m assuming the presence of the dash at the beginning of the string, but if that isn’t always there, then this would probably need to be modified.
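For anyone who wants to see what the pattern above captures outside KNIME, here is a minimal sketch in Python. It uses Python's `re` module rather than the String Manipulation node, so treat it only as an illustration of the regex logic. The sample strings and the function name `extract_between_dashes` are made up for the example and are not taken from the original table.

```python
import re

# Same pattern as in the post above; Python's re syntax accepts it unchanged.
PATTERN = re.compile(r".*\-.*?\-(.*?)\-.*")


def extract_between_dashes(value: str) -> str:
    """Return the captured group if the whole string matches, else ""."""
    # fullmatch mirrors the "entire string must match" check done before replacing.
    match = PATTERN.fullmatch(value)
    return match.group(1) if match else ""


# Hypothetical sample values (assumptions, not the poster's data).
samples = ["-AB12-CD34-EF56", "-AB12-CD34", "no dashes here"]
for sample in samples:
    print(repr(sample), "->", repr(extract_between_dashes(sample)))
```

One thing to be aware of: because the leading `.*` is greedy, a string with more than three hyphens returns the text between the last two hyphens, not between the second and third.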

@takbb, I entered the expression but the results it provides are all blank.

Below is the column where the results will appear.

Hi @takbb,

The expression you provided did work! I thought it didn’t, but on checking I found I had used the wrong node. I used the Column Expressions node instead of the String Manipulation node.

Thank you so much till next time… have a great day.

@takbb, just curious: when I used your code in the Column Expressions node it gave blank results, unlike the String Manipulation node. Could you also give me the correct expression for the Column Expressions node? That way I don’t need to add another node, since the last node in my workflow is a Column Expressions node.

Thank you so much again.

Hi @fair_man24 ,

Glad to hear it worked for you.

I’m not a big user of the Column Expressions node. Almost the only time I ever use it is in answering questions about it, :wink: but the direct equivalent of the code I gave above would be this:

regexMatcher(column("SPACE CHECK"),".*\-.*?\-(.*?)\-.*")
 ?regexReplace(column("SPACE CHECK"),".*\-.*?\-(.*?)\-.*" ,"$1")
 :""

and the more verbose version is this:

if (regexMatcher(column("SPACE CHECK"),".*\-.*?\-(.*?)\-.*"))
{
    regexReplace(column("SPACE CHECK"),".*\-.*?\-(.*?)\-.*" ,"$1")
}
else
{
    ""
}

Although with my old software engineer hat on, I’d probably go for this, to avoid unnecessary repetition and to make it more portable:

pattern=".*\-.*?\-(.*?)\-.*"
columnValue=column("SPACE CHECK")
if (regexMatcher(columnValue,pattern))
  {regexReplace(columnValue,pattern ,"$1") } 
else
  {""}

Both should work. There may be a better/alternative way in Column Expressions, but as I say, I’m not a big user of it.

Hi @fair_man24 .
Try the Cell Splitter node.

BR
Hugo
PS: Please don’t write your titles in uppercase :wink:
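
The Cell Splitter suggestion above amounts to splitting the string on the dash and keeping one of the pieces. A rough sketch of that idea in Python follows. It is not the node itself, which writes the pieces to separate columns. The function name `third_piece`, the sample strings, and the choice of the third piece are assumptions based on values that start with a dash.

```python
def third_piece(value: str, delimiter: str = "-") -> str:
    """Split on the delimiter and return the text between the 2nd and 3rd dash."""
    parts = value.split(delimiter)
    # A leading dash yields an empty first piece, so the wanted text sits at index 2.
    # Requiring four pieces ensures there is a closing dash after it.
    return parts[2] if len(parts) >= 4 else ""


# Hypothetical sample values (assumptions, not the poster's data).
for sample in ["-AB12-CD34-EF56", "-AB12-CD34", "no dashes here"]:
    print(repr(sample), "->", repr(third_piece(sample)))
```

Unlike the regex approach, this always takes the text between the second and third dash, even when the string contains more than three dashes.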