String Manipulation

I have a string and need to remove the redundant part of it, which follows a pattern.

GRILLS > Electric, Gourmet Sale > GRILLS > Electric

I need to remove "Gourmet Sale > GRILLS > Electric" and keep only
GRILLS > Electric

I can delete a single word, but I need the manipulation to remove everything that follows the specified word. I could not do this with the * wildcard.


Can you give more examples?


Have you tried the Cell Splitter node? Split at comma; output as list and then ungroup the list. Or simply save as “new columns” and remove the one you don’t need.
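
Outside KNIME, the same split-at-comma idea can be sketched in a few lines of Python. This only illustrates the logic and is not what the Cell Splitter node does internally; the function name `keep_first_part` is made up for the example.

```python
def keep_first_part(text: str, delimiter: str = ",") -> str:
    """Split the text at the delimiter and keep only the first piece.

    Hypothetical helper: it mimics "split at comma, keep the first
    column" and trims the surrounding whitespace of that piece.
    """
    parts = [part.strip() for part in text.split(delimiter)]
    return parts[0]


sample = "GRILLS > Electric, Gourmet Sale > GRILLS > Electric"
print(keep_first_part(sample))
```

If the text contains no comma, the whole (trimmed) text comes back unchanged.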


Hello @julianLago123 and welcome to the KNIME Community

I think you can solve this with a single ‘String Manipulation’ node using regexReplace(). Here are two different approaches; which one fits depends on your real data and on which part of the text you want to capture:

  1. Remove the text after the comma and keep the text before it, similar to the Cell Splitter approach by @rfeigel:
    [GRILLS > Electric], Gourmet Sale > GRILLS > Electric
    regexReplace($text$, "(.*),.*", "$1")
  2. Keep the last two items in the hierarchy:
    GRILLS > Electric, Gourmet Sale > [GRILLS > Electric]
    regexReplace($text$, ".*>\\s+([\\S\\s]+>[\\S\\s]+)", "$1")
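
To try the two patterns outside KNIME, here is a minimal Python sketch. It assumes Python's `re` module behaves the same as the node for these particular patterns; the function names are made up for the example. In a Python raw string the backslashes are written once, and the captured group is referenced as `\1` instead of `$1`.

```python
import re


def keep_before_comma(text: str) -> str:
    # Approach 1: the greedy group captures everything up to the last
    # comma; the comma and whatever follows it are dropped.
    return re.sub(r"(.*),.*", r"\1", text)


def keep_last_two_levels(text: str) -> str:
    # Approach 2: the greedy prefix consumes as much as it can while the
    # group still contains one ">" separator, so the group ends up
    # holding the last two items of the hierarchy.
    return re.sub(r".*>\s+([\S\s]+>[\S\s]+)", r"\1", text)


sample = "GRILLS > Electric, Gourmet Sale > GRILLS > Electric"
print(keep_before_comma(sample))     # GRILLS > Electric
print(keep_last_two_levels(sample))  # GRILLS > Electric
```

Both functions return the input unchanged when the pattern does not match, for example when there is no comma (approach 1) or fewer than two ">" separators (approach 2).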

BR


Thank you very much for the help. I was able to solve it with both tips, plus some additional techniques I had been researching.
