String Manipulation

#1

I have a data set as follows

6310DX_34782_3519
34782 - Wireless
6310DX_40235_0524
6310DX_38657_3164
38657 - Wireless
6310DX_38064_3004

I need to remove the last 5 characters only from items that end with an underscore followed by four digits, leaving all other items untouched.

I tried with a String Manipulation as follows:
substr($RIMINSTANCE_NAME$, 0, (length($RIMINSTANCE_NAME$)-5))

This removes the last 5 characters regardless of what those characters are.
Any suggestions on what the syntax should be for this “String Manipulation” node to work properly?
Maybe this could also be done with a “Rule Engine”.
Thank you.

#2

This sounds like a job for regular expressions! I’m not a RegEx wizard, but several in our community are - for example, @armingrudd :slight_smile:

#3

Hi there @Barajas,

welcome to KNIME Community!

For this you can combine a couple of functions in the String Manipulation node, though I would probably use regex if I knew how :smiley:

reverse(substr(reverse($ID$), indexOf(reverse($ID$), "_") + 1))
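
To see why this works, here is a minimal Python sketch of the same reverse / find / slice idea, run outside KNIME. The function name `strip_after_last_underscore` is made up for illustration, and Python's `str.find` stands in for `indexOf`: both return -1 when nothing is found, so adding 1 gives a start index of 0 and values without an underscore come back unchanged.

```python
def strip_after_last_underscore(value: str) -> str:
    """Drop the last underscore and everything after it, if there is one."""
    reversed_value = value[::-1]
    # Index of the first "_" in the reversed string, i.e. the last "_" in the original.
    # find() returns -1 when there is no underscore, so the slice below starts at 0.
    idx = reversed_value.find("_")
    return reversed_value[idx + 1:][::-1]


print(strip_after_last_underscore("6310DX_34782_3519"))  # 6310DX_34782
print(strip_after_last_underscore("34782 - Wireless"))   # 34782 - Wireless
```

Note that this approach cuts at the last underscore no matter what follows it, so a value like 6310DX_34782_ABC would also be shortened to 6310DX_34782. That is fine for the sample data, but it does not check for "four digits".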

Br,
Ivan

#4

It’s not elegant, but you could use this regex in a Regex Split node:

(.* - .|.[0-9][0-9][0-9][0-9][0-9])??.*

@ipazin’s example needs fewer nodes, as you don’t need to tidy up the split columns as you do with the regex approach.

Cheers

Sam

#5

Hi @Barajas and welcome to the KNIME forum,

Here is the expression I suggest using in the String Manipulation node:

regexReplace($RIMINSTANCE_NAME$, "(.*)_\\d{4}", "$1")
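
As a rough illustration of how the pattern behaves, here is a small Python sketch run outside KNIME on the sample values from the question. The helper names are hypothetical. The second variant adds a `$` anchor so that only a trailing "underscore plus four digits" is removed; whether the anchor is needed inside KNIME depends on whether `regexReplace` replaces a match found anywhere in the string, which is an assumption here and not something confirmed in this thread.

```python
import re

# Same pattern as in the expression above (unanchored).
UNANCHORED = re.compile(r"(.*)_\d{4}")
# Variant with an end-of-string anchor (an addition for illustration).
ANCHORED = re.compile(r"(.*)_\d{4}$")


def strip_unanchored(value: str) -> str:
    """Replace the match with its first group, wherever the match occurs."""
    return UNANCHORED.sub(r"\1", value)


def strip_anchored(value: str) -> str:
    """Only strip when the value ends with an underscore and four digits."""
    return ANCHORED.sub(r"\1", value)


samples = ["6310DX_34782_3519", "34782 - Wireless", "6310DX_40235_0524"]
for s in samples:
    print(s, "->", strip_anchored(s))
# 6310DX_34782_3519 -> 6310DX_34782
# 34782 - Wireless -> 34782 - Wireless
# 6310DX_40235_0524 -> 6310DX_40235
```

On the sample data both variants give the same result. They differ only for a value such as 6310DX_34782, which ends in five digits: the unanchored pattern matches "_3478" in the middle and yields 6310DX2, while the anchored one leaves the value unchanged.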

:blush:
