I need to extract a string when it matches a given combination of character types


I would like to extract a string from a column that contains a description, but only when the string matches a given combination of character types. For example, from “dsjbjdieEUUBFB 284847 8328HF83 BK540GG N.1000030” I want to extract the string that matches the combination “LETTER & LETTER & NUMBER & NUMBER & NUMBER & LETTER & LETTER”, that is, “BK540GG”.

I have no idea how to do this.

Thank you in advance if you can help!

Hi @mpoppi and welcome to the KNIME Community.

You need a regex for this. One that matches your rule (two letters, three digits, two letters) is [a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}
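As a rough illustration outside KNIME, here is a Python sketch that applies the pattern quoted later in this thread, [a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}, to the example description. The surrounding \b word boundaries are my own addition, not part of the original pattern. They are there on the assumption that the code should stand alone as a word rather than sit inside a longer run of letters or digits.

```python
import re

# Two letters, three digits, two letters.
# ASSUMPTION: the \b word boundaries are added for illustration so that the
# match cannot start or end in the middle of a longer alphanumeric token.
PATTERN = re.compile(r"\b[a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}\b")

description = "dsjbjdieEUUBFB 284847 8328HF83 BK540GG N.1000030"

# search() scans the whole string and returns the first match, or None.
match = PATTERN.search(description)
code = match.group(0) if match else None
print(code)
```

With the example row this should pick out BK540GG; a row with no such token gives None instead of an error.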

I’m not sure how to extract the value via KNIME nodes. The Palladian extension has a Regex Extractor node, which can do this.



EDIT: You can get the Regex Extractor from here:


Thank you Bruno :slightly_smiling_face:

I’m trying with the “Regex Split” and it doesn’t work. The message displayed is the following: “Input strings did not match the pattern or contained more groups than expected”.

I’m going to try with the node you were sharing. I’ll let you know.

Thank you anyway :slight_smile:

Edit: my source is an Excel Reader node. Maybe this could be a problem (?)

@mpoppi it should be possible with the String Manipulation node (see NodePit).
I don’t have a KNIME environment available right now, so I can’t add a screenshot for you; this is from memory.

Use this as the expression:
regexReplace($yourColumn$, "(.*)([a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2})(.*)", "$2")

You’ll recognise the second group as the regex given by @bruno29a. A group in a regex is anything between parentheses ( ). Whatever a group matches can be reused in the replacement as $1 for the first group, $2 for the second, and so on.
In the formula above, the first and third groups are simply left out of the replacement argument, which leaves only the second group, the part you are looking for.
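To make the group logic concrete, here is the same replacement as a small Python sketch. It is not KNIME code: `extract_code` is a hypothetical helper name, and Python writes the back-reference as \2 where the String Manipulation expression uses $2. I am assuming regexReplace behaves like an ordinary regex replace here.

```python
import re

# Same three-group pattern as in the expression above.
GROUPED = r"(.*)([a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2})(.*)"

def extract_code(text: str) -> str:
    """Hypothetical helper: keep only what group 2 matched."""
    # Groups 1 and 3 swallow everything before and after the code,
    # so replacing the whole match with \2 leaves just the code.
    return re.sub(GROUPED, r"\2", text)

print(extract_code("dsjbjdieEUUBFB 284847 8328HF83 BK540GG N.1000030"))
print(extract_code("no code here"))          # nothing matches: text comes back unchanged
print(extract_code("AB123CD and EF456GH"))   # greedy (.*) keeps the last candidate
```

Two things to watch for with this approach: a row without a matching code is returned unchanged rather than emptied, and because the leading (.*) is greedy, a row with several candidates yields the last one.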


Hi @JanDuo, that’s exactly what I was looking for. I’m not an expert with regex. In fact, I only started writing regex on my own a couple of months ago, so I’m not too familiar with what $1, $2, etc. are, but you pretty much explained them. It’ll take some time for me to get used to using them.

I initially tried to use the negate sign (^), thinking I could remove anything NOT matching, but that did not work.

Thank you for sharing.
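On the ^ attempt: in most regex flavours ^ only negates inside square brackets, where it means "any single character not in this set". Outside brackets it anchors the match to the start of the string, so it cannot express "everything that is not this multi-character pattern". A small Python sketch of the difference, with variable names made up for illustration:

```python
import re

text = "BK540GG N.1000030"

# Inside [...], ^ negates the character class: remove every non-digit.
digits_only = re.sub(r"[^0-9]", "", text)

# Outside [...], ^ is an anchor: the pattern must begin at the start of the string.
starts_with_code = re.match(r"^[a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}", text) is not None

print(digits_only)
print(starts_with_code)
```

That is why capturing the wanted part in a group and keeping only that group tends to be easier than trying to delete everything else.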

A bit off-topic, but you can use the same regex replace logic when you want to rename columns using regex, with the Column Rename (Regex) node (see NodePit).
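As a sketch of the same idea applied to column names, here is a Python version. The column names and the renaming rule are invented for illustration, and \1 stands in for the $1 you would type in a KNIME replacement field.

```python
import re

# Hypothetical column names, made up for this example.
columns = ["sales_2021", "sales_2022", "region"]

# Capture the year in group 1 and move it to the front of the name.
# Names that do not match the pattern are left unchanged.
renamed = [re.sub(r"^sales_(\d{4})$", r"\1_sales", name) for name in columns]

print(renamed)
```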

Thank you all guys!

I’ve tried the solution suggested by @bruno29a and it worked fine, while the “Regex Split” node still doesn’t work and I don’t know why.

So thank you so much Bruno, you really helped me :slight_smile:
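On the Regex Split error: I can’t say for sure what went wrong, but one plausible cause is that the pattern was used without capture groups or did not cover the whole cell. This assumes Regex Split expects the pattern to match the entire input string and returns the captured groups as new columns. The Python sketch below mimics that assumption with `re.fullmatch`; it is an analogy, not the node’s actual implementation.

```python
import re

description = "dsjbjdieEUUBFB 284847 8328HF83 BK540GG N.1000030"

# The bare pattern describes only the 7-character code, not the whole cell.
bare = r"[a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}"

# Wrapped version: .* absorbs the rest of the cell, and the parentheses
# mark the part to return as a group.
wrapped = r".*([a-zA-Z]{2}[0-9]{3}[a-zA-Z]{2}).*"

bare_result = re.fullmatch(bare, description)        # whole-string match fails
wrapped_result = re.fullmatch(wrapped, description)  # whole-string match succeeds

print(bare_result)
print(wrapped_result.group(1))
```

If that assumption holds, a pattern of the wrapped form would be the one to try in Regex Split.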

Does that mean it’s solved or not? If so, you could mark the solution for others.

