Split string sequence with Regex Split

Nico1990 · October 18, 2013, 2:51pm

Hello,

I have this kind of protein sequence: YLLLEPYFAVWY in a column.

I would like to put each amino acid (letter) in its own column using the Regex Split node.

YLLLEPYFAVWY

Y

L

...

Unfortunately, I do not know the regex that does that.

With ([A-Z]{1})(.*$) I get:

Y	LLLEPYFAVWY

which is nearly what I need, nearly...
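For anyone who wants to reproduce this outside KNIME, here is a minimal sketch in Python (an assumption on my part; the node itself uses Java regex, but this pattern behaves the same in both). It shows why only two columns come out: the pattern has exactly two capturing groups, so there are exactly two pieces.

```python
import re

sequence = "YLLLEPYFAVWY"

# Same pattern as in the post: group 1 takes a single uppercase letter,
# group 2 takes everything that is left up to the end of the string.
match = re.match(r"([A-Z]{1})(.*$)", sequence)

# One output piece per capturing group, so two groups give two pieces.
first, rest = match.groups()
print(first, rest)
```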

Does anyone know the right regex?

Thank you

Nico

richards99 · October 20, 2013, 11:50pm

Wow, not easy. As far as I understand, a regex cannot generate capturing groups automatically to match the length of the input, which is what you would need here. You would have to repeat ([A-Z]) once per letter and finish off with a .*
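To illustrate the repeated-group idea, here is a rough sketch in Python rather than in the node itself. The variable `n_groups` is a hypothetical column count chosen for the example, and the final group is written as `(.*)` so the remainder is also captured.

```python
import re

sequence = "YLLLEPYFAVWY"

# Hypothetical number of single-letter columns wanted; it has to be fixed
# up front because the number of capturing groups cannot grow with the input.
n_groups = 5

# Build "([A-Z])([A-Z])...(.*)": one group per letter, then the remainder.
pattern = "([A-Z])" * n_groups + "(.*)"

groups = re.match(pattern, sequence).groups()
print(groups)
```

Note that a sequence shorter than `n_groups` letters would not match at all, which is one reason this approach is awkward for sequences of varying length.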

That is hardly ideal; if someone knows a way to do it directly, please share. A better way is...

Use the String Replacer node, choose regular expression as the pattern type, and enter ([A-Z]) as the pattern.

In the replacement text, put $1, (that is, $1 followed by a comma) and specify to replace all occurrences. This will put a comma after every letter. Now use a Cell Splitter node with , as the delimiter.

Job done.
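The same two steps can be sketched in Python for anyone who wants to check the logic outside KNIME. This is only an approximation of what the nodes do: `split_residues` is a made-up helper name, and Python writes the backreference as `\1` where the node uses `$1`.

```python
import re


def split_residues(sequence):
    """Split a sequence into single letters via replace-then-split."""
    # Step 1 (String Replacer): put a comma after every uppercase letter.
    with_commas = re.sub(r"([A-Z])", r"\1,", sequence)
    # Step 2 (Cell Splitter): split on the comma. In Python the trailing
    # comma leaves an empty final piece, so empty pieces are dropped here.
    return [piece for piece in with_commas.split(",") if piece]


letters = split_residues("YLLLEPYFAVWY")
print(letters)
```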

That caused me a lot of head scratching.

Simon.

Nico1990 · October 21, 2013, 4:37pm

Hello Simon,

This is wonderful! Thank you very much!

However, for anyone reading this later: the pattern is ([A-Z]){1}, and the replacement text is indeed $1, (with the comma).

It seems this head scratching exhausted you ;)

Nico

ScottF · March 27, 2023, 9:18pm

2 posts were split to a new topic: RegEx Split Question