Several replaces using regexReplace in the String Manipulation node

Hi all,

I am trying to use regexReplace inside the String Manipulation node to find and replace conflicting protein names. For example:

regexReplace($Submitted entities found$, "NUP107;", "X0X033;")

However, I want to be able to replace more than one protein name using the same node. For example:

regexReplace($Submitted entities found$, "NUP107;", "X0X033;")
and
regexReplace($Submitted entities found$, "ATF2;", "XOXO44;")

But when I put both lines together, the node gives me an error message.
Is there an expression or regex symbol that I can put between these two lines to tell the node to execute both of them?

Thanks in advance,

AG

Do you really need regex for that?
Maybe the Rule Engine (Dictionary) node with wildcards can help you achieve this.

Hi @VAGR_ISK

I guess this new question is related to the previous one, for which I provided a possible solution. If I understood correctly, and building on that previous solution, I have implemented a new one, attached below.

It is based on a recursive loop because I believe the same string has to be processed several times so that each Protein Name found is replaced in turn. For instance, a sentence containing 3 different Protein Names needs 3 iterations in total before all of them are replaced.

The first part of the previous solution was essentially this one below:

As you can see, I have added a mock Dictionary of Protein Name Pairs, on the left the Protein Names in your sentences, on the right the new names to use as replacement.
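
For readers without the workflow at hand, the idea of a dictionary of name pairs applied one pair at a time can be sketched outside KNIME. This is only an illustration in Python, not the attached workflow: the function name `replace_all`, the variable `NAME_PAIRS`, and the sample string (including the extra entry `TP53;`) are assumptions made up for the example; the two name pairs come from the question.

```python
import re

# Hypothetical dictionary of name pairs: old name on the left, new name on the right.
# The two entries are the pairs from the original question.
NAME_PAIRS = {
    "NUP107;": "X0X033;",
    "ATF2;": "XOXO44;",
}


def replace_all(text, pairs):
    """Apply every (old, new) pair in turn, one pass over the text per pair."""
    for old, new in pairs.items():
        # re.escape makes the old name match as literal text, not as a pattern.
        text = re.sub(re.escape(old), new, text)
    return text


# Invented sample cell content, for illustration only.
sample = "NUP107;ATF2;TP53;"
result = replace_all(sample, NAME_PAIRS)
print(result)
```

Each pass plays the role of one loop iteration: the output of one replacement is the input of the next, so adding a name pair only means adding a dictionary entry.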

The new added solution is as follows:

And the full new workflow is here below:

20210919 Pikairos Several replaces in the String Manipulation node.knwf (239.6 KB)

I would be glad to hear if somebody else comes up with a more efficient solution implemented as a simpler workflow.

Maybe this solution is not exactly what you want to implement, but I believe it should not be far from it.

Hope this helps.

Best,

Ael

Hi @aworker,

it was not really so much related to the last question. I was using your first solution successfully after some modifications.

My question was more fundamental. I wanted to know whether the String Manipulation node can actually read two lines of code that modify two different protein names at the same time. See the example in the attached figures.


Thanks for your help,

AG.

Hi @VAGR_ISK

I’m not in front of my computer right now, so I am answering from my smartphone.

To answer quickly: you can only have one expression in the String Manipulation node.

The only way to do a “double” regex replacement would be to nest the two regex functions like this: regexReplace(regexReplace(x, y, z), y', z')
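
Applied to the two replacements from the question, that single expression would presumably read: regexReplace(regexReplace($Submitted entities found$, "NUP107;", "X0X033;"), "ATF2;", "XOXO44;"). The same composition can be tried outside KNIME with a small Python sketch. Here `regex_replace` is a hypothetical stand-in for KNIME's regexReplace, built on Python's `re.sub`, and the sample cell value is invented for the example.

```python
import re


def regex_replace(text, pattern, replacement):
    """Hypothetical stand-in for regexReplace(str, regex, replaceStr)."""
    return re.sub(pattern, replacement, text)


# Invented sample cell value, for illustration only.
cell = "NUP107;ATF2;"

# The inner call runs first; its result becomes the input of the outer call.
nested = regex_replace(
    regex_replace(cell, "NUP107;", "X0X033;"),
    "ATF2;",
    "XOXO44;",
)
print(nested)
```

Every extra replacement adds one more level of nesting, which stays readable for two or three names but becomes unwieldy for a long list.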

Hope this helps.

Best,

Ael

Oh ok.

I wanted to be sure.

Thanks a lot.

My pleasure. By the way, the solution I suggested here, leaving aside its first part, takes a dictionary of word pairs and replaces the words in your sentences. It is therefore a generic version of what you want to do with several regex replacements. In my dictionary, every pair of words corresponds to one of your regexReplace calls. You would just need to create your own dictionary based on your regex needs.

Hope this helps.

Best

Ael

Oh Ok,

I will check it tomorrow and come back to you.

Very cool community at KNIME! It is also a really great program, but there are things that would definitely be easier if I were a programmer ;).

Cheers,

AG

You can still be a programmer when using KNIME. You can either use Java snippet nodes, if you are a Java programmer, or other integration nodes, like Python or R.

Hi @Experimenter,

I know that KNIME has all these integrations. However, my comment is more related to the fact that I am not a programmer.

For example, the Regex and PMM “languages” are extremely complex to me, but knowing the right syntax is crucial for advanced data filtering.

I assume that programmers know Regex and PMM well, and that it is therefore easier for them to handle complex data filtering than it is for me.

I would very much appreciate a Regex and PMM translator, so that I could write what I need and have the translator tell me the correct syntax.

Cheers,

   AG

In fact, at least in my experience, few programmers know regular expressions well. :) I would also not call regexes a programming language, as they are just a way to encode character patterns. They are highly recommended, though, as regexes are the ultimate tool when one needs to look up, match, or replace text content.
Actually, regular expressions are not so complicated once one understands what they consist of… :)
