Replace specific characters in string according to reference

Hey all,

I am a total beginner in using KNIME and hope to find some help here. I am having trouble replacing a specific character in my string with a character given by a reference string. In detail: I want to replace every "N" in my text with the corresponding letter (G, A, T or C) found at the same position in the reference text, while all other letters remain unchanged.
Here is an example:
[image: knime]

What I have tried so far is:

  1. String Replacer - separate each letter of string with ,
  2. Cell Splitter - split every letter in separate column by , as delimiter
  3. String Manipulator - replace in split column every N with correct letter

It works in principle, but I have to configure a String Manipulator for every single letter/split column, and of course I have to look up the correct letter at the same position in the reference by hand.

Is there a way to make this easier, without having to address every single letter of my string in a separate step and enter the correct letter manually?

Many thanks in advance for your help!

Best,
Meli

Hi Meli,
While I believe this can be done with plain KNIME nodes, I also think it is more efficient to use a Column Expressions node with some light coding. Please find an example workflow for your use case attached.

The code is the following:

column("Sequences")
.split("")
.map(function(c, i) {
    return c == "N" ? variable("Reference").charAt(i) : c;
})
.join("")

In the second line the Sequences column is split into individual characters. A mapping function is then applied to each element of the resulting array (c is the character and i is its index). The function checks whether c equals N; if so, it returns the character at the same position in the reference string instead of c. Otherwise, c is returned unchanged. This yields an array of characters in which every N has been replaced by its counterpart in the reference. Finally, join("") concatenates all characters back into a single string. The "" is necessary here because the default separator is a comma.
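If you want to try the logic outside of KNIME first, here is a minimal sketch in plain JavaScript. The function name fillFromReference and the sample strings are made up for illustration; inside the Column Expressions node you would keep using column("Sequences") and variable("Reference") as shown above.

```javascript
// Hypothetical helper that mirrors the expression above in plain JavaScript.
// sequence:  the string that may contain "N" placeholders
// reference: the string that supplies the replacement letters
function fillFromReference(sequence, reference) {
  return sequence
    .split("")                 // one array element per character
    .map(function (c, i) {
      // Replace "N" with the reference character at the same index,
      // keep every other character as it is.
      return c === "N" ? reference.charAt(i) : c;
    })
    .join("");                 // "" avoids the default comma separator
}

// Made-up sample data: only the two N positions are taken from the reference.
console.log(fillFromReference("TNGANC", "ACGTAC")); // prints "TCGAAC"
```

One thing to keep in mind: charAt returns an empty string for an index past the end of the reference, so if the reference is shorter than the sequence, an N beyond that point is simply dropped.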

Replacement.knwf (11.8 KB)


Hi Alexander,
Thank you very much for your fast reply!
I tried your suggested workflow and it works perfectly :)
Thank you very much for preparing the example workflow and your nice and detailed description (it helped me a lot to understand the coding).
Best,
Meli

