Identify prefix in a column

Hi,

In my table, I have a column which contains sample names. These sample names have a common prefix which varies from one experiment to another. I need to remove this prefix. Is there a node which can identify that prefix for me?

Thanks for your help,
Regards,
Claire

hi @Claire, is there any common rule regarding the prefix?

  • number of characters at the beginning of the string?
  • delimiter “-”, “_”, “ ” or other?

If so, the String Manipulation node would be the solution.
If not, it gets more complicated (Cell Splitter, Constant Value Column Filter, …)

Greetz, Tommy

Hi Tommy,
Unfortunately no. I thought about using a delimiter but users are not following consistent rules and the same delimiter can be used both in the common prefix and then in the remaining part of the sample name.

Cheers,
Claire

What is the length of the prefix? Is it always the same?

From one experiment to another, no. But within the same column, yes.

Cheers,
Claire

hi @Claire

so you have to generate all possible prefixes as substrings and check if the value column starts with the corresponding substring. Then you have to choose the minimum of the matching substring. That would be your prefix.

Please have a look at the following HUB workflow:


In my example the max. length of the prefix would be 10 (= number of loops). You may increase or decrease that number. You have to re-run the prefix simulation (top part) for each of your experiments.
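For anyone who prefers a scripted version (for example in a Python Script node), here is a minimal sketch of one reading of the approach above: build candidate prefixes of growing length from the first value, keep the longest one that every value in the column starts with, then strip it. The function names `common_prefix` and `strip_prefix`, the `max_len` parameter and the sample names are made up for illustration and are not taken from the workflow.

```python
def common_prefix(values, max_len=10):
    """Return the longest prefix (at most max_len characters) shared by all values.

    max_len mirrors the "number of loops" idea; it is an illustrative parameter.
    """
    if not values:
        return ""
    first = values[0]
    prefix = ""
    # Try candidate prefixes of length 1, 2, ... taken from the first value.
    for n in range(1, min(max_len, len(first)) + 1):
        candidate = first[:n]
        if all(v.startswith(candidate) for v in values):
            prefix = candidate  # every value still matches, keep the longer one
        else:
            break  # a longer candidate cannot match if this one does not
    return prefix


def strip_prefix(values, max_len=10):
    """Remove the detected common prefix from every value."""
    prefix = common_prefix(values, max_len)
    return [v[len(prefix):] for v in values]


# Hypothetical sample names from one experiment.
samples = ["EXP01_A-1", "EXP01_B-2", "EXP01_C-3"]
print(common_prefix(samples))
print(strip_prefix(samples))
```

Note that this only finds what all values share, so if every remaining sample name happens to start with the same character, that character is treated as part of the prefix as well. Python's standard library also offers `os.path.commonprefix`, which does the same character-by-character comparison without a length limit.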

Hope that helps, Greetz, Tommy

Hi Tommy,

Many thanks for your example. It’s great.

Cheers,
Claire
