Pattern Count - Bioinformatics

Greetings,

I am learning bioinformatics and plan to use KNIME to solve computational problems, since I want to focus on the concepts rather than on programming and I am already familiar with KNIME. Do you think this is a good idea?

My first problem is counting how often a pattern occurs in a DNA sequence. For example, I want to know how many times the pattern ACTAT occurs in the string ACAACTATGCATACTATCGGGAACTATCCT; the answer should be 3. As you can see, there are no spaces between the letters at all! Is there a node that can perform this pattern-counting task?
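For reference, outside KNIME the counting logic itself is small. The following is a minimal Python sketch, not a KNIME node: the function name `pattern_count` is made up for illustration. It slides a window the length of the pattern across the sequence and compares at every start position, so overlapping occurrences are counted too.

```python
def pattern_count(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones.

    Hypothetical helper for illustration; not part of KNIME or any library.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    k = len(pattern)
    # Compare the k-letter window starting at every position i with the pattern.
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == pattern)


# The example from the post: ACTAT occurs at (0-based) positions 3, 12 and 22.
print(pattern_count("ACAACTATGCATACTATCGGGAACTATCCT", "ACTAT"))  # 3
```

If the pattern is longer than the sequence, the range is empty and the function returns 0.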

In other cases I want to find the most frequent pattern of a given length, for example 4 letters. How can I do that?
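As a rough illustration of the "most frequent pattern of length k" task, here is a minimal Python sketch. It assumes that every k-letter window of the sequence counts as a candidate and that ties should all be reported; the name `most_frequent_kmers` and the test sequence are chosen for illustration and do not come from the workflows in this thread.

```python
from collections import Counter


def most_frequent_kmers(text: str, k: int) -> tuple[list[str], int]:
    """Return (sorted list of the most frequent k-letter substrings, their count).

    Hypothetical helper for illustration. Overlapping windows are all counted.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Tally every window of length k; there are len(text) - k + 1 of them.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:
        return [], 0  # the sequence is shorter than k
    best = max(counts.values())
    winners = sorted(kmer for kmer, n in counts.items() if n == best)
    return winners, best


# Illustrative sequence (not from the thread): CATG and GCAT each occur 3 times.
print(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))
```

Returning all tied patterns rather than a single one avoids silently hiding equally frequent k-mers.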

Thanks in advance

Regards

Greetings @Mutaz

The following thread solves a very similar problem:

Hope it helps.

Best

Ael

Hey

Thank you for your help, but the workflow's result always seems to be lower than the actual count. Maybe it does not count overlapping occurrences. Is there another way to count, or can the workflow be extended to count overlapping substrings too?
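To illustrate why a count can come out too low, here is a small Python sketch of the difference between non-overlapping and overlapping counting. It says nothing about how the linked workflow is actually built, which is only a guess in the post above. The function names and the ATATATA example are made up for illustration. Python's `str.count` skips past each match, whereas a regex lookahead `(?=...)` consumes no characters and can therefore match at every start position.

```python
import re


def count_non_overlapping(text: str, pattern: str) -> int:
    # str.count resumes searching after the end of each match,
    # so occurrences that share letters are missed.
    return text.count(pattern)


def count_overlapping(text: str, pattern: str) -> int:
    if not pattern:
        raise ValueError("pattern must not be empty")
    # A lookahead is zero-width: it matches at each position where the
    # pattern starts without consuming it, so overlaps are included.
    return sum(1 for _ in re.finditer(f"(?={re.escape(pattern)})", text))


# ATA starts at positions 0, 2 and 4 of ATATATA, and these occurrences overlap.
print(count_non_overlapping("ATATATA", "ATA"))  # 2
print(count_overlapping("ATATATA", "ATA"))      # 3
```

For the ACTAT example from the first post both functions return 3, because those three occurrences do not overlap.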

Regards

Hey

I built a workflow and uploaded it to the KNIME Hub at the following link: Pattern Count - Bioinformatics – KNIME Hub

The workflow solves the overlap issue and includes overlapping occurrences in the count. I am sure it can be improved, since the user has to enter 4 inputs to get the answer.

Regards

Hello @Mutaz,

Here is a loopless (faster) workflow example that needs no user input and follows your logic:
Pattern Count - Bioinformatics_ipazin.knwf (44.2 KB)

Br,
Ivan
