Finding rows by pattern

#1

Hello folks,

I joined the KNIME universe this spring in Berlin and I am currently digging into the whole thing.
And my problems do not seem to end. I have found a lot of solutions already, but I cannot get this one working.

I got a table with analyses for batches.
It looks like the following:

Date / Batch / Sample Number / Element 1 / … / Element n

I got several samples for each batch, usually looking like this:
1
2
3
14
15
20
21
22
31

The single digit samples are for my first unit, the two digits starting with 1 (14, 15…) are for the second unit, the 20s are for the third and 31 is the final sample.

Problem: They are not always the same. It could start with a 2 or 3 for example, have between 2 and 6 samples on the unit 2 and so on.
The first sample of unit 2 is made with the pattern “(largest sample number unit 1) + 11” so after 4 follows sample 15.
Hope I made this clear.
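
To illustrate the numbering rule, here is a minimal Python sketch. It only assumes the "+ 11" pattern described above; the function name `first_unit2_sample` is made up for this example and is not a KNIME function.

```python
def first_unit2_sample(unit1_samples):
    """Return the expected first sample number of unit 2.

    Assumption: the rule quoted in the post,
    (largest sample number of unit 1) + 11.
    """
    return max(unit1_samples) + 11


# Unit 1 ends at 3, so unit 2 starts at 14 (as in the list above).
print(first_unit2_sample([1, 2, 3]))
# Unit 1 ends at 4, so unit 2 starts at 15.
print(first_unit2_sample([2, 3, 4]))
```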

And now the problem:
I need to filter the table for the following criteria:

  • first sample for each batch (lowest sample number).
  • first sample of unit 2 (lowest sample number >10)
  • last sample of unit 2 (highest sample number <20)
  • first sample of unit 3 (usually 20 but make it lowest sample number >=20).
  • final sample (31)
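
To make the criteria concrete, here is a rough Python sketch of the selection for the sample numbers of a single batch. It takes the five rules literally as written above; the function name `pick_samples` and the dictionary keys are hypothetical, and in KNIME the same logic would have to be built from nodes rather than code.

```python
def pick_samples(sample_numbers):
    """Pick the sample numbers matching the five criteria for one batch.

    Returns a dict; a value is None when the batch has no matching sample.
    """
    nums = sorted(set(sample_numbers))
    unit2 = [n for n in nums if 10 < n < 20]   # ">10" and "<20" as stated
    from_20 = [n for n in nums if n >= 20]     # ">=20" as stated, no upper bound
    return {
        "first_of_batch": nums[0] if nums else None,
        "first_unit2": min(unit2) if unit2 else None,
        "last_unit2": max(unit2) if unit2 else None,
        "first_unit3": min(from_20) if from_20 else None,
        "final": 31 if 31 in nums else None,
    }


picks = pick_samples([1, 2, 3, 14, 15, 20, 21, 22, 31])
print(picks)
```

Note that with the ">=20" rule taken literally, a batch without any sample in the 20s would report 31 as the first sample of unit 3; an upper bound of 30 would avoid that if it is not wanted.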

Hope someone can help me.
I guess the solution is simple, but somehow I do not get it working for me.

Thanks in Advance!

Regards
Michael

0 Likes

#2

Hi @Shuya and a warm welcome to the KNIME forum! :slight_smile:

If I understood your issue well, I believe indeed that the solution is simple :wink:

In a first step, you need to split your column into two columns (String Manipulation or Cell Splitter node) and then apply a Rule-Based Row Filter/Splitter or a Rule Engine node, depending on the use case you have at hand… :wink:

Best,
Alec

1 Like

#3

Hi,

As far as I understood, the range for each unit is 10. So the possible sample numbers for the first unit are between 0 and 9, those for the second unit are between 10 and 19, and so on. If that is right, then below is my suggestion:

I put the rows in bins (unit ranges) and then calculated the min and the max values for each bin. Then filtered these values and finally removed the max value of all bins except the 2nd one as it was mentioned in your list of desired values. You can change the filtering phase as you wish.
Please check the workflow and let me know if it’s what you want:
min-max units.knwf (32.2 KB)
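
For readers who cannot open the workflow, the idea can be sketched in a few lines of Python. This is only an illustration of the binning approach described above, not a copy of the attached workflow; the function name `min_max_per_unit` and the parameter `keep_max_for` are hypothetical.

```python
from collections import defaultdict


def min_max_per_unit(sample_numbers, keep_max_for=(1,)):
    """Bin sample numbers into ranges of 10 and keep selected extremes.

    Bin index 0 covers 0-9 (unit 1), index 1 covers 10-19 (unit 2), etc.
    The minimum of every bin is kept; the maximum is kept only for the
    bin indices listed in keep_max_for (by default the second unit).
    """
    bins = defaultdict(list)
    for n in sample_numbers:
        bins[n // 10].append(n)

    selected = []
    for unit_index in sorted(bins):
        values = bins[unit_index]
        selected.append(min(values))
        # Avoid listing the same sample twice when a bin has one value.
        if unit_index in keep_max_for and max(values) != min(values):
            selected.append(max(values))
    return selected


print(min_max_per_unit([1, 2, 3, 14, 15, 20, 21, 22, 31]))
```

Changing `keep_max_for` corresponds to adjusting the filtering phase mentioned above.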

Best,
Armin

1 Like

#4

Hi there Michael!

Glad you found a lot of solutions already. Hopefully with the help of this forum :slight_smile:

Does the “+ 11” rule mean that after 9 comes 20?

Hope you will find more solutions here :wink:

Br,
Ivan

1 Like

#5

Thanks folks for your fast answers. I was busy the last few days; I will go through your suggestions and report what worked for me!

0 Likes