Create a component for step and corresponding key column for analysis

Hi dear KNIME power users!

I want to create a workflow for the function below.


My data has 4 steps (step 1, 2, 3, 4) and columns for the 0-800 nm wavelengths. For each step and each wavelength, values are recorded at time intervals, as you can see in the picture. For step 1 I want to filter out the wavelengths at 251, 277, 488 nm; for step 2, the wavelengths at 290, 457, 676 nm; for step 3, the wavelengths at 111, 333, 444, 566 nm.
How can I do that? After that, I would like the output tables to be combined side by side as columns, and to calculate the mean for each step.

Hi @Turkeyhazel,

I would try to solve this with the following steps:

a) unpivot your data
b) steps 1-4: apply a filter node such as Rule-based Row Filter, Reference Row Filter or Java Snippet Row Filter
c) concatenate your filtered data
d) use e.g. the Column Expressions node to create the string for your column header based on the data
e) pivot the data with the required aggregation
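
Since Python came up in this thread, here is a minimal plain-Python sketch of the logic behind steps b) to e), assuming the data has already been unpivoted into long format (step a). It is not a KNIME workflow, only an illustration of what the nodes would do. The row layout, the header pattern `Step<n>_<wavelength>` and all values are invented for the example; the wavelength lists come from the question, and step 4 is left out because no wavelengths were given for it.

```python
from collections import defaultdict
from statistics import mean

# Wavelengths of interest per step, taken from the question.
# Step 4 is not specified there, so it has no entry here.
WANTED = {
    1: {"251.0nm", "277.0nm", "488.0nm"},
    2: {"290.0nm", "457.0nm", "676.0nm"},
    3: {"111.0nm", "333.0nm", "444.0nm", "566.0nm"},
}

# Hypothetical long-format rows: (Step, Time, Columnnames, Value).
# All values are invented for illustration.
rows = [
    (1, 0, "251.0nm", 0.25),
    (1, 1, "251.0nm", 0.75),
    (1, 0, "300.0nm", 9.0),   # not in the step 1 list
    (2, 0, "290.0nm", 1.0),
    (2, 1, "290.0nm", 2.0),
    (2, 0, "251.0nm", 9.0),   # only listed for step 1
]


def keep(step, column_name):
    """Steps b) and c): one lookup covers all steps, so no concatenation is needed."""
    return column_name in WANTED.get(step, set())


filtered = [row for row in rows if keep(row[0], row[2])]

# Step d): build a header string per row; step e): aggregate values per header.
groups = defaultdict(list)
for step, _time, column_name, value in filtered:
    header = f"Step{step}_{column_name}"  # hypothetical naming pattern
    groups[header].append(value)

step_means = {header: mean(values) for header, values in groups.items()}
print(step_means)
```

In KNIME the same grouping would be done by the pivoting step with mean as the aggregation method.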

BR


Hi Morpheus, thanks for your reply. Do you have experience with this, or maybe a demo of the component part for this function? Just the filter function and concatenating the results into one table would be enough.
KNIME_Community Questions.knwf (18.4 KB)
I added a Table Creator as an example; maybe that makes the question clearer.

Can you help with the Rule-based Row Filter expression? I can write Python code but am not familiar with Java, so I would appreciate your help. Thanks in advance.

Just a small example,

BR

Forgot to mention:
depending on your further data processing, you could apply all filters within one node and would not need to concatenate your data.

BR

Hi @morpheus, for the rules in your screenshot: since all 3 share the common condition $Step$ = 1, it’s more efficient to write them as a single rule like this:
$Step$ = 1 AND $Columnnames$ IN ("251.0nm", "277.0nm", "488.0nm") => TRUE

It is more efficient because KNIME then has 1 rule to validate instead of 3. With 3 rules, it has to evaluate $Step$ = 1 three times instead of once.

Also, since an IN statement acts as an OR (the IN statement can be written as $Columnnames$ = "251.0nm" OR $Columnnames$ = "277.0nm" OR $Columnnames$ = "488.0nm"), it’s essentially a “short-circuit” operator: as soon as one of the conditions is met, the remaining ones do not have to be evaluated. For example, if $Columnnames$ = "251.0nm", it does not need to check whether $Columnnames$ = "277.0nm" OR $Columnnames$ = "488.0nm", whereas in your version it will still go on to evaluate the other 2 conditions.
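
For those more at home in Python than in the rule syntax, the two formulations can be written side by side as a rough analogy. This only illustrates that they are logically equivalent and how Python's own `and`/`or`/`in` short-circuit; it makes no claim about how KNIME evaluates rules internally. The function names are made up, and the three-rule form is an assumption about what the screenshot contained.

```python
STEP1_WAVELENGTHS = ("251.0nm", "277.0nm", "488.0nm")


def combined_rule(step, column_name):
    """Single rule: the step is checked once, then one membership test."""
    # `and` short-circuits: the membership test is skipped when step != 1.
    return step == 1 and column_name in STEP1_WAVELENGTHS


def three_rules(step, column_name):
    """Assumed three-rule form: the step condition is repeated in every rule."""
    # `or` short-circuits too: later clauses run only if earlier ones fail.
    return (
        (step == 1 and column_name == "251.0nm")
        or (step == 1 and column_name == "277.0nm")
        or (step == 1 and column_name == "488.0nm")
    )


# Both forms select exactly the same rows.
for step in (1, 2):
    for name in ("251.0nm", "277.0nm", "488.0nm", "300.0nm"):
        assert combined_rule(step, name) == three_rules(step, name)
```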


I am always interested in performance, so when you say more efficient, is this related to writing the statement or to real performance in terms of run time?
Thanks

Hi @Daniel_Weikert, the performance I was referring to is about running time, not about writing the statement :slight_smile:

I kind of explained what the performance difference is in my post :slight_smile:

Thanks,
to me it is all about numbers, so I am interested in something like “1 min vs. 5 seconds” to get a feeling for the difference. I am still looking for a comprehensive performance guide for KNIME. I hope there will be some kind of documentation in the future.
br and take care

Could you please help with the Unpivoting node? How can I unpivot my table like yours, i.e. turn the 200-800 nm wavelength columns into a single column that holds the column names, as yours shows?

You should keep your key columns (e.g. ID, TIME, STEP, SEQUENCE, …) as retained columns. As a result, you get one record for each value and wavelength.
Based on the resulting table, you can do any aggregation, depending on the grouping keys you choose.
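
To make the idea concrete outside of KNIME, here is a small plain-Python sketch of what unpivoting with retained key columns does, followed by one possible aggregation. The key columns `Step` and `Time`, the output column name `Value` and all numbers are assumptions for illustration; only `Columnnames` and the `251.0nm` style of wavelength labels come from this thread.

```python
from collections import defaultdict
from statistics import mean

KEY_COLUMNS = ("Step", "Time")  # hypothetical key columns to retain

# Hypothetical wide table: one column per wavelength, invented values.
wide_rows = [
    {"Step": 1, "Time": 0, "251.0nm": 0.25, "277.0nm": 1.0},
    {"Step": 1, "Time": 1, "251.0nm": 0.75, "277.0nm": 3.0},
    {"Step": 2, "Time": 0, "251.0nm": 2.0, "277.0nm": 4.0},
]


def unpivot(rows, key_columns):
    """Turn every non-key column into its own record, repeating the key columns."""
    long_rows = []
    for row in rows:
        keys = {name: row[name] for name in key_columns}
        for name, value in row.items():
            if name not in key_columns:
                long_rows.append({**keys, "Columnnames": name, "Value": value})
    return long_rows


long_rows = unpivot(wide_rows, KEY_COLUMNS)

# Example aggregation: mean per (Step, wavelength) as the grouping key.
groups = defaultdict(list)
for record in long_rows:
    groups[(record["Step"], record["Columnnames"])].append(record["Value"])

means = {key: mean(values) for key, values in groups.items()}
print(means)
```

Choosing a different grouping key, for example the step alone, only changes the tuple used as the dictionary key.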

BR
