I know similar things have been asked before, and I believe I could get close to what I want by looping through every row and extracting a subset of rows based on date, but there has to be a better way.
I have a table sorted by time/date with a key result column. I would like to calculate something similar to a stochastic oscillator, where I determine where the value (V) in the result column for the current row (r) stands relative to the last n results (or the results in the last n days).
Standing = ( V(r) - MIN(V(r)...V(r-n)) ) / ( MAX(V(r)...V(r-n)) - MIN(V(r)...V(r-n)) )
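To make the calculation concrete, here is a sketch of it in pandas (the DataFrame and column names are illustrative, not from my actual data), using a rolling window for the min and max over the last n rows:

```python
import pandas as pd

# Illustrative date-sorted data; "result" stands in for the key result column
df = pd.DataFrame({"result": [3.0, 5.0, 4.0, 8.0, 6.0, 2.0, 7.0]})

n = 3  # lookback window, counted in rows and including the current row
lo = df["result"].rolling(window=n, min_periods=n).min()
hi = df["result"].rolling(window=n, min_periods=n).max()

# Standing = (V - min) / (max - min); NaN where the window is incomplete
# or where max == min
df["standing"] = (df["result"] - lo) / (hi - lo)
print(df)
```

The first n-1 rows come out as NaN because the window is not yet full, which matches wanting the standing only once n prior results exist.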
I was hoping there might be something like the moving average node that would calculate the minimum and maximum for the last n rows but cannot find anything.
Does anyone have a suggestion of how to calculate this efficiently?
Would it be possible to use group by to report the min and max for each group?
Thanks for the idea, I had not thought about using grouping.
However, I looked at the node and cannot see how I could use it to group and aggregate a range of rows/dates, since the values are all unique and so would not group. Even if I could calculate some kind of group column, I need the max and min of the previous n rows for every row, so the groups overlap and are not themselves unique.
Did I miss something that can be done with grouping?
I got the impression you wanted to analyse different time periods (dates).
If you wanted to do this based on a specific day, you could group on a date column, e.g. all 07/01/2013 entries would group together. You could then use a grouping method to get the min and max of each group.
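As a sketch of what I meant, in pandas terms (the data and column names are made up for illustration):

```python
import pandas as pd

# Illustrative data: several result entries per day
df = pd.DataFrame({
    "date": ["07/01/2013", "07/01/2013", "08/01/2013", "08/01/2013"],
    "result": [3.0, 5.0, 4.0, 8.0],
})

# Group all rows sharing a date, then aggregate min and max per group
agg = df.groupby("date")["result"].agg(["min", "max"])
print(agg)
```

Each date then yields one row with its min and max, which only helps if a whole day is the unit you want to aggregate over.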
However it does not seem like this is how you wanted to group the data / how the data is represented so it won't work.
Without knowing more details, I would likely generate a dictionary to define the start and end values for each group and use a table row to variable loop with a filter. But this will likely not be the most efficient method.
Maybe the following workflow helps solve your problem.
Thanks. That is really helpful. I was worried that doing it this way would not be efficient (I have thousands of rows and want to do multiple lookbacks of 5, 10, 30, 100 and 200 days).
However, running your workflow it seems pretty quick, so I am working on moving the number of lookback rows into a variable and renaming the output columns based on the lookback, so that I can collapse it and use it multiple times.
I wanted to do the calculation over a range of dates (from the current row back to the current row's date minus 5, 10, 30, 100 and 200 days). So grouping on the date itself does not work, but looping through and assigning groups before doing the grouping runs quickly enough and gives me access to all of the aggregation that I need.
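For what it's worth, the parameterised multi-lookback version can be sketched in pandas like this (small windows and column names are illustrative; in my real data the list would be 5, 10, 30, 100 and 200):

```python
import pandas as pd

df = pd.DataFrame({"result": [3.0, 5.0, 4.0, 8.0, 6.0, 2.0, 7.0, 1.0, 9.0, 6.0]})

# One standing column per lookback; the column name carries the window size.
# (With a DatetimeIndex, pandas also accepts offset strings such as "30D"
# as the window, which gives the days-based rather than rows-based lookback.)
for n in (3, 5):
    lo = df["result"].rolling(window=n, min_periods=n).min()
    hi = df["result"].rolling(window=n, min_periods=n).max()
    df[f"standing_{n}"] = (df["result"] - lo) / (hi - lo)
print(df)
```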
I appreciate you taking the time to answer twice and hope I can return the favour some time.