Calculate the median of a row without transposing first?


#1

I have a table with 69 columns and 58,000 rows. I would like to calculate the median of each row but I can’t find a way to do this without first using transpose to get my columns into rows that I can use GroupBy to calculate medians of. Is there any other option - preferably one that won’t require me to transpose first (only to then re-transpose after)?


#2

Hello @lparsons42 -

You could use the median() function in the Math Formula node. Does that work for your case?


#3

I’m not clear on the syntax for the median() function in Math Formula; how do it set it to calculate the median of a row?


#4

In my case, all the columns who I want to be figured into the calculation for median have “intensity” in their name. I tried
median(“intensity”)
and it returned a syntax error. (i had asterisks on either side of intensity, inboard of the quotation)


#5

You’ll have to enter the name of each column explicitly, as I don’t believe wildcards are supported. Here’s an example for calculating the median of four columns:

It will be quite a large expression in your case, but should work. It might be easier to cobble it together in a text editor like Notepad++ and paste it in.


#6

Is there any way that I could set this up for a future case where I won’t necessarily know the number of columns beforehand? I tried to get it into a flow variable list, but that doesn’t seem to work as it still wants (at least) three comma-separated values even if one is a list of values. I tried then following the list with “NaN” (twice) and that ran but returned all invalid medians.


#7

If you’d like to post your workflow that uses a flow variable list, I can take a look.


#8

Thank you for your help! I eventually realized this was easier (for me) to do with an R snippet.


#9

I think Column Aggregator could be useful as well.