I have a table with 69 columns and 58,000 rows. I would like to calculate the median of each row but I can’t find a way to do this without first using transpose to get my columns into rows that I can use GroupBy to calculate medians of. Is there any other option - preferably one that won’t require me to transpose first (only to then re-transpose after)?
Hello @lparsons42 -
You could use the median() function in the Math Formula node. Does that work for your case?
I’m not clear on the syntax for the median() function in Math Formula; how do it set it to calculate the median of a row?
In my case, all the columns who I want to be figured into the calculation for median have “intensity” in their name. I tried
median(“intensity”)
and it returned a syntax error. (i had asterisks on either side of intensity, inboard of the quotation)
You’ll have to enter the name of each column explicitly, as I don’t believe wildcards are supported. Here’s an example for calculating the median of four columns:
It will be quite a large expression in your case, but should work. It might be easier to cobble it together in a text editor like Notepad++ and paste it in.
Is there any way that I could set this up for a future case where I won’t necessarily know the number of columns beforehand? I tried to get it into a flow variable list, but that doesn’t seem to work as it still wants (at least) three comma-separated values even if one is a list of values. I tried then following the list with “NaN” (twice) and that ran but returned all invalid medians.
If you’d like to post your workflow that uses a flow variable list, I can take a look.
Thank you for your help! I eventually realized this was easier (for me) to do with an R snippet.
I think Column Aggregator could be useful as well.