Inquiry about Outlier Removal node

hhkim · November 4, 2025, 1:29am

Hello.

I was wondering what the “Group measurements by” option of the Outlier Removal node means.

I would really appreciate an explanation with a simple example.

Best regards,

hhkim

ActionAndi · November 4, 2025, 5:30am

Hi,

In this node, a value is identified as an outlier if it lies more than x standard deviations away from the mean (option: Mean ± SD) or more than x times the interquartile range outside the quartiles (option: Boxplot).

The group setting applies this calculation separately to each group, so that outliers are identified and removed relative to their own group.
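
To make the difference between global and per-group detection concrete, here is a minimal Python sketch of the mean ± SD rule only. It is not the node's actual implementation; the function names, the threshold `k = 2` and the sample data are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

def remove_outliers(values, k=2.0):
    """Keep values within mean ± k * SD (sample SD, needs at least 2 values)."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]

def remove_outliers_grouped(rows, k=2.0):
    """Apply the same rule separately to each group of (group, value) rows."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {g: remove_outliers(vals, k) for g, vals in groups.items()}

# Hypothetical measurements: group A is around 10, group B around 100.
rows = [("A", v) for v in [10, 11, 9, 10, 10, 11, 9, 10, 30]] + \
       [("B", v) for v in [100, 101, 99, 100, 102, 98, 100, 101, 99]]

# Without grouping, 30 sits between the two clusters and is kept.
global_kept = remove_outliers([v for _, v in rows])

# With grouping, 30 is far from the other values in group A and is removed.
grouped = remove_outliers_grouped(rows)
```

In this made-up dataset the value 30 survives the global check but is dropped once the statistics are computed per group.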

Is there a particular reason why you use this node? I personally work with the standard “Numeric Outliers” node, which has more options for outlier handling.

hhkim · November 4, 2025, 5:36am

Thank you for the reply. I have two follow-up questions:

1. Does this mean outliers are removed within each group defined by the columns selected under the “Group measurements by” option, i.e., similar to applying a GROUP BY and then removing outliers per group?
2. In the Numeric Outlier node, how can I configure it to remove outliers based on standard deviation rather than IQR?

Best regards,

hhkim

ActionAndi · November 4, 2025, 7:20am

1. I’m not sure, but I would say so. Maybe you can create a test dataset to find out.
2. You’re right, I hadn’t noticed that a z-score-based method is not part of that node. I usually calculate these scores manually with the “Math Formula” node:

z_score = ($Col$ - COL_MEAN($Col$)) / COL_STDDEV($Col$)

and then check whether the resulting z-scores are reasonably distributed.

With z-score filtering, keep in mind that it assumes your data is normally distributed. Especially with a small number of values (per group), this can be challenging.
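
One way to see the small-sample problem is the following Python sketch (again not KNIME code; the data and the function name are invented for illustration). When the sample standard deviation is used, the largest possible absolute z-score in a group of n values is (n - 1) / sqrt(n), so in a group of 5 values even a wildly extreme point cannot reach a threshold of 3.

```python
from math import sqrt
from statistics import mean, stdev

def z_scores(values):
    """Z-scores based on the sample standard deviation (n - 1 denominator)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical group of five values with one obviously extreme entry.
small_group = [10, 11, 9, 10, 1000]

zs = z_scores(small_group)
max_abs_z = max(abs(z) for z in zs)

# Upper bound on |z| for a sample of size n when using the sample SD.
n = len(small_group)
bound = (n - 1) / sqrt(n)

print(f"largest |z| = {max_abs_z:.3f}, bound for n={n}: {bound:.3f}")
```

Here the value 1000 gets a z-score of roughly 1.79, so a cutoff of 3 would not flag it. For small groups, a lower threshold or the boxplot (IQR) rule may be worth testing on your own data.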