Outlier Detection in Medical Claims

This workflow identifies outliers in medical claims data, such as claims with an unusually high cost for a certain disease. First, the input data is grouped by the target column (disease). Second, the interquartile range (IQR), i.e. the difference between the 3rd and 1st quartiles (Q3 - Q1), is computed within each group for the numerical column in question (cost). Outliers are all records whose value lies outside the permitted interval [Q1 - k*IQR, Q3 + k*IQR], where the factor k is specified by the analyst. The target and numerical columns can be defined in the configuration dialogs of the components. The lower branch of the workflow refines this approach and allows for identifying outliers across several target columns, e.g. an unusually high or low number of days stayed for a certain disease and payment amount.
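
For readers who want to check the logic outside of KNIME, the grouping and IQR steps can be sketched in plain Python. This is only an illustration, not the workflow's actual implementation: the function name `iqr_outliers`, the sample records, and the default k = 1.5 are assumptions, and the quartile estimation method used here (`method="inclusive"`) may differ from the one the KNIME components use, so borderline records could be classified differently.

```python
from collections import defaultdict
from statistics import quantiles


def iqr_outliers(records, target, numeric, k=1.5):
    """Return records whose `numeric` value lies outside
    [Q1 - k*IQR, Q3 + k*IQR] within their `target` group.

    Hypothetical helper for illustration only.
    """
    # Step 1: group the records by the target column (e.g. disease).
    groups = defaultdict(list)
    for rec in records:
        groups[rec[target]].append(rec)

    outliers = []
    for members in groups.values():
        values = [rec[numeric] for rec in members]
        if len(values) < 2:
            continue  # skip groups too small to estimate quartiles
        # Step 2: compute the quartiles and the IQR for this group.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        # Step 3: flag every record outside the permitted interval.
        outliers.extend(
            rec for rec in members if not lower <= rec[numeric] <= upper
        )
    return outliers


# Made-up sample data, not taken from the workflow's input.
claims = [
    {"disease": "flu", "cost": 100},
    {"disease": "flu", "cost": 110},
    {"disease": "flu", "cost": 120},
    {"disease": "flu", "cost": 130},
    {"disease": "flu", "cost": 900},
    {"disease": "asthma", "cost": 400},
    {"disease": "asthma", "cost": 420},
    {"disease": "asthma", "cost": 440},
]
flagged = iqr_outliers(claims, target="disease", numeric="cost", k=1.5)
print(flagged)
```

A larger k widens the permitted interval and flags fewer records; a smaller k makes the check stricter.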


This is a companion discussion topic for the original entry at https://kni.me/w/hHKROQvCbMXBrTMB