Hello everyone,
If you’ve worked with bootstrapping (sampling with replacement), you know it’s a powerful technique for estimating confidence intervals from small samples. However, it often produces datasets with many duplicates, which can distort the results by making the confidence intervals appear overly precise.
To address this, I created the Data Reduction component, which shrinks bootstrapped samples while preserving their distribution. It provides four ports:
- Port 1: Duplicate-aware, bin-based reduction
- Port 2: Kolmogorov-Smirnov (KS) test results for Port 1
- Port 3: Random row sampling (baseline)
- Port 4: KS test results for Port 3
The component lets you compare structured reduction against simple random sampling, both visually and statistically.
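For anyone who wants to try the idea outside KNIME, here is a rough Python sketch of the comparison. It is not the component's actual implementation: the function names `reduce_by_bins` and `ks_statistic`, the equal-width binning, and the proportional per-bin quota are all assumptions made for illustration, and it reports only the KS statistic (no p-value).

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def reduce_by_bins(values, target_size, n_bins=10, rng=None):
    """Hypothetical duplicate-aware reduction (an assumption, not the
    component's algorithm): each equal-width bin gets a share of
    target_size proportional to its row count and is filled with
    distinct values first, repeating values only if it runs out."""
    rng = rng or random.Random(0)
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    bins = {}
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append(v)

    reduced = []
    for idx in sorted(bins):
        rows = bins[idx]
        # Rounding means the final size can differ slightly from target_size.
        quota = max(1, round(target_size * len(rows) / len(values)))
        distinct = sorted(set(rows))
        rng.shuffle(distinct)
        picked = distinct[:quota]
        if len(picked) < quota:
            picked += rng.choices(rows, k=quota - len(picked))
        reduced.extend(picked)
    return reduced


if __name__ == "__main__":
    rng = random.Random(42)
    original = [rng.gauss(50, 10) for _ in range(30)]  # small sample
    boot = rng.choices(original, k=300)                # sampling with replacement
    structured = reduce_by_bins(boot, target_size=30, rng=rng)
    baseline = rng.sample(boot, k=len(structured))     # random row sampling
    print("KS structured:", ks_statistic(boot, structured))
    print("KS baseline:  ", ks_statistic(boot, baseline))
```

A smaller KS statistic means the reduced sample follows the bootstrapped distribution more closely, which is the same kind of comparison Ports 2 and 4 are meant to support.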
Hope you find it useful!
Happy KNIMing,
Carlos