Data Reduction for Bootstrapped Samples

Hello everyone,

If you’ve worked with bootstrapping (sampling with replacement), you know it’s a powerful technique for estimating confidence intervals from small samples. However, the resampled data often contains many duplicate rows, which can distort the results by making the confidence intervals appear more precise than they really are.
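
For readers new to the idea, here is a minimal Python sketch of a bootstrap percentile interval for the mean. It is not taken from the component; the data, the function names (`bootstrap_means`, `percentile_interval`), and the number of resamples are all made up for illustration.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=2000, seed=0):
    """Resample with replacement and collect the mean of each resample."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # sampling with replacement
        means.append(statistics.fmean(resample))
    return means

def percentile_interval(values, alpha=0.05):
    """Simple percentile interval: cut alpha/2 off each tail."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper

# Hypothetical small sample, all values distinct
sample = [4.1, 5.3, 2.8, 6.0, 3.9, 5.1, 4.7, 3.2]
means = bootstrap_means(sample)
ci = percentile_interval(means)

# A single resample of the same size usually repeats some rows
resample = random.Random(1).choices(sample, k=len(sample))
n_distinct = len(set(resample))
```

On average, a resample of the same size as the original contains only about 63% of the distinct original rows (1 - 1/e); the rest are repeats. That is where the duplicates come from.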

To address this, I created the Data Reduction component, which reduces bootstrapped samples while preserving their distribution.

  • Port 1: Duplicate-aware, bin-based reduction
  • Port 2: Kolmogorov-Smirnov (KS) test results for Port 1
  • Port 3: Random row sampling (baseline)
  • Port 4: KS test results for Port 3

It helps compare structured reduction vs. simple sampling, both visually and statistically.
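
To make the comparison concrete, here is a rough Python sketch of the two approaches. It is not the component's actual implementation, and every detail is an assumption: equal-width bins, a per-bin quota proportional to the bin's row count, distinct values picked before repeats, and a KS statistic computed against the bootstrapped input. The function names (`binned_reduce`, `random_reduce`, `ks_statistic`) are hypothetical, and only the KS statistic is computed, not a p-value.

```python
import random
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def binned_reduce(values, target_size, n_bins=10, seed=0):
    """Assumed scheme: keep each bin's share of rows, prefer distinct values."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant column
    bins = [[] for _ in range(n_bins)]
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)
        bins[idx].append(v)
    reduced = []
    for rows in bins:
        # Per-bin rounding means the output size is only approximately target_size
        quota = round(target_size * len(rows) / len(values))
        distinct = sorted(set(rows))
        rng.shuffle(distinct)
        picked = distinct[:quota]
        if len(picked) < quota:
            # Too few distinct values in this bin: fill up with repeats
            picked += rng.choices(rows, k=quota - len(picked))
        reduced.extend(picked)
    return reduced

def random_reduce(values, target_size, seed=0):
    """Baseline: plain random row sampling without replacement."""
    return random.Random(seed).sample(values, target_size)

# Hypothetical data: a small sample, bootstrapped up to 2000 rows
rng = random.Random(42)
original = [rng.gauss(50, 10) for _ in range(200)]
bootstrapped = rng.choices(original, k=2000)

structured = binned_reduce(bootstrapped, target_size=200)
baseline = random_reduce(bootstrapped, target_size=200)

ks_structured = ks_statistic(bootstrapped, structured)
ks_baseline = ks_statistic(bootstrapped, baseline)
print(f"KS, bin-based: {ks_structured:.3f}  KS, random: {ks_baseline:.3f}")
```

A smaller KS statistic means the reduced sample's empirical distribution stays closer to the input. Which method comes out ahead depends on the data, the bin count, and the target size, so treat the numbers as a starting point for your own comparison.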

Hope you find it useful!
Happy KNIMing,
Carlos
