When do I use stratified sample with the partitioning node?


Reviewing the knime example, I saw that in the “example for learning a decision tree”, it is used the option stratified sampling but I don’t know why is that?

If somebody know when is recommend to use this option, it would be really great!

This is a statistics question, not a KNIME-specific question.

You’d use stratified sampling if your original dataset can be divided into subpopulations and you want each subpopulation to be appropriately represented in your final partitioned dataset.


Thank you so much for your help.

1 Like


True but considering this is related to KNIME workflow example I find this question ok. Also we solved and addressed bunch of DB and other non-KNIME related problems and this is far more related to KNIME purpose IMHO :slight_smile:


This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.