Hi everyone,
I have been asked to implement a new sampling method, described as follows:
The over sampling, under sampling and imbalanced sampling have been ignored. Balanced
stratified sampling is selecting the samples from m strata, but of equal size. If there are minority
or majority classes then over-sampling or under-sampling respectively need to be performed as
required based on the distribution of the classes. In this paper the first option of balancing is
through ignoring the minority classes for the ratios above 1:100. As the response variable is of
multivariate in nature, ignoring the minority class leads to minimal error. The second method
adopted in this paper is to take equal size of samples from each strata by reducing the stratum to
p<m, such that p changing as the size of the sub-sample changes gradually from 500 to 30000.
The balancing criteria has been maintained in an excel file with the required size (i.e from 500–
30000) and that file has been given as input for the already built-in stratified model. As the
survival attribute was in binary form, it was easy to pick the sample representatives. The second
option of equiv-width has been used. For minority multivariate classes like stage and metastasis
the ratio of 1:100 has been verified using excel filter and this was given as an input to the
rapidminer for classification. This method is a fixed allocation of balanced classes in the literature
of probability and statistics.
I have two problems. First, honestly, I do not get the idea (probably because of my weak English). Second, how can I do this kind of sampling in KNIME?
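
To make the question concrete, here is my rough guess at what the quoted passage means, written as a small Python/pandas sketch rather than a KNIME workflow. It is only my interpretation, not the paper's actual procedure: I assume "ignoring the minority classes for the ratios above 1:100" means dropping any class that has fewer than 1/100 of the rows of the largest class, and that "equal size of samples from each strata" means drawing the same number of rows from every remaining class. The function name, the column name `survival`, and the toy data are all made up for illustration.

```python
import pandas as pd


def balanced_stratified_sample(df, class_col, n_total, min_ratio=0.01, seed=0):
    """Draw (about) n_total rows with the same number of rows per class.

    Assumption: classes rarer than min_ratio relative to the largest
    class are ignored, which is my reading of the 1:100 rule.
    """
    counts = df[class_col].value_counts()

    # Step 1: ignore classes whose ratio to the largest class is below 1:100.
    keep = counts[counts / counts.max() >= min_ratio].index
    kept = df[df[class_col].isin(keep)]

    # Step 2: fixed (equal) allocation, the same number of rows per stratum.
    per_class = n_total // len(keep)
    # No over-sampling in this sketch: cap at the smallest kept class.
    per_class = min(per_class, int(counts[keep].min()))

    parts = [
        kept[kept[class_col] == c].sample(n=per_class, random_state=seed)
        for c in keep
    ]
    return pd.concat(parts)


# Toy data (hypothetical): a binary survival attribute plus one very rare label.
data = pd.DataFrame(
    {"survival": ["alive"] * 600 + ["dead"] * 400 + ["unknown"] * 3}
)

sample = balanced_stratified_sample(data, "survival", n_total=500)
print(sample["survival"].value_counts())
```

If this reading is right, what I am looking for is the KNIME equivalent of these two steps, repeated for sub-sample sizes from 500 up to 30000.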