Hi,
Is there a way to weight cases based upon the values of a column?
Thanks,
Kees
Hi Kees,
In which scenario do you want to apply the weights? In a learning algorithm?
Best, Iris
Hi Iris,
I would use it, for instance, to make a dataset better resemble the population it was drawn from.
Sometimes a dataset comes with an attribute for such use.
Compare with SPSS: 'weight cases by variable'.
Thanks,
Kees
Sounds like sampling weights. The equivalent would be to replicate the observations according to the weight column.
For example, a weight of 2 would be the same as having the observation twice in the dataset. This could be achieved with One Row to Many, or Concatenate (together with a loop), or Cross-joiner.
Obviously this requires integer weights. For fractional weights, multiply all of them by a common factor (e.g. 10, 100, or 1000) so that they become integers.
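Outside of KNIME nodes, the replication idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function name `replicate_by_weight`, the `scale` parameter, and the sample data are made up for this example and are not part of any KNIME node.

```python
def replicate_by_weight(rows, weights, scale=1):
    """Repeat each row round(weight * scale) times.

    `scale` is a hypothetical common factor used to turn fractional
    weights into integers (e.g. 10, 100, 1000).
    """
    expanded = []
    for row, weight in zip(rows, weights):
        copies = round(weight * scale)  # number of copies for this row
        expanded.extend([row] * copies)
    return expanded


# Made-up example data: three observations with fractional weights.
rows = ["a", "b", "c"]
weights = [1.5, 0.5, 2.0]

# Scaling by 2 turns the weights into the integers 3, 1 and 4.
expanded = replicate_by_weight(rows, weights, scale=2)
print(len(rows), "rows became", len(expanded))
```

Note that the row count grows with the scale factor, so any statistic that depends on n will be affected.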
Thanks for your feedback Geo.
Adding cases is not a sufficient solution for me.
The weight is not necessarily an integer, and I would prefer that my n not increase.
Perhaps R or Python can help.
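To make concrete what I mean by weighting without adding rows, here is a minimal sketch in plain Python. It assumes the weights are non-negative sampling weights; the names `normalize_weights` and `weighted_mean` and the sample numbers are hypothetical, not an existing KNIME or SPSS function.

```python
def normalize_weights(weights):
    """Rescale weights so they sum to the number of cases (n stays the same)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    n = len(weights)
    return [w * n / total for w in weights]


def weighted_mean(values, weights):
    """Mean in which each case counts in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Made-up example: three cases with non-integer relative weights.
values = [10.0, 20.0, 30.0]
weights = [1.0, 3.0, 2.0]

normalized = normalize_weights(weights)
result = weighted_mean(values, weights)
print(normalized, result)
```

The dataset keeps its three rows; only the contribution of each row to the statistic changes.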
Thanks, Kees
Hi, I have the same problem: I want to add weighting to a model but would rather not add rows. I read a subsequent post that mentioned an R snippet. I don’t know R, so I wondered whether a more user-friendly alternative has been created in the 2 years following that post?
Hi,
Any updates to this thread? Is there now an option to weight data points in the loss functions used by KNIME?
Hi @MattiasMarder,
maybe check this topic about weights: misclassification costs to adjust for rare outcome - #6 by goodvirus
Best regards,
Paul