Hyperparameter nomenclature

Good evening,

I have a few questions about how the hyperparameter names used in KNIME map to the names used in textbooks:

  • In logistic regression with a Laplace prior (LASSO): is the “Variance” setting for regularization in KNIME equivalent to “lambda”, the penalty on the coefficients?

  • In the Random Forest Learner: which setting defines “mtry”, the number of randomly selected variables considered at each split?

  • In SVM: is the “Overlapping penalty” in KNIME equivalent to the “Cost” parameter (the penalty applied to misclassified observations)?

Thank you,
Marc

The “Overlapping penalty” in KNIME is indeed the Cost hyperparameter in SVM. Any suggestions regarding the setting that defines “mtry” in RF and the “lambda” parameter in LR with LASSO?

Thanks,
Marc

Hello Marc,

Logistic Regression: KNIME uses a Bayesian formulation of the problem in which you pick the prior distribution of the weights, i.e., the distribution you expect the weights to be drawn from. The mean of this distribution is implicitly set to 0, and you control its variance via the Variance parameter. The larger the variance, the larger the weights are allowed to be, so a larger variance corresponds to less regularization. The alternative formulation using lambda works the other way around: the larger the lambda, the stronger the regularization. For more details on logistic regression and regularization, you can check out this blog post by Kathrin: https://www.knime.com/blog/regularization-for-logistic-regression-l1-l2-gauss-or-laplace
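To make the inverse relationship concrete, here is a minimal sketch of the textbook mapping between a zero-mean prior's variance and the penalty strength of the equivalent MAP objective. The function names are hypothetical, and the constants assume the penalty is written as lambda * |w| (Laplace) or lambda * w^2 (Gauss) with no further scaling; the exact scaling KNIME applies internally may differ, so treat the numbers as illustrative only.

```python
import math


def laplace_variance_to_lambda(variance: float) -> float:
    """L1 penalty strength implied by a zero-mean Laplace prior.

    A Laplace prior with scale b has variance 2 * b**2, and its negative
    log-density is |w| / b + const, so lambda = 1 / b.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    scale = math.sqrt(variance / 2.0)
    return 1.0 / scale


def gauss_variance_to_lambda(variance: float) -> float:
    """L2 penalty strength implied by a zero-mean Gaussian prior.

    The negative log-density is w**2 / (2 * variance) + const,
    so lambda = 1 / (2 * variance) when the penalty is lambda * w**2.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return 1.0 / (2.0 * variance)


if __name__ == "__main__":
    # Larger variance -> smaller lambda -> weaker regularization.
    for variance in (0.1, 1.0, 10.0):
        print(variance,
              laplace_variance_to_lambda(variance),
              gauss_variance_to_lambda(variance))
```

Under these assumptions, increasing the variance always lowers lambda, which matches the "larger variance, less regularization" rule described above.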

Random Forest: The Random Forest nodes set the mtry parameter to the square root of the number of features, since that is the default suggested in the literature. If you want more control, you can switch to the Tree Ensemble Learner node, which gives you many more options to tweak. Among these is the Attribute Sampling (Columns) option, where you select how many columns are used for each split: all columns, the square root, a linear fraction, or an absolute number of columns.
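As a rough illustration of what those four sampling choices mean in terms of mtry, the sketch below computes the number of columns per split for each mode. The function name and mode labels are hypothetical, and rounding up to a whole column is an assumption; the rounding KNIME actually uses is not stated here.

```python
import math


def columns_per_split(n_features: int, mode: str = "sqrt", value: float = 0.0) -> int:
    """Number of candidate columns per split (mtry) for a given sampling mode.

    mode is one of "all", "sqrt", "fraction", "absolute" (hypothetical labels).
    value is the fraction in (0, 1] or the absolute column count, when needed.
    Rounding up is an assumption made for this sketch.
    """
    if n_features < 1:
        raise ValueError("n_features must be at least 1")
    if mode == "all":
        mtry = n_features
    elif mode == "sqrt":
        mtry = math.ceil(math.sqrt(n_features))
    elif mode == "fraction":
        if not 0.0 < value <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        mtry = math.ceil(value * n_features)
    elif mode == "absolute":
        mtry = int(value)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Keep the result within the valid range of 1..n_features.
    return max(1, min(n_features, mtry))


if __name__ == "__main__":
    n = 16
    print(columns_per_split(n, "all"))
    print(columns_per_split(n, "sqrt"))
    print(columns_per_split(n, "fraction", 0.5))
    print(columns_per_split(n, "absolute", 3))
```

With 16 features, the square-root mode corresponds to the fixed mtry that the Random Forest nodes use, while the other modes are what the Tree Ensemble Learner lets you choose instead.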

Thank you very much, @nemad!
