# Weighting algorithms in ensemble learning

Hello. I am really new to working with KNIME and have several questions.

1. How can I weight algorithms (base classifiers) in KNIME and combine them in a meta-modeling approach?

2. Is it possible to build a stacking ensemble method in KNIME instead of using the Weka node for stacking?

3. Which loss function is implemented in the gradient boosted trees algorithm in KNIME, and how does this algorithm differ from XGBoost in R/Python?

S.E.

Hello S.E.,

welcome to KNIME!

First, regarding your third question: the gradient boosted trees implementation in the KNIME Analytics Platform (AP) is based on the paper "Greedy Function Approximation: A Gradient Boosting Machine", published in 2001 by Jerome Friedman. For regression, we use the loss described in section 4.4 (M-Regression) of that paper. M-Regression is a hybrid of least squares and least absolute deviation in which you specify the percentage of the data that is considered to be inliers (this can be set in the Advanced Options tab of the Gradient Boosted Trees Learner (Regression)). For classification, we use the algorithm described in section 4.6 of Friedman's paper, which essentially performs K-class logistic gradient boosting (softmax).

The differences from XGBoost are quite large, because XGBoost uses more sophisticated objective functions that also include a regularization term. It also calculates the tree splits based on the objective function rather than on least squares. To understand the differences, I would suggest reading the Introduction to Boosted Trees (https://xgboost.readthedocs.io/en/latest/model.html).
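To make the M-Regression idea more concrete, here is a minimal Python sketch of the Huber-style pseudo-residuals described in Friedman's paper: residuals within a cutoff `delta` are kept as they are (least squares), larger ones are clipped to `delta` (least absolute deviation), and `delta` is taken as the `alpha`-quantile of the absolute residuals. This is only an illustration and not KNIME's actual code; the function name `huber_pseudo_residuals`, the nearest-rank quantile rule, and the toy data are assumptions made for the example.

```python
import math


def huber_pseudo_residuals(y_true, y_pred, alpha=0.9):
    """Return (delta, pseudo-residuals) for a Huber-style loss.

    alpha is the assumed fraction of inliers; delta is the alpha-quantile
    of the absolute residuals.
    """
    residuals = [y - f for y, f in zip(y_true, y_pred)]
    abs_sorted = sorted(abs(r) for r in residuals)
    # Nearest-rank quantile: a simple illustrative choice (assumption).
    rank = max(1, math.ceil(alpha * len(abs_sorted)))
    delta = abs_sorted[min(rank, len(abs_sorted)) - 1]
    pseudo = []
    for r in residuals:
        if abs(r) <= delta:
            pseudo.append(r)  # inlier: behaves like least squares
        else:
            pseudo.append(math.copysign(delta, r))  # outlier: clipped, like LAD
    return delta, pseudo


# Toy data: the last target is an obvious outlier.
y_true = [1.0, -2.0, 3.0, -50.0]
y_pred = [0.0, 0.0, 0.0, 0.0]
delta, pseudo = huber_pseudo_residuals(y_true, y_pred, alpha=0.75)
print(delta, pseudo)
```

With `alpha=0.75`, three of the four points count as inliers, so the outlier's residual is clipped to the cutoff and no longer dominates the next tree's fit.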

You can certainly build a weighted ensemble or a stacked ensemble using the KNIME AP, but doing so will require some workflow magic because (as far as I know) there are currently no nodes that will accomplish those tasks for you.
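As a rough idea of what a weighted ensemble computes, the following Python sketch averages the class probabilities of several base classifiers using per-model weights and then picks the class with the highest combined probability. It is not a KNIME workflow; the function names, the weights, and the probability tables are hypothetical and only illustrate the arithmetic that such a workflow would have to reproduce.

```python
def weighted_soft_vote(prob_tables, weights):
    """Combine per-model class probabilities with normalized weights.

    prob_tables: one table per model; each table is a list of rows,
    each row a list of class probabilities in the same class order.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    n_rows = len(prob_tables[0])
    n_classes = len(prob_tables[0][0])
    combined = []
    for i in range(n_rows):
        row = [
            sum(w * table[i][k] for w, table in zip(norm, prob_tables))
            for k in range(n_classes)
        ]
        combined.append(row)
    return combined


def predict_labels(combined, class_names):
    """Pick the class with the highest combined probability per row."""
    return [class_names[row.index(max(row))] for row in combined]


# Hypothetical probability outputs of two base classifiers for two rows.
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.2, 0.8], [0.3, 0.7]]
class_names = ["yes", "no"]

combined = weighted_soft_vote([model_a, model_b], weights=[3, 1])
labels = predict_labels(combined, class_names)
print(combined, labels)

# Shifting the weight toward model_b changes the first prediction.
flipped = predict_labels(
    weighted_soft_vote([model_a, model_b], weights=[1, 3]), class_names
)
print(flipped)
```

The weights here are fixed by hand; in practice they could be derived from each model's validation accuracy, which is a design choice this sketch leaves open.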

Cheers,

Cheers,

S.E.

Stacking is easily implemented in KNIME by combining the probability outputs of the stacked models' Predictor nodes with, e.g., a Logistic Regression Learner node. Each model in the stack, as well as the combination model (e.g. logistic regression), must use the same target class variable; apart from that, anything is allowed.

I would like to add that the Predictor probabilities used as input for the combination function's Learner node must come from predicting the training set.

Once the stacked model is trained, you'll also want to use additional Predictor nodes for each model in the stack to output the probabilities for the fresh data set, which then serve as input for the combination function's Predictor node.
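The stacking procedure described above can be sketched outside KNIME in plain Python. The base learners here are hypothetical one-feature threshold models (`fit_stump`) and the combination model is a small logistic regression fitted by gradient descent (`fit_logistic`); both, along with the toy data, are assumptions chosen to keep the example self-contained, not a description of what the KNIME nodes do internally.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, ys, threshold):
    """Hypothetical base learner: P(y=1) is the positive rate on each side of a threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    p_left = sum(left) / len(left) if left else 0.5
    p_right = sum(right) / len(right) if right else 0.5
    return lambda x: p_left if x <= threshold else p_right


def fit_logistic(rows, ys, lr=0.5, epochs=2000):
    """Combination model: logistic regression fitted by batch gradient descent."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for row, y in zip(rows, ys):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - y
            grad_b += err
            for j, xi in enumerate(row):
                grad_w[j] += err * xi
        b -= lr * grad_b / n
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return lambda row: sigmoid(b + sum(wi * xi for wi, xi in zip(w, row)))


# Toy training set: two features per row, binary target.
train_x = [(0.1, 0.9), (0.2, 0.2), (0.4, 0.3), (0.6, 0.8), (0.8, 0.7), (0.9, 0.1)]
train_y = [0, 0, 0, 1, 1, 1]

# 1. Train every base model on the training set, all with the same target.
base_models = [
    fit_stump([x[0] for x in train_x], train_y, threshold=0.5),
    fit_stump([x[1] for x in train_x], train_y, threshold=0.5),
]


def base_probabilities(x):
    """Probability output of each base model for one row."""
    return [model(x[i]) for i, model in enumerate(base_models)]


# 2. Predict the training set; the probabilities become the meta features.
meta_train = [base_probabilities(x) for x in train_x]

# 3. Fit the combination model on those probabilities and the same target.
meta_model = fit_logistic(meta_train, train_y)

# 4. For fresh data, run the base predictors first, then the combination model.
fresh_x = [(0.7, 0.2), (0.3, 0.9)]
fresh_probs = [meta_model(base_probabilities(x)) for x in fresh_x]
print(fresh_probs)
```

In this toy case the combination model learns to rely mostly on the first base model, which separates the classes well. A common refinement is to feed the combination model out-of-fold (cross-validated) training predictions so that it does not simply reward base models that overfit the training set.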
