Discrepancy in XGBoost Model Performance Between KNIME and Python

mustafa_alhamad · July 10, 2024, 8:34pm

Hello KNIME Community,

I am encountering a significant discrepancy in the performance of an XGBoost model when trained using KNIME versus Python. Despite using identical parameters, seed values, and the same data, the model’s accuracy in KNIME is 78%, whereas in Python, it is only 52%.

Here are the specific details:

Algorithm: XGBoost
Parameters:
Python’s parameters:
eta=0.3,
reg_lambda=1,
reg_alpha=1,
gamma=0,
max_delta_step=0,
booster='gbtree',
max_depth=6,
min_child_weight=1,
tree_method='auto',
sketch_eps=0.03,
scale_pos_weight=1,
grow_policy='depthwise',
max_leaves=0,
max_bin=256,
subsample=1,
colsample_bytree=1,
colsample_bylevel=1,
colsample_bynode=1,
objective='multi:softmax',
enable_categorical=True,
n_estimators=250,
base_score=0.5,
random_state=100,
eval_metric='logloss',
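
One way to check that two configurations really are identical is to write both down as dictionaries and diff them key by key. The following is only a minimal sketch: `python_params` repeats the values listed above, while `knime_params` is a made-up stand-in (two values changed on purpose to show what a mismatch looks like) and does not reflect the actual KNIME node settings. The helper name `diff_params` is likewise just illustrative, and the sketch assumes the KNIME option names have already been mapped by hand to XGBoost's parameter names.

```python
# Settings used for the Python run, copied from the list in this post.
python_params = {
    "eta": 0.3,
    "reg_lambda": 1,
    "reg_alpha": 1,
    "gamma": 0,
    "max_delta_step": 0,
    "booster": "gbtree",
    "max_depth": 6,
    "min_child_weight": 1,
    "tree_method": "auto",
    "sketch_eps": 0.03,
    "scale_pos_weight": 1,
    "grow_policy": "depthwise",
    "max_leaves": 0,
    "max_bin": 256,
    "subsample": 1,
    "colsample_bytree": 1,
    "colsample_bylevel": 1,
    "colsample_bynode": 1,
    "objective": "multi:softmax",
    "enable_categorical": True,
    "n_estimators": 250,
    "base_score": 0.5,
    "random_state": 100,
    "eval_metric": "logloss",
}

# HYPOTHETICAL placeholder for the KNIME side: a copy of the Python settings
# with two values altered purely for illustration. Replace it with the values
# read off the KNIME node dialog.
knime_params = dict(python_params, reg_alpha=0, n_estimators=100)

NOT_SET = "<not set>"


def diff_params(left, right):
    """Return {name: (left_value, right_value)} for every setting that
    differs between the two dictionaries or exists on only one side."""
    differences = {}
    for name in sorted(set(left) | set(right)):
        left_value = left.get(name, NOT_SET)
        right_value = right.get(name, NOT_SET)
        if left_value != right_value:
            differences[name] = (left_value, right_value)
    return differences


for name, (py_value, knime_value) in diff_params(python_params, knime_params).items():
    print(f"{name}: Python={py_value!r}  KNIME={knime_value!r}")
```

An empty result means the two dictionaries agree on every key. Settings that one tool applies implicitly as a default would still need to be looked up and added by hand before the comparison is meaningful.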

KNIME’s:

I am curious if anyone else has experienced similar differences in results and if there might be any underlying reasons for this disparity. Any insights or suggestions for troubleshooting would be greatly appreciated.

Thank you!

ScottF · July 12, 2024, 11:50am

Hi @mustafa_alhamad and welcome to the forum.

Offhand it’s hard to speculate on the source of the discrepancy based solely on text/screenshots. Do you have a sample workflow you could upload, ideally one including a toy dataset with different branches for training the models - one featuring the XGBoost nodes, and another branch featuring your Python implementation? This would allow for a direct comparison.
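
As a rough illustration of the kind of direct comparison meant here, both branches could be scored on exactly the same held-out rows, and the rows where the two models disagree could then be listed. This is only a sketch under assumptions: the label lists below are invented toy values, not results from either model, and the function names `accuracy` and `disagreements` are hypothetical helpers rather than part of KNIME or XGBoost.

```python
def accuracy(y_true, y_pred):
    """Fraction of rows where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def disagreements(pred_a, pred_b):
    """Row indices where the two models predict different labels."""
    if len(pred_a) != len(pred_b):
        raise ValueError("both prediction lists must cover the same rows")
    return [i for i, (a, b) in enumerate(zip(pred_a, pred_b)) if a != b]


# Invented toy labels, for illustration only. In practice these would be the
# true classes and the two models' predictions for the SAME test rows, in the
# SAME row order.
y_true = [0, 1, 2, 1, 0, 2, 1, 0]
knime_pred = [0, 1, 2, 1, 0, 2, 0, 0]
python_pred = [0, 1, 1, 1, 0, 0, 0, 0]

print("KNIME accuracy: ", accuracy(y_true, knime_pred))
print("Python accuracy:", accuracy(y_true, python_pred))
print("Rows where the models disagree:", disagreements(knime_pred, python_pred))
```

Scoring both sets of predictions against one shared test partition helps separate a genuine modelling difference from a difference in how the data was split or how the metric was computed.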
