Data preparation for Machine Learning with KNIME and the Python “vtreat” package

Before you get to the fancy machine learning, it makes sense to prepare your data: deal with missing values, remove highly correlated variables, and so on. You can do a lot of this by hand, or you can use a ready-made tool like vtreat, which is available for both R and Python.
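To give a feel for what "by hand" can mean here, below is a minimal sketch in plain pandas. It is not how vtreat works internally, just an illustration of the two steps mentioned above. The function name `prepare_by_hand`, the `corr_threshold` parameter, and the demo data are made up for this example.

```python
import numpy as np
import pandas as pd


def prepare_by_hand(df, corr_threshold=0.95):
    """Hypothetical helper: impute missing numbers, drop correlated columns."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns

    # 1) Missing values: fill with the column mean and keep an indicator
    #    column so a model can still see where data was missing.
    for col in numeric_cols:
        if out[col].isna().any():
            out[col + "_is_missing"] = out[col].isna().astype(int)
            out[col] = out[col].fillna(out[col].mean())

    # 2) Highly correlated variables: look at the upper triangle of the
    #    absolute correlation matrix and drop the later column of each pair.
    corr = out[numeric_cols].corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > corr_threshold).any()]
    return out.drop(columns=to_drop)


# Made-up demo data: "b" is exactly 2 * "a", so one of the two is redundant.
demo = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})
prepared = prepare_by_hand(demo)
print(prepared)
```

In this toy example the missing value in "a" is replaced by the column mean, an "a_is_missing" flag is added, and "b" is dropped because it is perfectly correlated with "a". A tool like vtreat is meant to spare you from writing and maintaining this kind of code yourself.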

You can read more in my Medium stories. Also check out the examples mentioned there on the KNIME Hub:

