Spark Label Encoding - prepare the data in local Big Data environment

s_401 - prepare label encoding with Spark

Prepare data in a big data environment:
- label encode string variables
- transform numbers into Double format (Spark ML likes that)
- remove highly correlated data
- remove NaN variables
- remove continuous variables
- optional: normalize the data
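
To make the preparation steps more concrete, here is a minimal sketch in plain Python (not Spark and not the workflow itself) of three of them: label encoding strings, casting numbers to doubles, and dropping highly correlated columns. The function names, the `corr_threshold` value of 0.95, the sample columns, and the reading of "remove NaN variables" as "drop columns that contain missing values" are all assumptions made for illustration. The encoding order (most frequent label gets index 0.0, ties broken alphabetically) is meant to mirror the default behaviour of Spark ML's `StringIndexer`, but check that against your Spark version.

```python
from collections import Counter
import math


def label_encode(values):
    """Map each distinct string to a float index, most frequent label first."""
    counts = Counter(values)
    # Sort by descending frequency, then alphabetically to break ties.
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    mapping = {label: float(i) for i, label in enumerate(ordered)}
    return [mapping[v] for v in values], mapping


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def prepare(columns, corr_threshold=0.95):
    """columns: dict of column name -> list of values (all str or all numeric)."""
    prepared = {}
    for name, values in columns.items():
        # Assumption: a column with any missing value is dropped entirely.
        if any(is_missing(v) for v in values):
            continue
        if all(isinstance(v, str) for v in values):
            prepared[name], _ = label_encode(values)
        else:
            # Cast numbers to float, the Python counterpart of a Double.
            prepared[name] = [float(v) for v in values]

    # Keep a column only if it is not highly correlated with one already kept.
    kept = []
    for name in prepared:
        if all(abs(pearson(prepared[name], prepared[k])) < corr_threshold for k in kept):
            kept.append(name)
    return {name: prepared[name] for name in kept}


# Hypothetical sample data, for illustration only.
columns = {
    "color": ["red", "blue", "red", "green"],
    "size": [1, 2, 3, 4],
    "size_cm": [10, 20, 30, 40],     # perfectly correlated with "size"
    "score": [0.5, None, 0.1, 0.2],  # contains a missing value
}
result = prepare(columns)
print(result)
```

With this sample data, `size_cm` is dropped because it is perfectly correlated with `size`, and `score` is dropped because of its missing value. In a real Spark setting the same steps would be carried out by distributed operations rather than Python lists, so treat this only as a description of the logic.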


This is a companion discussion topic for the original entry at https://kni.me/w/mF4g6HTMX7J4m27Q