This workflow implements an alarm system for bicycle restocking at Washington bike stations, using the bike data set. The problem is to predict one of three classes for each bike station: remove bikes, add bikes, or no action. Predicting three classes is easier than predicting a precise number, and it allows classification methods to be used. The input features are varied and include the past bike/parking-spot ratios for each station. The workflow also optimizes the input features by searching for the minimal optimal training set.

1. Data is read and prepared to produce the three classes (Flag(-1)). This is the alert signal to predict.
2. Bike ratios are calculated as total # bikes / available spots by station.
3. The 10 past ratios are added to the input vector.
4. Backward Feature Elimination finds the best subset of input features against a decision tree. This minimal subset of input features is what makes the system "lean".
5. The full model is retrained on the optimal subset of input features.
6. The model is saved.

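
For readers who want to see the idea outside KNIME, the steps above can be roughly sketched in plain Python. This is only an illustration, not the workflow itself: the function names, the `low`/`high` thresholds that turn a ratio into a class, and the class labels are all assumptions, since the description does not state them. The `score` callable is a stand-in for "train a decision tree on this feature subset and return its accuracy on held-out data".

```python
from __future__ import annotations

from typing import Callable, Sequence


def bike_ratio(bikes: int, spots: int) -> float:
    """Bikes divided by spots for one station; 0.0 if no spots are reported."""
    return bikes / spots if spots else 0.0


def flag(ratio: float, low: float = 0.2, high: float = 0.8) -> str:
    """Map a ratio to one of three classes (thresholds are hypothetical)."""
    if ratio < low:
        return "add bikes"
    if ratio > high:
        return "remove bikes"
    return "no action"


def lagged_rows(ratios: Sequence[float], n_lags: int = 10):
    """Build (past n_lags ratios, class of the current ratio) training rows."""
    return [
        (list(ratios[t - n_lags:t]), flag(ratios[t]))
        for t in range(n_lags, len(ratios))
    ]


def backward_elimination(
    features: Sequence[str],
    score: Callable[[tuple[str, ...]], float],
) -> tuple[tuple[str, ...], float]:
    """Greedy backward feature elimination.

    Starting from all features, repeatedly drop the one whose removal
    leaves the highest score, and remember the best subset seen.
    """
    current = tuple(features)
    best_subset, best_score = current, score(current)
    while len(current) > 1:
        # Try removing each remaining feature in turn.
        candidates = [
            tuple(f for f in current if f != drop) for drop in current
        ]
        current = max(candidates, key=score)
        current_score = score(current)
        # On ties, prefer the smaller ("leaner") subset.
        if current_score >= best_score:
            best_subset, best_score = current, current_score
    return best_subset, best_score
```

With a real classifier, `score` would retrain the decision tree on each candidate subset, and the final model would then be retrained on `best_subset` before saving, as in steps 5 and 6.
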
This is a companion discussion topic for the original entry at https://kni.me/w/L3RleVHmju5bN_Jk