I hope you are well. I have a question about model selection for a prediction task in which several factors influence the target variable:
I have a data set on which I train, validate and test several models (regression, ANN, trees…) and then select the one with the best statistics. In production I now “see” that my application data is drifting further and further away from the training data I used to select the model. So my questions are:
a) How can I “monitor” whether the application data is still within the range of the training data? (Is there a good “multidimensional” distance measure?)
b) Which model should I use, and why? (Depending on the distance between application and training data, which model should I choose?)
Thanks a lot and best regards
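One possible reading of the “multidimensional distance” in (a) is the Mahalanobis distance of each application row from the training data, which accounts for the scale and correlation of the features. The following is only a minimal sketch under assumptions not stated in the thread: the features are numeric, the training data is roughly one elliptical cloud, and the 99% quantile cut-off as well as the names `fit_reference` and `mahalanobis` are made up for illustration.

```python
import numpy as np

def fit_reference(train):
    """Estimate the mean and inverse covariance of the training features."""
    mean = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    # The pseudo-inverse tolerates collinear (redundant) features.
    inv_cov = np.linalg.pinv(cov)
    return mean, inv_cov

def mahalanobis(rows, mean, inv_cov):
    """Mahalanobis distance of every row from the training mean."""
    diff = rows - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

# Synthetic stand-in for the training data (assumption: 3 numeric features).
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 3))

mean, inv_cov = fit_reference(train)
train_dist = mahalanobis(train, mean, inv_cov)

# Illustrative cut-off: the distance that 99% of training rows stay below.
threshold = np.quantile(train_dist, 0.99)

# Two hypothetical application rows: one typical, one far from the training data.
application = np.array([[0.1, -0.2, 0.3], [6.0, 6.0, 6.0]])
out_of_range = mahalanobis(application, mean, inv_cov) > threshold
print(out_of_range)
```

Rows flagged this way are only “unusual relative to the training data”. Whether the model still predicts them well is a separate question.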
I have compiled a selection of links on machine learning and statistics. There is also a section about multidimensional models. One statistic you could use is log loss, which applies to classification models that output probabilities.
I am not sure I follow what you mean by distance. Typically, if the data you apply the model to changes significantly, you would have to change or retrain the model.
You could monitor your model's scores against the real results and define a threshold beyond which you consider the model degraded. Where to set that threshold very much depends on your (business) case.
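As a rough illustration of that kind of monitoring, here is a minimal sketch that compares the mean absolute error per batch of new predictions with the error seen during validation. The window size, the tolerance factor of 1.5, the choice of MAE as the score and the function name `degraded_windows` are all assumptions for the example, not recommendations.

```python
import numpy as np

def degraded_windows(y_true, y_pred, baseline_mae, tolerance=1.5, window=50):
    """Flag each full window whose MAE exceeds tolerance * baseline_mae."""
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    flags = []
    for start in range(0, len(errors) - window + 1, window):
        window_mae = errors[start:start + window].mean()
        flags.append(bool(window_mae > tolerance * baseline_mae))
    return flags

# Synthetic example: predictions are off by 0.08 at first, later by 0.8.
rng = np.random.default_rng(1)
y_true = rng.normal(size=200)
offset = np.concatenate([np.full(100, 0.08), np.full(100, 0.8)])
y_pred = y_true + offset

# baseline_mae would come from your validation set; 0.1 is made up here.
flags = degraded_windows(y_true, y_pred, baseline_mae=0.1)
print(flags)
```

This only works once the real results are available, so the alarm can lag behind the actual change in the data.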
Which model to use may depend on several factors. The statistic can be an important indicator, but you may also have to consider the cost of misclassification in either direction and how easily the model can be brought into production.
Thanks a lot for your feedback and the very interesting collection of links. Unfortunately I have not found the answer so far. To be more precise: I am working on a “what-if analyzer” based on prediction models. So the question is “how much” the application data can “vary” from the training data while still giving an “acceptable” result. Maybe it is more of a “sensitivity analysis” question, i.e. identifying how strongly the output depends on a particular input value.
If you have any material on that, please let me know. Thanks a lot and best regards
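For what it is worth, the simplest form of such a sensitivity analysis is “one at a time”: keep a base input fixed, sweep a single feature over a range and watch how far the prediction moves. The sketch below assumes a model exposed as a function that maps a 2-D array of rows to predictions; `toy_predict` is a made-up stand-in for a trained model, and `one_at_a_time_sensitivity` is a hypothetical helper name.

```python
import numpy as np

def one_at_a_time_sensitivity(predict, base_row, feature, values):
    """Predict while sweeping one feature and holding the others at base_row."""
    rows = np.tile(np.asarray(base_row, dtype=float), (len(values), 1))
    rows[:, feature] = values
    return predict(rows)

def toy_predict(rows):
    # Stand-in for a trained model (assumption): strong dependence on
    # feature 0, weak dependence on feature 1.
    return 2.0 * rows[:, 0] + 0.1 * rows[:, 1] ** 2

base = [1.0, 3.0]
sweep = np.linspace(0.0, 2.0, 5)

out0 = one_at_a_time_sensitivity(toy_predict, base, 0, sweep)
out1 = one_at_a_time_sensitivity(toy_predict, base, 1, sweep)

# Spread of the prediction over the sweep: larger means more sensitive.
range0 = out0.max() - out0.min()
range1 = out1.max() - out1.min()
print(range0, range1)
```

Note that this ignores interactions between inputs, and sweeping a feature beyond the range seen in training tells you how the model extrapolates, not whether that extrapolation is correct.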