I am a complete Knime noobie, so I hope this isn´t completely dumb.
In my line of work (financial institution) I have a data source that has a monthly snapshot of many of our customer characteristics. In our personal loans portafolio analisis scenarios, is commonly true that past months customer behavior predicts future default rates pretty well.
I want to design a template that enables us to easily test different hypothesis regarding the customer's historic behavior. Example input table:
|Customer ID||Date||Days Past Due||Account Balance||Available to disburse|
This is a simplification of what we have, but it works as an example. This table would continue downward for 400k customers and about 5 years of snapshots.
Common features that we would like to create on the fly with this two example dimensions:
- Max Days Past Due in last 6 months (hopefully this would be some kind of lag calculation so that all the snapshots of the clients have this calc)
- Days past Due Delta between Row Date and 3 Months ago. ( For example customer 2222 would have 156-64 = 92)
- from 6,9, 12 months ago ( Account Balance/ (Account Balance + Available to Disburse)
All this features would be used in a classification model that predicts the probability of default N months after a point of observation.
I would really appreciate any help.
Thanks a lot!