Hi all, I am new to big data and would really appreciate some suggestions from the community.
I am trying to analyze the service requests (SRs) made by customers and predict when the next request will occur.
My plan is to compute the time difference between consecutive SRs, treat it as the dependent variable (Y), and work out which independent variables (X) affect Y.
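To make the time-difference step concrete, here is a rough Python sketch of what I have in mind. The rows are made up for illustration, and the date format and the function name `time_diffs_per_id` are my own assumptions, not something taken from my real data.

```python
from collections import defaultdict
from datetime import datetime

# Made-up rows for illustration only: (ID, SR time).
rows = [
    ("A", "2020-01-01"),
    ("A", "2020-01-11"),
    ("B", "2020-02-01"),
    ("A", "2020-01-31"),
    ("B", "2020-02-08"),
]


def time_diffs_per_id(rows):
    """Return {ID: [days between consecutive SRs]}, ordered by SR time."""
    by_id = defaultdict(list)
    for customer_id, timestamp in rows:
        # Assumes timestamps are ISO dates; adjust the format string if not.
        by_id[customer_id].append(datetime.strptime(timestamp, "%Y-%m-%d"))

    diffs = {}
    for customer_id, times in by_id.items():
        times.sort()
        # Pair each SR with the next one for the same ID.
        diffs[customer_id] = [(b - a).days for a, b in zip(times, times[1:])]
    return diffs


diffs = time_diffs_per_id(rows)
```

Each ID with n requests ends up with n - 1 gaps, so an ID with a single SR has no Y value at all.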
I am trying to use linear correlation to identify the independent variables, and then linear regression and data mining techniques to predict the next SR.
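As a sketch of the correlation and regression step, this is roughly what I mean, written in plain Python. The numbers are invented, and treating `pay_due` as a numeric predictor is only an assumption for the example; `pearson` and `fit_line` are hypothetical helper names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented values: a candidate X (pay_due) and Y (average time_dif in days).
pay_due = [1, 2, 3, 4]
time_dif = [30, 25, 20, 15]

r = pearson(pay_due, time_dif)
slope, intercept = fit_line(pay_due, time_dif)
predicted = slope * 5 + intercept  # predicted gap for a new pay_due of 5
```

A value of `r` close to +1 or -1 would suggest the column is worth keeping as an X; a value near 0 only rules out a linear relationship, not every relationship.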
The problem is that the ID is not unique in table1. I am trying to build a new table (table2) with one row per unique ID, holding the average time difference (the sum of all time differences for that ID divided by the number of SRs). For the SR type, I use the one2many node in KNIME.
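Here is a rough Python sketch of the table2 aggregation, outside KNIME, in case it helps to see the idea. The rows and SR type names are made up, `build_table2` is a hypothetical name, and counting each SR type per ID is only my guess at how the one2many output should be rolled up. Note that this sketch divides by the number of gaps (one fewer than the number of SRs) rather than by the number of SRs; I am not sure which of the two is correct.

```python
from collections import defaultdict

# Made-up rows: (ID, SR type, time_dif in days); None marks the first SR of an ID.
table1 = [
    ("A", "repair", None),
    ("A", "billing", 10),
    ("A", "repair", 20),
    ("B", "billing", None),
    ("B", "billing", 7),
]


def build_table2(rows):
    """Collapse table1 to one record per ID with an average gap and type counts."""
    sr_types = sorted({sr_type for _, sr_type, _ in rows})

    grouped = defaultdict(list)
    for customer_id, sr_type, dif in rows:
        grouped[customer_id].append((sr_type, dif))

    table2 = {}
    for customer_id, items in grouped.items():
        gaps = [dif for _, dif in items if dif is not None]
        record = {
            "no_sr": len(items),
            # Assumption: average over the gaps that exist, not over all SRs.
            "avg_time_dif": sum(gaps) / len(gaps) if gaps else None,
        }
        # One column per SR type, holding how often that type occurred.
        for sr_type in sr_types:
            record["type_" + sr_type] = sum(1 for t, _ in items if t == sr_type)
        table2[customer_id] = record
    return table2


table2 = build_table2(table1)
```

With this layout every ID appears once, so the type columns and the average gap can go straight into the correlation step.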
Can anyone give me suggestions on this workflow? How should I find the independent variables?
Currently I have a table that looks like this (table1):
|ID||No SR||SR type||SR time||time_dif||area||pay_due|