Hello,
I have 3 count features (one per merchandise group), each with 2 associated time features:
GR1_COUNT Number of purchases per day for merchandise group 1
GR1_FIRST_TIME Time of the first purchase of the day
GR1_LAST_TIME Time of the last purchase of the day
GR2_COUNT Number of purchases per day for merchandise group 2
GR2_FIRST_TIME Time of the first purchase of the day
GR2_LAST_TIME Time of the last purchase of the day
GR3_COUNT Number of purchases per day for merchandise group 3
GR3_FIRST_TIME Time of the first purchase of the day
GR3_LAST_TIME Time of the last purchase of the day
The time data is only available if GRx_COUNT > 0.
I am running k-means clustering. Currently, the missing time values are replaced by 0. This is bad for the analysis, because the 0 values distort the clustering as well as the min, mean … values.
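To illustrate the distortion, here is a minimal sketch with made-up numbers. It assumes pandas and NumPy, shows only group 1, and encodes times as hours since midnight; the values and the encoding are assumptions for illustration only, not my real data.

```python
import numpy as np
import pandas as pd

# Made-up example rows; times are hours since midnight (assumed encoding).
# Rows with GR1_COUNT == 0 have no time data, shown here as NaN.
df = pd.DataFrame({
    "GR1_COUNT": [2, 0, 3, 0],
    "GR1_FIRST_TIME": [9.0, np.nan, 10.0, np.nan],
    "GR1_LAST_TIME": [17.0, np.nan, 18.0, np.nan],
})

# Current approach: replace the missing times by 0.
zero_filled = df.fillna(0)

# With zero-filling, days without purchases look like purchases at midnight.
mean_zero_filled = zero_filled["GR1_FIRST_TIME"].mean()
min_zero_filled = zero_filled["GR1_FIRST_TIME"].min()

# Statistics over the observed values only (pandas skips NaN by default).
mean_observed = df["GR1_FIRST_TIME"].mean()
min_observed = df["GR1_FIRST_TIME"].min()

print(mean_zero_filled, mean_observed)  # 4.75 9.5
print(min_zero_filled, min_observed)    # 0.0 9.0
```

With the zeros included, the mean first-purchase time drops from 9.5 to 4.75 and the minimum becomes 0, and the same artificial values go into the distance calculation of k-means.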
Does anyone have an idea how to deal with such conditional data?
Thanks in advance.
Warm regards,
Michel