LSTM with time series prediction

Pdlsandesh · December 15, 2023, 6:23am

I was initially using a random forest for my churn prediction project. I have now decided to switch to an LSTM, since my problem has a time series nature with some cyclic patterns. Unfortunately, after implementing a simple sequential model with two LSTM layers, I obtained much worse results than with the random forest. Does anyone know what might cause this?

Additionally, I have structured the input dataset on a time basis as a 3D array, where the dimensions are the number of rows, the number of months, and the number of features for each month.
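
For illustration, a flat table with one row per customer per month can be rearranged into this kind of 3D layout with NumPy. This is only a sketch under assumptions: the sizes, the `flat` array, and the customer-then-month row ordering are made up for the example and are not the actual data.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
n_customers, n_months, n_features = 4, 6, 20

# Assumed flat layout: one row per (customer, month), sorted by customer
# and then by month, so each customer's months are contiguous.
rng = np.random.default_rng(0)
flat = rng.normal(size=(n_customers * n_months, n_features))

# Reshape to (samples, timesteps, features), the layout recurrent layers expect.
X = flat.reshape(n_customers, n_months, n_features)

# Sanity check: customer 1, month 2 must equal flat row 1 * n_months + 2.
assert np.array_equal(X[1, 2], flat[1 * n_months + 2])
print(X.shape)  # (4, 6, 20)
```

If the flat rows are not sorted this way, the reshape still succeeds but silently mixes months across customers, so a check like the one above is worth keeping.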

ScottF · December 19, 2023, 9:49pm

Hi @Pdlsandesh and welcome to the forum.

Can you post your workflow in progress so far, along with your data and your results? Help us help you.

Pdlsandesh · December 20, 2023, 5:40am

Thank you, @ScottF, for the reply.

Since corporate standards do not allow me to share the data directly, I can provide an overview of my progress:

In the past, I built a random forest model that performed well, achieving a precision of 68% and capturing approximately 60% of the total base (recall).

Subsequently, I decided to transition to an LSTM model. During this transition, I modified some features and organized them into a 3D array with dimensions (number of samples, number of months, features per month). Despite these changes, the overall sensitivity to the features persisted in the LSTM model.

I then trained the LSTM model with two LSTM layers, each followed by a simple dropout layer, and a single dense layer at the end.

The LSTM achieved a precision of around 63%, but the recall dropped to approximately 23%. I attempted to improve the recall by adjusting the probability threshold, but as expected, precision deteriorated even further.
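
The trade-off described here can be reproduced on toy data. The sketch below is not the actual model output: the labels, scores, and the helper `precision_recall` are hypothetical, and it only shows how lowering the decision threshold on predicted churn probabilities tends to raise recall while lowering precision.

```python
import numpy as np

def precision_recall(y_true, y_prob, threshold):
    """Precision and recall when predicting churn for y_prob >= threshold."""
    y_pred = y_prob >= threshold
    tp = np.sum(y_pred & (y_true == 1))   # churners correctly flagged
    fp = np.sum(y_pred & (y_true == 0))   # non-churners wrongly flagged
    fn = np.sum(~y_pred & (y_true == 1))  # churners missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return float(precision), float(recall)

# Made-up labels (1 = churn) and predicted probabilities, for illustration only.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_prob = np.array([0.9, 0.6, 0.4, 0.2, 0.7, 0.45, 0.3, 0.25, 0.1, 0.05])

for t in (0.5, 0.35, 0.15):
    p, r = precision_recall(y_true, y_prob, t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, recall rises from 0.50 to 1.00 as the threshold drops from 0.5 to 0.15, while precision falls from about 0.67 to 0.50.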

I would like to know whether anyone has used an LSTM for binary classification, particularly for churn analysis in industries such as telecom. I am also curious how the results compare with a traditional random forest model, which is known for its effectiveness in binary classification.

My dataset consists of 10 million records with 20 features per month, and I am using data from the past 6 months.

Daniel_Weikert · December 20, 2023, 4:28pm

Have you tried a CNN as well?
br
