Predicting Customer Lifetime Value - I have the approach but not the KNIME node experience

#1

Hi Community,

I’m fairly new to KNIME but have quite extensive experience with predictive analysis. I have a use case and data for a Customer Lifetime Value prediction model but not the KNIME node experience to put this together. I am planning on using SVM in regression mode to do this but please let me know if there are other suitable methods instead. The output of the model should be the predicted spend of each customer from the point of modelling until they stop purchasing.

My desired input data to the model would look like this:


Customer ID - The unique ID of each customer

Recency Score - My data set spans over 12 months and I would like each customer to be given a score from 1 to 12 based on the month of their latest purchase

Frequency - The number of orders placed by each customer over the last twelve months

Spend Q1 - The total numeric sum of spend per customer in the first quarter of the past year

Spend Q2 - The total numeric sum of spend per customer in the second quarter of the past year

Spend Q3 - The total numeric sum of spend per customer in the third quarter of the past year


Data prep questions:

  1. I have used the GroupBy node to get the latest order date for each customer but it returns this in a format of for example “2018-04-02T10:32”. How can I transform this to the number of the month (in this case 4)?

  2. I have transactional spend on an order level for each customer, for example:
    CustomerID OrderNo Date Spend
    123 345 2018-01-12 £54.65
    123 478 2018-04-24 £32.21
    123 678 2018-11-15 £75.32

What is the best way to calculate the spend per quarter for each customer and insert these sums into the above-mentioned “Spend Q” columns?
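To make the two data prep steps concrete, here is a minimal sketch of the logic in plain Python, outside KNIME. It only illustrates the calculation and is not a KNIME workflow. The function names `month_number` and `spend_per_quarter` are made up for this example, and it assumes calendar quarters (Jan-Mar = Q1 and so on), which may not match the quarters of a rolling 12-month window.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the post: (CustomerID, OrderNo, Date, Spend)
orders = [
    ("123", "345", "2018-01-12", "£54.65"),
    ("123", "478", "2018-04-24", "£32.21"),
    ("123", "678", "2018-11-15", "£75.32"),
]


def month_number(timestamp: str) -> int:
    """Return the month (1-12) of an ISO timestamp such as '2018-04-02T10:32'."""
    return datetime.fromisoformat(timestamp).month


def spend_per_quarter(rows):
    """Sum spend per customer and calendar quarter (assumption: Q1 = Jan-Mar)."""
    totals = defaultdict(lambda: {"Q1": 0.0, "Q2": 0.0, "Q3": 0.0, "Q4": 0.0})
    for customer_id, _order_no, date, spend in rows:
        quarter = (month_number(date) - 1) // 3 + 1
        # Strip the currency symbol before converting the amount to a number.
        totals[customer_id][f"Q{quarter}"] += float(spend.lstrip("£"))
    # Round to pence so the sums are not cluttered by floating-point noise.
    return {
        customer: {q: round(v, 2) for q, v in quarters.items()}
        for customer, quarters in totals.items()
    }


if __name__ == "__main__":
    print(month_number("2018-04-02T10:32"))
    print(spend_per_quarter(orders))
```

The result has one entry per customer with one value per quarter, which is the shape of the “Spend Q” columns described above.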

Modelling questions:

  1. I couldn’t find any specific SVM regression nodes but found the more generic “SVM Learner” and “SVM Predictor” nodes. Will these work for the purpose of my modelling?

  2. Do I still need to use a “Partitioning” node before the SVM and how should this be configured?

  3. What is the best way of evaluating the results from a Correlation Coefficient and Root Relative Squared Error perspective?
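For reference, the partitioning and the two evaluation measures can be sketched in plain Python as below. This is only an illustration under stated assumptions: the 80/20 split, the fixed seed and the function names are choices made for this example, and it says nothing about how any particular KNIME node computes these values. Root Relative Squared Error is taken here in its usual sense, the model's squared error relative to that of always predicting the mean of the actual values, so a value below 1 means the model beats that baseline.

```python
import math
import random


def partition(rows, train_fraction=0.8, seed=42):
    """Shuffle the rows and split them into a training and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def correlation_coefficient(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def root_relative_squared_error(actual, predicted):
    """Squared error of the model relative to predicting the mean of actual."""
    mean_a = sum(actual) / len(actual)
    model_error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    baseline_error = sum((a - mean_a) ** 2 for a in actual)
    return math.sqrt(model_error / baseline_error)


if __name__ == "__main__":
    train, test = partition(range(10))
    print(len(train), len(test))
    print(root_relative_squared_error([1, 2, 3, 4], [1, 2, 3, 4]))
```

Both measures should be computed on the test partition only, so that they reflect how the model performs on customers it has not seen during training.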

Input data set - I’m using this sample data to set my model:
https://archive.ics.uci.edu/ml/datasets/online+retail

I really appreciate your help with this and thanks in advance.

#2

Hi @cason,

is this question still relevant? Have you figured out a way to implement a solution to your problem in KNIME? If not, which questions are still open?

Best,
Mischa.

#3

I am planning a similar analysis, so it would be good to see some examples of how the KNIME nodes should be arranged.

#4

Hi,

I would suggest that you search for examples on hub.knime.com. An example of full DS sequence is


Alternatively, you can search for examples of how to extract date&time fields, etc.
Once you have a specific use case, you can always find help with specific questions on this forum.

Best regards,
Mischa.
