Churn / Customer Lifetime Value Analytics


I have a data set with the following columns:
Invoice Amount
Invoice Number
Invoice Date

The dataset might look something like this:
CustomerID Invoice Amount Invoice Number Invoice Date
100100 $200 INV-001 Feb 1 2020
100100 $180 INV-002 Mar 2 2020
100200 $1,000 INV-003 Feb 1 2020
100200 $1,000 INV-004 Mar 1 2020
100200 $1,000 INV-005 Apr 1 2020

It’s a historical data set with many thousand rows and potentially dozens of invoices for each customer.

I’m trying to understand:

  • Total amount by customer
  • Average amount by customer
  • Days between first and last (important - not currently caring about time between 1st and 2nd invoice, but the first and last invoice by customer.

I’m using the following workflow

But I’m stuck in a continuous loop at the Row Filter stage.

Any help would be appreciated. Perhaps the template is too complex for what I’m after?

Hi @wisemanleo , this can be done via the GroupBy node and the Date&Time Difference.

I put something together that looks like this:

Input (Same as yours):


Here’s the workflow: Customer Lifetime Value Analytics.knwf (15.2 KB)


Thank you! This seems to work for the most part. My upper bound on the total amount, however, seems to be only around $4K, when I have individual invoices much higher than this. Going back to the String Manipulation node, I think what’s happening is that the numbers that are already in the Long format are getting truncated out (I changed from toInt to toLon).

I will keep diagnosing, but wanted to let you know in case you might know why this is happening.

Hi @wisemanleo , if you can share some sample data in a file, same file format that you are reading from, we can look into it.

CLVRaw20220211 R3-scramble.xls (45 KB)

Here’s a sample dataset. Generating random numbers inherently generates mostly unique/distinct numbers, but I went through and replicated some of the random numbers such that we can have more than 1 invoice per customer.

Hi @wisemanleo , thanks for sharing the file.

So it looks like the dates are already in date format, and the amount is already in integer, so you don’t need to do any manipulation for these 2 columns. Removing these 2 nodes, the new workflow using your Excel file looks like this:

And this is the results that I get:

There are 349 Unique Internal ID. A lot of them have only 1 Invoice entry, so the number of days between first and last invoice is 0 for these cases.

Here’s the workflow: Customer Lifetime Value Analytics.knwf (26.8 KB)

1 Like

Great, this seems to work! Thank you very much.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.