Example Workflow for ETL Basics Operations

This workflow performs a number of ETL operations on the sales2008-2011.csv data set. Besides showing what ETL features are, its goal is to move from a series of contracts with different customers in different countries to a one-row summary for each customer. The one-row summary includes: the customer's unique ID; the total amount of money paid by the customer to the company; the countries the customer has been active in; the date of the first contract (always useful for estimating customer loyalty); and the number of days between the first and the last purchase, that is, the number of days the customer has been with the company. At the end, each one-row customer summary is joined with each contract data row from the original file, and the resulting table is written to a CSV file in a "data" folder located in the workflow folder.
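
For readers who think more easily in code, the aggregation and join described above can be sketched in Python with pandas. This is only an illustration of the logic, not the workflow itself: the column names (`customer_id`, `country`, `amount`, `date`), the tiny in-memory sample, and the output file name are all assumptions, since the actual layout of sales2008-2011.csv is not described here.

```python
import pandas as pd


def summarize_customers(contracts: pd.DataFrame) -> pd.DataFrame:
    """Collapse contract rows into one summary row per customer.

    Column names are hypothetical; adapt them to the real file.
    """
    summary = contracts.groupby("customer_id").agg(
        total_amount=("amount", "sum"),
        # Distinct countries, sorted and joined into one string
        countries=("country", lambda s: ", ".join(sorted(s.unique()))),
        first_contract=("date", "min"),
        last_contract=("date", "max"),
    )
    # Days between the first and the last purchase
    summary["days_active"] = (
        summary["last_contract"] - summary["first_contract"]
    ).dt.days
    return summary.drop(columns="last_contract").reset_index()


def join_summary(contracts: pd.DataFrame, summary: pd.DataFrame) -> pd.DataFrame:
    """Attach the customer summary to every original contract row."""
    return contracts.merge(summary, on="customer_id", how="left")


# Made-up sample rows standing in for the real data set
contracts = pd.DataFrame(
    {
        "customer_id": ["C1", "C1", "C2"],
        "country": ["Germany", "Italy", "Germany"],
        "amount": [100.0, 50.0, 75.0],
        "date": pd.to_datetime(["2008-01-10", "2008-01-20", "2009-05-01"]),
    }
)

summary = summarize_customers(contracts)
enriched = join_summary(contracts, summary)

# Writing the result would look like this (hypothetical file name):
# enriched.to_csv("data/customer_summary.csv", index=False)
```

A left join keeps every contract row and repeats the customer summary columns on each of them, which matches the final table described above.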


This is a companion discussion topic for the original entry at https://kni.me/w/HP5c_SGAFyNei-ZC