Updated 2024-01-06
Customer Intelligence with R
Customer Activation, Development, Retention, and Segmentation (CADRS)
Introduction
Purpose
‘Customer Intelligence with R’ (CI with R) teaches the basic application of customer activation, development, retention, and segmentation (CADRS) with R. It is intended to be educational outside academia. The topics broadly fall under
- Business intelligence
- Customer intelligence
- Business analytics
- Customer value maximisation
- Marketing mix modelling
- Market research
- Database marketing
for the purpose of commercial success and optimisation using customer transaction data.
Starting with CADRS insights and labelling, the learning format is generally broken down into a few parts:
- Coding demonstration
- Output preview of the code
- Links (usually Wikipedia) for the curious
- Quizzes that challenge you to expand on the basics
On the R side, we will mainly focus on the tidytable and tidymodels libraries, using an example open-source dataset available online.
What is customer transaction data?
To put it simply, when you go shopping and you get your receipt, that is customer transaction data.
This book uses such data from the perspective of the vendor, where all receipts are recorded for each shopping member. In its most basic form, such data has (customer or membership) ID, date, and product ID columns. Other columns may include price per unit, quantity purchased (commonly negative if refunded), quantity unit (e.g. litres, units, metre_cubed), tax, material costs, bundle ID (e.g. for crates of bottles or promotion bundles), region, etc.
Here is an example preview of such data:
customer_id | date | product_id | net_euros |
---|---|---|---|
cmg94 | 1994-05-02 | m3vc90 | 12,04 |
gjo532 | 2010-11-27 | 3465u098 | 72,87 |
hfh5 | 2003-06-07 | gvm49 | 4,72 |
where net_euros would be something along the lines of price × quantity.
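As a minimal sketch of the idea above, the snippet below builds a tiny transaction table and derives net_euros as price × quantity. It uses base R only so that it runs without extra packages. The rows and the column names unit_price_euros and quantity are made up for illustration and are not taken from the book's dataset; the refund row assumes the negative-quantity convention mentioned earlier.

```r
# Hypothetical receipts: one row per purchased product (values are made up)
transactions <- data.frame(
  customer_id      = c("c001", "c001", "c002"),
  date             = as.Date(c("2023-03-01", "2023-03-04", "2023-03-01")),
  product_id       = c("p10", "p10", "p27"),
  unit_price_euros = c(2.50, 2.50, 12.00),
  quantity         = c(4, -1, 2)  # a negative quantity marks a refund
)

# net_euros is price x quantity, so refunds come out negative
transactions$net_euros <- transactions$unit_price_euros * transactions$quantity

# Net spend per customer, with refunds netted off
spend_per_customer <- aggregate(net_euros ~ customer_id,
                                data = transactions, FUN = sum)
print(spend_per_customer)
```

Keeping refunds as negative quantities means a plain sum already gives net spend, with no separate refund table to join.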
Who is this book for?
‘CI with R’ demonstrates some basic, applicable, and deliverable CADRS examples of how customer transaction data can be utilised for business value.
As for prerequisites, ‘CI with R’ is for R users with at least a few months of experience who do not need explanations of the tidytable (or dplyr or tidyverse) functions.
For those looking to get to that level, I would recommend mastering the R for Data Science 2nd edition book and watching a bit of how Hadley Wickham codes (though the video is a bit dated now, especially the old pipes).
Further, knowing the basics of tidymodels will certainly help; this library is rather recent, so if you are looking for tidymodels in action, Julia Silge’s YouTube videos and /r/tidymodels are recommended.
In general, ‘CI with R’ will use the libraries tidytable, tidymodels, lubridate, and stringr, with conflicted to override some functions. You can load the libraries with the following code:
suppressPackageStartupMessages(suppressWarnings({
  # core libraries used throughout
  library(tidytable)
  library(tidymodels)
  library(stringr)
  library(lubridate)
  # libraries for specific sections
  library(tidyclust)
  library(plotly)
  # prefer tidytable whenever function names clash
  conflicted::conflict_prefer_all(winner = "tidytable",
                                  quiet = TRUE)
}))
Loading the individual libraries from tidymodels is better for production.
For installing R libraries, pak is recommended over install.packages() (except for installing pak itself).
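One possible way to follow this advice is sketched below: check which of the libraries named in this chapter are missing, bootstrap pak with install.packages() if needed, then let pak install the rest. The pkgs vector is simply assembled from the library() calls above, and the interactive() guard is an assumption of this sketch, added so that sourcing the script in a batch job never triggers an installation.

```r
# Libraries named in this chapter (assembled by hand for this sketch)
pkgs <- c("tidytable", "tidymodels", "stringr", "lubridate",
          "conflicted", "tidyclust", "plotly")

# Which of them are not installed yet?
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Only install when run interactively and something is actually missing
if (interactive() && length(missing_pkgs) > 0) {
  # pak itself is the one package installed with install.packages()
  if (!requireNamespace("pak", quietly = TRUE)) {
    install.packages("pak")
  }
  pak::pkg_install(missing_pkgs)
}
```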
For those who want to learn a few example models, the models you will learn in ‘CI with R’ are:
Acknowledgements
This is essentially a compiled list of resources used. This book would not be around today without the years of resourcefulness from
Licence
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The code in this book is public domain, licensed under Creative Commons CC0 1.0 Universal (CC0 1.0).