Retaining Customers with Data

Carol Zhang written December 2017

Motivation

Retention: metric measuring the ability of a business to retain, or keep customers. It reflects both customer loyalty and the quality of the product.

It is a key metric that directly affects revenue. The good news is, retention is easy to adjust if it is measured the right way by harnessing existing data. Naturally, when my workplace decided to improve retention, I jumped on the opportunity to use data to help the company grow.

Abstract

To decide as a team where to focus retention efforts, I’ve built an in-house retention app that combines charts and SQL queries into a presentable, easy-to-use web page accessible to everyone. The retention dashboard allows users to organize and filter customer cohorts, calculate retention statistics associated with those groups, and export their contact information for outreach efforts. Armed with actionable statistics, we started running marketing campaigns focused on retention and have seen an improvement in both profits and customer happiness.

Methods

Data warehousing is key to building sustainable infrastructure generating custom retention insights. Existing data from 10+ tables are integrated into three categories:

Customer demographics - data provided by the customers upon sign-up including age, gender that they most identify with, and length of membership.
Purchase information - information generated from each transaction. We can add more granularity to creating cohorts. This helps us track the history of each customer and lays the foundation for further marketing metrics (see Future Direction).
Product details - from all the stuff purchased. Pretty straightforward.

These data-funnelling and warehousing efforts are for both sanity and performance reasons. Once populated, the warehoused data is updated by:

First checking the last inserted ID. For example, for the user module, I pulled the last user ID.
Since the user tables used in production contain the same key (user ID), I can use the most recent user ID from to delimit the starting point of rows to pull. I fetch only the rows that are more recent than the warehoused data and insert them.

This avoids overwriting existing data that have been cleaned and pre-processed and reduces volatility. The entire process is done via an automated script that runs monthly.

Results & Future Direction

The first iteration has been very well received by everyone in the company. Various marketing campaigns stemming from the data generated have been successful in boosting profits and customer satisfaction. Who would've thought people like it when a company cares about them?

Because customer cohorts can be easily defined in the retention app, the next phase is adding RFM calculations to the the custom groups defined to paint a clearer picture of each person.

Thoughts? Comments? Job opportunities? Get in touch!