Customer segmentation is crucial for targeted marketing: it identifies the right customers to approach at the right time. Customer profiling considers demographics, location, lifestyle, values and needs. This article uses transactional data from a retail business to segment customers with machine learning (ML) algorithms, with the aim of improving a company’s ability to target and serve customers effectively. The analysis is conducted on a real-world dataset from a convenience store, covering 3,000 customers and their 195,547 transactions over six months. Each transaction includes detailed information such as purchase date and time, quantity, value, and the number of categories in the basket.
Methodology
K-means and hierarchical clustering are deployed for this study. Behavioural attributes are captured through the transactional RFM (Recency, Frequency, Monetary) model. Because these are unsupervised methods, the data is processed thoroughly beforehand, including correlation analysis, outlier removal, and standardization/scaling. Principal Component Analysis (PCA) reduces complexity and helps in identifying segments. K-means clustering is then used for segmentation on the dimensionally reduced data. Although the optimal number of clusters is three (identified through the silhouette score), six segments are created to meet the granularity requirement while maintaining a reasonable silhouette score.
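The preprocessing steps named above (correlation check, standardization, PCA) can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the study's actual pipeline; the random RFM-like table and the choice of two components are assumptions made only for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for an RFM table (assumed, not the study's data):
# columns are recency in days, purchase frequency, and monetary value.
rfm = np.column_stack([
    rng.integers(1, 180, size=200),
    rng.poisson(20, size=200),
    rng.gamma(2.0, 150.0, size=200),
]).astype(float)

# Standardize so each feature has zero mean and unit variance;
# K-means is distance based, so unscaled monetary values would dominate.
scaled = StandardScaler().fit_transform(rfm)

# Correlation matrix of the three features (3 x 3).
corr = np.corrcoef(scaled, rowvar=False)

# Project onto two principal components (the number 2 is an assumption).
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)

print(corr.round(2))
print(pca.explained_variance_ratio_)
```

The explained variance ratio gives a rough sense of how much information is kept after the reduction.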
Data Preparation and Cleaning
Although the dataset had no missing values, it included negative values for basket quantity and spending; these records were removed before analysis. Correlation checks and standardization ensured the data was suitable for clustering.
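A cleaning step of this kind might look like the sketch below, written with pandas. The column names `basket_quantity` and `basket_spend` and the sample rows are hypothetical; the real dataset's schema is not given in this article.

```python
import pandas as pd


def drop_negative_rows(transactions: pd.DataFrame,
                       columns=("basket_quantity", "basket_spend")) -> pd.DataFrame:
    """Return a copy without rows that have a negative value in any of `columns`."""
    keep = (transactions[list(columns)] >= 0).all(axis=1)
    return transactions.loc[keep].reset_index(drop=True)


# Tiny illustrative frame; the values are made up, not taken from the study.
sample = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "basket_quantity": [3, -1, 5, 2],
    "basket_spend": [12.5, -4.0, 30.0, -8.0],
})

missing_values = int(sample.isna().sum().sum())  # check for missing values first
cleaned = drop_negative_rows(sample)

print(missing_values)
print(cleaned)
```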
RFM Model Implementation
The RFM model was built by calculating:
- Recency: Time since the last purchase.
- Frequency: Number of purchases.
- Monetary Value: Total spending over six months.
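The three measures above can be computed per customer with a single group-by. The sketch below assumes a transaction table with hypothetical columns `customer_id`, `purchase_date` and `basket_spend`, and a `snapshot_date` chosen as the reference point for recency; none of these names come from the original dataset.

```python
import pandas as pd


def build_rfm(transactions: pd.DataFrame, snapshot_date: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transactions into one recency/frequency/monetary row per customer."""
    rfm = transactions.groupby("customer_id").agg(
        last_purchase=("purchase_date", "max"),
        frequency=("purchase_date", "count"),   # number of purchases
        monetary=("basket_spend", "sum"),       # total spending in the period
    )
    # Recency: whole days between the snapshot date and the last purchase.
    rfm["recency"] = (snapshot_date - rfm["last_purchase"]).dt.days
    return rfm[["recency", "frequency", "monetary"]]


# Made-up transactions for illustration only.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "purchase_date": pd.to_datetime(["2024-01-05", "2024-03-20", "2024-02-10"]),
    "basket_spend": [10.0, 15.0, 40.0],
})

rfm = build_rfm(transactions, snapshot_date=pd.Timestamp("2024-04-01"))
print(rfm)
```

In this toy example, customer 1 has a recency of 12 days, a frequency of 2 and a monetary value of 25.0, while customer 2 has a recency of 51 days, a frequency of 1 and a monetary value of 40.0.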
Clustering and Evaluation
K-means clustering is chosen for its robustness and efficiency, with the elbow method determining the initial three-cluster solution. However, six segments are finalized to meet the granularity that marketing requires. The silhouette score indicates acceptable clustering quality.
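The evaluation described here can be sketched with scikit-learn: fit K-means for a range of cluster counts, record the inertia (for the elbow plot) and the silhouette score, then fit the final model with the number of segments the business needs. The data below is synthetic (`make_blobs`), and the range of 2 to 8 clusters is an assumption for illustration; only the final choice of six segments comes from the study.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic 3-feature data standing in for scaled RFM values (assumed).
X, _ = make_blobs(n_samples=300, centers=3, n_features=3, random_state=42)

results = {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    results[k] = {
        "inertia": model.inertia_,                        # used for the elbow method
        "silhouette": silhouette_score(X, model.labels_), # ranges from -1 to 1
    }

# Cluster count with the highest silhouette score on this synthetic data.
best_k = max(results, key=lambda k: results[k]["silhouette"])

# Final model with six segments, as chosen for marketing granularity.
final = KMeans(n_clusters=6, n_init=10, random_state=42).fit(X)
final_silhouette = silhouette_score(X, final.labels_)

for k, scores in results.items():
    print(k, round(scores["inertia"], 1), round(scores["silhouette"], 3))
print("best k by silhouette:", best_k)
print("silhouette at k=6:", round(final_silhouette, 3))
```

Comparing the silhouette at six clusters with the best score shows how much cluster quality is traded for finer segments.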
Key Findings
Six kinds of customers are clearly differentiated by this study: champions, loyal customers, potential loyalists, promising customers, those at risk of churning, and those needing attention.
- Champions: These are the best customers, with recent and frequent purchases of high-value items. They should be rewarded to maintain their loyalty and promote new products.
- Loyal Customers: Representing 24% of the customer base, they have high purchase frequency and significant monetary value. Engaging them through feedback and incentives can enhance their loyalty.
- Potential Loyalists: They form the largest segment, buying frequently in large quantities, but with slightly lower recency. Special incentives like loyalty cards can convert them into champions.
- Promising Customers: Although their purchase frequency is lower, they spend a considerable amount when they do shop. Personalized marketing can increase their visit frequency.
- At Risk of Churn: These customers have not shopped recently and have low purchase frequency and monetary value. Reminders and personalized offers can re-engage them.
- Needs Attention: This group has a moderate recency and purchase frequency, indicating they could become regular customers with the right incentives.
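Segment descriptions like these are usually derived by profiling each cluster: averaging recency, frequency and monetary value per cluster and computing each cluster's share of the customer base. The sketch below shows one way to do that with pandas. The cluster labels and values are invented for illustration, and mapping a cluster to a name such as "Champions" remains a judgment made by the analyst.

```python
import pandas as pd

# Made-up customers with an assumed `cluster` label from a clustering step.
customers = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 1, 2, 2, 2],
    "recency": [3, 5, 40, 60, 50, 20, 25, 15],
    "frequency": [90, 110, 10, 6, 8, 40, 50, 30],
    "monetary": [900.0, 1100.0, 80.0, 40.0, 60.0, 400.0, 500.0, 300.0],
})

# Mean RFM values per cluster.
profile = customers.groupby("cluster")[["recency", "frequency", "monetary"]].mean()

# Share of the customer base in each cluster, as a percentage.
profile["share_pct"] = (
    customers["cluster"].value_counts(normalize=True).sort_index() * 100
)

print(profile.round(1))
```

In this toy table, cluster 0 (low recency, high frequency and spend) would be a candidate for a "champions" label, while cluster 1 (high recency, low frequency and spend) resembles an at-risk group.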
Recommendations
This study suggests group-based marketing for each segment:
- Champions: Rewarding and engaging to maintain loyalty and promote new products.
- Loyal Customers: Targeting high value upsell products and engaging through surveys and feedback.
- Potential Loyalists: Offering membership deals and discounts to convert them into loyalists.
- Promising Customers: Providing personalized offers and incentives to increase purchase frequency.
- At Risk of Churn: Sending reminders and improving customer experiences to prevent churn.
- Needs Attention: Offering limited-time promotions based on previous purchases to encourage repeat visits.
Conclusion
Customer segmentation uses vast amounts of data to identify patterns and behaviours. It helps target customers based on their needs, preferences, and purchasing habits. Knowing customer behaviour helps in tailoring marketing and product offerings that produce tangible results. Targeted marketing delivers the right message to the right audience, improving efficiency and reducing costs. It also supports retention by identifying likely churners and deploying retention strategies. Granular segments can help businesses gather targeted feedback for product development and enhancements. ML algorithms have the potential to unearth emerging customer groups and new market opportunities. By addressing the unique needs of each segment, retail and similar stores can improve sales, customer loyalty, and overall market presence.
About Author:
Nadeem Ahmed
Machine Learning and Big Data Expert based in London. He has been mentoring students across Pakistan.