Consumer Insight Analysis

Synopsis

This project explores the E-commerce Customer Behaviour Dataset provided by Kaggle. The dataset provides a comprehensive view of customer behaviour within an e-commerce platform. Each entry corresponds to a unique customer and offers a detailed breakdown of their interactions and transactions. The information is designed to support a nuanced analysis of customer preferences, engagement patterns, and satisfaction levels, helping businesses make data-driven decisions to enhance the customer experience.

Note: This dataset was synthetically generated for illustrative purposes, and any resemblance to real individuals or scenarios is coincidental.

The transformed dataset, the raw SPSS output file, and a Results and Discussions Word document written in APA style, which goes into more depth and is aimed at a technical audience, are available on my GitHub.

This website contains the visualisations of the results and a short, easily digestible summary for a non-technical audience.

Age Category

Summary

The majority of consumers (63.7%) were aged 30-39, while 19.7% were aged 20-29 and 16.6% were aged 40-49.

Gender Visualization

Gender

Summary

The gender distribution was even: 50% of consumers were male and 50% were female.

Membership Type Visualization

Membership Type

Summary

The customers were almost evenly split among Bronze (33.1%), Silver (33.4%), and Gold (33.4%).

Satisfaction Level Visualization

Satisfaction Level

Summary

Satisfaction levels were distributed as follows: 36.3% of customers reported being satisfied, 30.6% were neutral, and 33.1% were unsatisfied. Satisfied customers formed the largest group; however, because roughly a third of customers were unsatisfied, this issue needs careful attention.

K-Means Cluster

Summary

The K-Means Clustering visualisation identifies distinct customer segments based on their total spend, age, and the number of items they purchase. Each cluster represents customers with similar purchasing behaviours.

Cluster 1 is characterised by older customers (average age 35), the lowest number of items purchased (10), and the lowest amount spent ($611.5). Cluster 2 is characterised by younger customers (average age 30), the highest number of items purchased (18), and the highest amount spent ($1311.1).
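For readers curious about how K-Means forms such segments, here is a minimal sketch in plain Python. It is not the SPSS procedure used in this project: the customer values are made up for illustration, the starting centroids are chosen by hand, and the features are left unscaled for brevity (in practice they are usually standardised so that total spend does not dominate the distances).

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain K-Means: assign points to the nearest centroid, then recompute centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering has converged
        centroids = new_centroids
    return labels, centroids

# Made-up customers (not the project's data): (total spend in $, age, items purchased).
customers = [
    (600.0, 36, 9), (620.0, 35, 10), (590.0, 37, 11),
    (1300.0, 29, 18), (1320.0, 31, 17), (1290.0, 30, 19),
]
labels, centroids = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)     # cluster index for each customer
print(centroids)  # average spend, age, and items per cluster
```

Each final centroid is the average customer of its cluster, which is how figures such as a cluster's average age or average spend can be read off.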

Gender by Membership Type

Chi-Square Test for Independence (Gender by Membership Type)

Summary

The crosstabulation showed that males were heavily overrepresented in the Silver membership category (66.3% of males) and completely absent from the Bronze category. Conversely, females were heavily overrepresented in the Bronze category (66.3% of females) and almost absent from the Silver category (0.6%).

In the Gold membership category, males and females were represented about equally, at 33.7% and 33.1% respectively.
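As a rough illustration of what a chi-square test for independence measures, the sketch below computes the Pearson chi-square statistic from a table of counts. The counts are made up and are not the project's crosstabulation; the function name is my own.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Made-up counts. Rows: two genders; columns: Bronze, Silver, Gold.
observed = [
    [10, 60, 30],
    [60, 10, 30],
]
statistic, dof = chi_square_statistic(observed)
print(statistic, dof)
```

The larger the statistic relative to its degrees of freedom, the less plausible it is that the two variables are independent. With 2 degrees of freedom, a value above roughly 5.99 is significant at the 5% level.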

Total Spend by Age Category

Analysis of Variance (Total Spend by Age Category)

Summary

There was a statistically significant difference in total spend between the age groups. On average, customers aged 20-29 spent significantly more ($1104.4) than those aged 30-39 ($830.5), while those aged 40-49 spent the least ($597.6).

Thus, we can conclude that younger customers tend to spend more than older customers.
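To show what an analysis of variance compares, here is a small sketch that computes the one-way ANOVA F statistic by hand. The spend values are invented for illustration and the function name is hypothetical; the project's actual figures come from the SPSS output.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA, with between- and within-group degrees of freedom."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up total spend values in $ for three age groups (not the project's data).
spend_by_age = [
    [1100.0, 1120.0, 1080.0],  # aged 20-29
    [830.0, 850.0, 810.0],     # aged 30-39
    [600.0, 620.0, 580.0],     # aged 40-49
]
f, df_between, df_within = one_way_anova_f(spend_by_age)
print(f, df_between, df_within)
```

A large F value means the group averages differ by much more than the spread within each group would explain, which is what a statistically significant result reflects.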

Satisfaction Level by Membership Type

Chi-Square Test for Independence (Satisfaction Level by Membership Type)

Summary

The chart above illustrates the relationship between customer satisfaction levels and membership types. Gold members accounted for most of the satisfied responses (92.1%), suggesting that premium services or perks associated with Gold membership may positively impact customer experience.

Silver members had the second-highest share of satisfied responses at 6.3%, but also a high share of unsatisfied responses at 50%. Finally, Bronze members had the lowest share of satisfied responses at 1.6% and a share of unsatisfied responses similar to that of Silver members. This insight can inform future strategies for enhancing experiences for entry-level members or promoting upgrades to higher membership tiers.

Total Spend by Discount Applied

Independent Samples T-Test (Total Spend by Discount Applied)

Summary

The box and whisker plot above compares total customer spending based on whether a discount was applied. It helps assess whether discount strategies are leading to higher revenue or simply attracting lower-spending customers.

The results showed no significant difference in total spend between customers who received a discount and those who did not. Since total spend is similar across both groups, it may be worth re-evaluating how discounts are used.
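The comparison behind an independent samples t-test can be sketched as follows. This is an illustrative version that assumes equal variances in both groups (Student's pooled t), with made-up spend values; it is not the project's SPSS output, and the function name is my own.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples, assuming equal variances."""
    n_a, n_b = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, n_a + n_b - 2

# Made-up total spend values in $ (not the project's data).
with_discount = [800.0, 850.0, 900.0, 820.0, 880.0]
without_discount = [810.0, 860.0, 890.0, 830.0, 870.0]
t, dof = pooled_t_statistic(with_discount, without_discount)
print(t, dof)
```

A t value close to zero, as in this made-up example, indicates that the difference between the two group averages is small compared with the variation inside the groups.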

Correlation: Days Since Last Purchase vs Membership Type

Spearman's Rank Correlation Between Days Since Last Purchase and Membership Type

Summary

This visualisation explores the relationship between the number of days since a customer's last purchase and their membership type (Bronze, Silver, or Gold). The correlation helps to identify which member segments are more engaged or more likely to churn.

Gold members consistently showed fewer days since their last purchase, while Bronze and Silver members were more likely to have long gaps since their last purchase. Gold members, having purchased most recently, represent a highly active segment worth prioritising in loyalty efforts. The other membership tiers should also receive attention to address their longer gaps since last purchase.

Note: Correlation does not equal causation, and other factors could be influencing recency. One possible explanation is that Gold members, being the highest tier, have special perks that encourage frequent purchases, while the Bronze and Silver tiers may have fewer perks or none.
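For readers who want to see how a Spearman rank correlation is obtained, the sketch below ranks both variables (giving tied values their average rank) and then correlates the ranks. The data are made up, and the ordinal coding Bronze=1, Silver=2, Gold=3 is an assumption for illustration rather than the coding used in the SPSS file.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Made-up data (not the project's). Assumed coding: Bronze=1, Silver=2, Gold=3.
membership = [1, 1, 2, 2, 3, 3]
days_since_last_purchase = [40, 35, 30, 28, 12, 15]
rho = spearman_rho(membership, days_since_last_purchase)
print(rho)
```

A negative rho in this coding means that higher tiers go together with fewer days since the last purchase.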