RevoU Python Assignment

Project Summary

  1. Imported and joined dataset from the source files.
  2. Cleaned the data by removing null values, outliers, and irrelevant variables in preparation for machine learning.
  3. Conducted exploratory data analysis on the dataset and conveyed the important findings.
  4. Created user segmentation using k-means clustering.

Insights

  1. The highest-value customer group is made up of both promo-sensitive and non-promo-sensitive customers.
  2. High-value customers have the highest transaction amount in the past 6 months.
  3. Transaction count is dominated by the “High Activity” group.
  4. The lowest-value customers are highly sensitive to promotions.
  5. The lowest-value customers have the highest amount of transactions; based on the previous graphs, this could be an effect of the promotion program.

Project Files

For a more comprehensive analysis and visualization, please open the project files.

Project Background


Python is one of the most challenging yet exciting data programming languages. In this assignment we practiced Python skills such as data cleaning and exploratory data analysis. We also practiced more advanced skills such as user segmentation using cluster analysis. We used Google Colab as the Python notebook tool.

Data Scope, Goals & Objectives

In this assignment we used data from Kaggle. The dataset contains several kinds of information, such as profile information, average transactions, and promo transactions.

Goals

  • Identify different customer segments. By doing so, we can gain insights into the diverse needs, preferences, and behaviors of the customer base. This knowledge can help us develop a more effective and targeted rewards program, leading to increased customer satisfaction, loyalty, and overall business growth.
  • Evaluate the effectiveness and success of the promotion programs. By conducting a comprehensive analysis of the promotion program’s performance, we can identify areas for improvement and make informed decisions for future promotional initiatives.
Objectives

    1. Conduct data cleaning using various methods so that the results are more accurate.
    2. Perform exploratory data analysis of the dataset to find the problems within REVOU BANK.
    3. Create user segmentation using cluster analysis to support targeted marketing.

    Data Analysis

    Note: only the important steps are shown, to keep the explanation simple.

    Data Preparation & Cleaning

    Data Prep


    Prepared the Python environment by loading the necessary libraries.

    Import Dataset


    Imported the dataset from Google Sheets, read as CSV.

    Handling Data


    Removed irrelevant features from the dataset.


    Removed duplicate values from the dataset.
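The two handling steps above can be sketched with pandas roughly as follows. This is only a minimal illustration: the sample rows and the column names (`account_id`, `avg_transaction_amount`, `internal_note`) are hypothetical stand-ins, not the actual schema of the dataset.

```python
import pandas as pd

# Hypothetical sample; column names are illustrative, not the real schema.
df = pd.DataFrame({
    "account_id": [101, 102, 102, 103],
    "avg_transaction_amount": [250.0, 90.5, 90.5, 410.0],
    "internal_note": ["a", "b", "b", "c"],  # stand-in for an irrelevant feature
})

# Drop features that are not needed for the analysis.
df = df.drop(columns=["internal_note"])

# Drop fully duplicated rows and rebuild a clean 0..n-1 index.
df = df.drop_duplicates().reset_index(drop=True)

print(df.shape)
```

Dropping the irrelevant columns first means that rows differing only in those columns are also treated as duplicates, which may or may not be the desired behavior for a given dataset.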

    Feature Format


    Changed the feature formats: date features to datetime, account id to str, and homeowner status to int. This is necessary for further analysis.
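A type conversion of this kind might look like the sketch below. The column names (`registration_date`, `account_id`, `homeowner_status`) and the sample values are assumptions made for illustration; the real column names may differ.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative only.
df = pd.DataFrame({
    "account_id": [101, 102, 103],
    "registration_date": ["2021-01-15", "2020-07-30", "2022-12-01"],
    "homeowner_status": [1.0, 0.0, 1.0],
})

# Parse date strings into datetime64 values so date arithmetic works.
df["registration_date"] = pd.to_datetime(df["registration_date"])

# Treat the account id as a label, not a number.
df["account_id"] = df["account_id"].astype(str)

# Store the homeowner flag as an integer (0 or 1).
df["homeowner_status"] = df["homeowner_status"].astype(int)

print(df.dtypes)
```

Note that `astype(int)` raises an error if the column still contains null values, so a conversion like this has to come after null handling.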

    Exploratory Data Analysis

    Evaluating descriptive statistics


    Summarized the numerical features using the describe function in Python.
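For reference, a small sketch of what `describe` returns on a pandas DataFrame. The sample data and column names are hypothetical; the point is only that non-numeric columns are skipped by default when numeric columns are present.

```python
import pandas as pd

# Hypothetical sample with two numeric columns and one text column.
df = pd.DataFrame({
    "avg_transaction_amount": [100.0, 200.0, 300.0, 400.0],
    "transaction_count": [1, 2, 3, 4],
    "segment": ["a", "b", "a", "b"],
})

# describe() reports count, mean, std, min, quartiles and max
# for the numeric columns only.
summary = df.describe()
print(summary)
```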

    Customer Demographic



    Promo-Sensitive by MAPP_Active_Group



    The highest-value customer group is made up of both promo-sensitive and non-promo-sensitive customers.

    Transaction Amount Customer by MAPP_Active_Group



    High-value customers have the highest transaction amount in the past 6 months.

    User Segmentation

    Preparing the data for cluster analysis


    How many clusters? We used the Elbow Method and Silhouette Analysis.



    The elbow (turning point) falls between 3 and 4 clusters, so choosing between them requires further examination using the silhouette method.



    We chose 3 clusters because, although a 2-cluster solution has a high silhouette score, it doesn't provide sufficient insight for segmentation analysis.
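The comparison of cluster counts described above can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the notebook's actual code: the data is synthetic (generated with `make_blobs`), the features are standardized first (an assumed preparation step), and the range of k values is arbitrary.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared customer features (assumption).
X_raw, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=42)

# Standardize so no single feature dominates the distance computation.
X = StandardScaler().fit_transform(X_raw)

inertias = {}     # within-cluster sum of squares, used for the elbow plot
silhouettes = {}  # mean silhouette score, higher is better
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    labels = model.fit_predict(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, labels)

for k in inertias:
    print(k, round(inertias[k], 1), round(silhouettes[k], 3))
```

Plotting `inertias` against k gives the elbow curve, while `silhouettes` gives a second opinion when the elbow is ambiguous. The silhouette score alone does not settle the choice, which is why a lower-scoring k can still be preferred when it yields more useful segments.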


    Creating cluster using K-Means


    K-Means clustering was used because the data have more numerical features than categorical features.

    1. The data is distributed quite well across the clusters (no cluster has a small count).
    2. Cluster 0 has the highest average transaction frequency and the highest revenue generated.
    3. Cluster 1 has the highest average sales.
    4. Cluster 2 contains the most promo-sensitive clients.
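A minimal sketch of how a 3-cluster K-Means fit and the per-cluster profile behind observations like these could be produced. The feature names (`avg_transaction_freq`, `avg_sales`, `promo_ratio`) and the synthetic data are hypothetical, and the scaling step is an assumption, so the cluster numbering and values will not match the results above.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names; synthetic data stands in for the real dataset.
features = ["avg_transaction_freq", "avg_sales", "promo_ratio"]
X_raw, _ = make_blobs(n_samples=300, centers=3, n_features=3, random_state=0)
df = pd.DataFrame(X_raw, columns=features)

# Scale the features, then assign each row to one of 3 clusters.
X = StandardScaler().fit_transform(df[features])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
df["cluster"] = kmeans.fit_predict(X)

# Cluster sizes: check that no cluster is very small.
print(df["cluster"].value_counts().sort_index())

# Per-cluster feature means, on the original (unscaled) values.
profile = df.groupby("cluster")[features].mean()
print(profile)
```

Computing the profile on the unscaled columns keeps the means in business units, which makes statements such as "highest average sales" easier to read off.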

    Recommendation

    1. For Cluster 0: offer investment and wealth management services (Deposito).
    2. For Cluster 1: offer higher credit limits.
    3. For Cluster 2: offer cashback and reward programs.

