3.1 Customer Data for a Clothing Company

Our first data set represents customers of a clothing company that sells products in physical stores and online. The data is typical of what one might get from a company’s marketing database (a real database will hold more fields than the ones shown here). The data set covers 1000 customers and contains the following variables:

  1. Demography
    • age: age of the respondent
    • gender: male/female
    • income: income of the customer
    • house: whether the customer owns a house (No/Yes)
  2. Sales in the past year
    • store_exp: expense in store
    • online_exp: expense online
    • store_trans: number of store purchases
    • online_trans: number of online purchases
  3. Survey on product preference

It is common for companies to survey their customers and draw insights to guide future marketing activities. The survey asks respondents to rate ten statements (Q1 to Q10) on a five-point scale:

How strongly do you agree or disagree with the following statements:

  1. Strongly disagree
  2. Disagree
  3. Neither agree nor disagree
  4. Agree
  5. Strongly agree
  • Q1. I like to buy clothes from different brands
  • Q2. I buy almost all my clothes from some of my favorite brands
  • Q3. I like to buy premium brands
  • Q4. Quality is the most important factor in my purchasing decision
  • Q5. Style is the most important factor in my purchasing decision
  • Q6. I prefer to buy clothes in store
  • Q7. I prefer to buy clothes online
  • Q8. Price is important
  • Q9. I like to try different styles
  • Q10. I like to make decisions myself and don’t need many suggestions from others
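
Because the answers are stored as integers from 1 to 5, it can help to attach the scale labels before tabulating. The following is a minimal sketch in R, not the simulation code itself: the vector `q1` and its five values are hypothetical, and only base R is assumed.

```r
# Labels for the five-point agreement scale, in order from 1 to 5
likert_labels <- c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")

# Hypothetical answers to Q1 from five respondents (not taken from the data)
q1 <- c(4L, 4L, 5L, 2L, 1L)

# Map the integer codes to an ordered factor so labels and ordering stay together
q1_labeled <- factor(q1, levels = 1:5, labels = likert_labels, ordered = TRUE)

# Count responses per category, including categories nobody chose
table(q1_labeled)

# The original integer codes remain recoverable
as.integer(q1_labeled)
```

Keeping the integer codes in the data and converting only for display means numeric summaries still work on the raw columns.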

Each customer belongs to one of four segments:

  1. Price
  2. Conspicuous
  3. Quality
  4. Style

Let’s check the structure of the data:

## 'data.frame':    1000 obs. of  19 variables:
##  $ age         : int  57 63 59 60 51 59 57 57 ...
##  $ gender      : Factor w/ 2 levels "Female","Male": 1 1 2 2 2 2 2 2 ...
##  $ income      : num  120963 122008 114202 113616 ...
##  $ house       : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 ...
##  $ store_exp   : num  529 478 491 348 ...
##  $ online_exp  : num  304 110 279 142 ...
##  $ store_trans : int  2 4 7 10 4 4 5 11 ...
##  $ online_trans: int  2 2 2 2 4 5 3 5 ...
##  $ Q1          : int  4 4 5 5 4 4 4 5 ...
##  $ Q2          : int  2 1 2 2 1 2 1 2 ...
##  $ Q3          : int  1 1 1 1 1 1 1 1 ...
##  $ Q4          : int  2 2 2 3 3 2 2 3 ...
##  $ Q5          : int  1 1 1 1 1 1 1 1 ...
##  $ Q6          : int  4 4 4 4 4 4 4 4 ...
##  $ Q7          : int  1 1 1 1 1 1 1 1 ...
##  $ Q8          : int  4 4 4 4 4 4 4 4 ...
##  $ Q9          : int  2 1 1 2 2 1 1 2 ...
##  $ Q10         : int  4 4 4 4 4 4 4 4 ...
##  $ segment     : Factor w/ 4 levels "Conspicuous",..: 2 2 2 2 2 2 2 2 ...
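
A listing like the one above is what R’s `str()` prints for a data frame. As a minimal sketch, the example below builds a small stand-in and inspects it. The object name `toy_dat`, the choice of four rows, and the subset of columns are assumptions for illustration; the values are borrowed from the first entries shown in the listing, and the segment label assumes R’s default alphabetical ordering of factor levels.

```r
# Toy stand-in for the customer data: four rows and a subset of the columns
toy_dat <- data.frame(
  age        = c(57L, 63L, 59L, 60L),
  gender     = factor(c("Female", "Female", "Male", "Male")),
  income     = c(120963, 122008, 114202, 113616),
  house      = factor(c("Yes", "Yes", "Yes", "Yes"), levels = c("No", "Yes")),
  store_exp  = c(529, 478, 491, 348),
  online_exp = c(304, 110, 279, 142),
  Q1         = c(4L, 4L, 5L, 5L),
  # Assumed label: level 2 under alphabetical ordering of the four segments
  segment    = factor(rep("Price", 4),
                      levels = c("Conspicuous", "Price", "Quality", "Style"))
)

# Compact structure listing; vec.len limits how many values are shown per column
str(toy_dat, vec.len = 3)

# Number of customers per segment
table(toy_dat$segment)

# Mean in-store spend by segment (NA for segments with no rows in the toy data)
tapply(toy_dat$store_exp, toy_dat$segment, mean)
```

On the full data set the same calls would report all 1000 observations and 19 variables.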

Refer to the Appendix for the simulation code.