Ecommerce Purchases Exercise

In this exercise you will be given some fake data about purchases made through Amazon. Follow the directions and try your best to answer the questions and complete the tasks. Feel free to reference the solutions. Most of the tasks can be solved in more than one way, and for the most part the questions get progressively harder.

Please excuse anything that doesn't make "real-world" sense in the DataFrame; all the data is fake and made up.

Also note that all of these questions can be answered with one line of code.

Import pandas and read in the Ecommerce Purchases csv file and set it to a DataFrame called ecom.

In [1]:
import pandas as pd
ecom = pd.read_csv('Ecommerce Purchases')

Check the head of the DataFrame.

In [2]:
ecom.head()

  | Address | Lot | AM or PM | Browser Info | Company | Credit Card | CC Exp Date | CC Security Code | CC Provider | Email | Job | IP Address | Language | Purchase Price
0 | 16629 Pace Camp Apt. 448\nAlexisborough, NE 77... | 46 in | PM | Opera/9.56.(X11; Linux x86_64; sl-SI) Presto/2... | Martinez-Herman | 6011929061123406 | 02/20 | 900 | JCB 16 digit | | Scientist, product/process development | | el | 98.14
1 | 9374 Jasmine Spurs Suite 508\nSouth John, TN 8... | 28 rn | PM | Opera/8.93.(Windows 98; Win 9x 4.90; en-US) Pr... | Fletcher, Richards and Whitaker | 3337758169645356 | 11/18 | 561 | Mastercard | | Drilling engineer | | fr | 70.73
2 | Unit 0065 Box 5052\nDPO AP 27450 | 94 vE | PM | Mozilla/5.0 (compatible; MSIE 9.0; Windows NT ... | Simpson, Williams and Pham | 675957666125 | 08/19 | 699 | JCB 16 digit | | Customer service manager | | de | 0.95
3 | 7780 Julia Fords\nNew Stacy, WA 45798 | 36 vm | PM | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0 ... | Williams, Marshall and Buchanan | 6011578504430710 | 02/24 | 384 | Discover | | Drilling engineer | | es | 78.04
4 | 23012 Munoz Drive Suite 337\nNew Cynthia, TX 5... | 20 IE | AM | Opera/9.58.(X11; Linux x86_64; it-IT) Presto/2... | Brown, Watson and Andrews | 6011456623207998 | 10/25 | 678 | Diners Club / Carte Blanche | | Fine artist | | es | 77.82

How many rows and columns are there?

In [4]:
ecom.shape

(10000, 14)

In [3]:
ecom.describe()

        Credit Card  CC Security Code  Purchase Price
count  1.000000e+04      10000.000000    10000.000000
mean   2.341374e+15        907.217800       50.347302
std    2.256103e+15       1589.693035       29.015836
min    6.040186e+10          0.000000        0.000000
25%    3.056322e+13        280.000000       25.150000
50%    8.699942e+14        548.000000       50.505000
75%    4.492298e+15        816.000000       75.770000
max    6.012000e+15       9993.000000       99.990000

What is the average Purchase Price?

In [5]:
ecom['Purchase Price'].mean()


What were the highest and lowest purchase prices?

In [6]:
ecom['Purchase Price'].max()


In [7]:
ecom['Purchase Price'].min()
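
Both extremes can also be requested in a single call. The sketch below uses a small stand-in Series (the five prices shown in the head of the DataFrame) rather than the full ecom data, and the variable names are illustrative only.

```python
import pandas as pd

# Stand-in for ecom['Purchase Price']: the five prices from the head of the DataFrame.
prices = pd.Series([98.14, 70.73, 0.95, 78.04, 77.82], name='Purchase Price')

# agg with a list of function names returns a Series indexed by those names.
extremes = prices.agg(['max', 'min'])
print(extremes)
```

On the real data the same call would be ecom['Purchase Price'].agg(['max', 'min']).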


How many people have English 'en' as their Language of choice on the website?

In [12]:
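
One possible one-liner, sketched on a made-up stand-in for ecom (the sample frame and its values are assumptions, not the real data): compare the Language column to 'en' and sum the resulting booleans.

```python
import pandas as pd

# Hypothetical stand-in for ecom; only the 'Language' column matters here.
sample = pd.DataFrame({'Language': ['en', 'fr', 'en', 'de', 'es']})

# The comparison yields a boolean Series; summing it counts the True values.
en_count = (sample['Language'] == 'en').sum()
print(en_count)  # 2 for this sample
```

Against the exercise data this would read (ecom['Language'] == 'en').sum().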


How many people have the job title of "Lawyer"?

In [14]:
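
A possible approach, again on a made-up stand-in for ecom (the sample rows are assumptions): filter the rows whose Job equals 'Lawyer' and take the length of the result.

```python
import pandas as pd

# Hypothetical stand-in for ecom; only the 'Job' column matters here.
sample = pd.DataFrame({'Job': ['Lawyer', 'Fine artist', 'Lawyer', 'Drilling engineer']})

# Boolean indexing keeps only the matching rows; len counts them.
lawyer_count = len(sample[sample['Job'] == 'Lawyer'])
print(lawyer_count)  # 2 for this sample
```

On the real data this would be len(ecom[ecom['Job'] == 'Lawyer']).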


How many people made the purchase during the AM and how many during the PM?

(Hint: Check out value_counts() )

In [15]:
ecom['AM or PM'].value_counts()

PM    5068
AM    4932
Name: AM or PM, dtype: int64
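
If shares are more useful than raw counts, value_counts() accepts normalize=True. A minimal sketch on a stand-in Series (the five 'AM or PM' values from the head of the DataFrame, not the full data):

```python
import pandas as pd

# Stand-in for ecom['AM or PM']: the five values shown in the head of the DataFrame.
am_pm = pd.Series(['PM', 'PM', 'PM', 'PM', 'AM'], name='AM or PM')

# normalize=True returns each value's fraction of the total instead of its count.
shares = am_pm.value_counts(normalize=True)
print(shares)
```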

What are the 5 most common Job Titles?

In [16]:
ecom['Job'].value_counts().head(5)

Interior and spatial designer        31
Lawyer                               30
Social researcher                    28
Designer, jewellery                  27
Research officer, political party    27
Name: Job, dtype: int64

Someone made a purchase that came from Lot "90 WT". What was the Purchase Price for this transaction?

In [17]:
ecom[ecom['Lot']=='90 WT']['Purchase Price']

513    75.1
Name: Purchase Price, dtype: float64
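
The same lookup can be written with .loc, which selects the rows and the column in one step instead of chaining two bracket operations. The sketch below uses a small stand-in frame built from the first three rows of the head, so the lot queried is '28 rn' rather than '90 WT'.

```python
import pandas as pd

# Stand-in for ecom: Lot and Purchase Price from the first three rows of the head.
sample = pd.DataFrame({
    'Lot': ['46 in', '28 rn', '94 vE'],
    'Purchase Price': [98.14, 70.73, 0.95],
})

# .loc[row_mask, column_label] filters rows and picks the column together.
price = sample.loc[sample['Lot'] == '28 rn', 'Purchase Price']
print(price)
```

On the real data this would be ecom.loc[ecom['Lot'] == '90 WT', 'Purchase Price'].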

What is the email of the person with the following Credit Card Number: 4926535242672853?

In [18]:
ecom[ecom['Credit Card']==4926535242672853]['Email']

Name: Email, dtype: object

How many people have American Express as their Credit Card Provider and made a purchase above $95?

In [19]:
len(ecom[(ecom['CC Provider']=='American Express')&(ecom['Purchase Price']>95)])
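
The parentheses around each condition matter because & binds more tightly than == and >. Naming the two masks first can make that easier to read. This is a sketch on made-up rows (the sample frame and the mask names are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for ecom with only the two columns the question needs.
sample = pd.DataFrame({
    'CC Provider': ['American Express', 'American Express', 'Mastercard', 'American Express'],
    'Purchase Price': [96.50, 40.00, 99.00, 95.00],
})

# Build each condition as its own boolean Series, then combine them element-wise.
is_amex = sample['CC Provider'] == 'American Express'
above_95 = sample['Purchase Price'] > 95

# Summing the combined mask counts the rows where both conditions hold.
amex_above_95 = (is_amex & above_95).sum()
print(amex_above_95)  # 1 for this sample (95.00 is not above 95)
```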


Hard: How many people have a credit card that expires in 2025?

In [25]:
def expires_in_2025(exp_date):
    # 'CC Exp Date' is formatted as MM/YY, so the part after '/' is the two-digit year.
    return exp_date.split('/')[1] == '25'

In [26]:
sum(ecom['CC Exp Date'].apply(expires_in_2025))
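
A vectorized alternative avoids the helper function by using the .str accessor. The sketch below assumes the MM/YY format seen in the head of the DataFrame and uses those five dates as a stand-in for the full column.

```python
import pandas as pd

# Stand-in for ecom['CC Exp Date']: the five dates shown in the head of the DataFrame.
exp_dates = pd.Series(['02/20', '11/18', '08/19', '02/24', '10/25'], name='CC Exp Date')

# str.endswith returns a boolean Series; summing it counts cards expiring in 2025.
expiring_2025 = exp_dates.str.endswith('/25').sum()
print(expiring_2025)  # 1 for this sample
```

On the real data this would be ecom['CC Exp Date'].str.endswith('/25').sum().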


Hard: What are the top 5 most popular email providers/hosts?

In [27]:
ecom['Email'].apply(lambda x : x.split('@')[-1]).value_counts().head(5)

Out[27]:     1638       1616       1605         42      37
Name: Email, dtype: int64
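
The lambda can also be replaced with the .str accessor, which splits every address at once. This is a sketch on made-up addresses (the example.com and example.org hosts are placeholders, not values from the dataset):

```python
import pandas as pd

# Hypothetical stand-in for ecom['Email'].
emails = pd.Series(
    ['ann@example.com', 'bob@example.org', 'cy@example.com'],
    name='Email',
)

# Split each address on '@' and keep the last piece, which is the host.
hosts = emails.str.split('@').str[-1]

# Count the hosts and keep the five most frequent.
top_hosts = hosts.value_counts().head(5)
print(top_hosts)
```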

Great Job!