Importing the pandas package that we'll be working with
In [1]:
import pandas as pd
Reading the e-commerce purchases CSV file and loading it into the 'ecom' DataFrame
In [2]:
ecom = pd.read_csv('Ecommerce Purchases')
Checking the head of the DataFrame
In [3]:
ecom.head()
Out[3]:
Getting some basic info about the data using the info() function
In [4]:
ecom.info()
Finding the average price
In [7]:
ecom['Purchase Price'].mean()
Out[7]:
What is the highest and lowest purchase value?
In [8]:
ecom['Purchase Price'].max()
Out[8]:
In [9]:
ecom['Purchase Price'].min()
Out[9]:
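The mean, maximum, and minimum can also be requested in a single call. The sketch below uses a small made-up DataFrame (`toy` is a hypothetical stand-in for `ecom`, and its prices are invented) so that it runs on its own.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'ecom' DataFrame
toy = pd.DataFrame({'Purchase Price': [10.0, 55.5, 99.0, 0.5]})

# agg() with a list of names returns one Series indexed by those names
summary = toy['Purchase Price'].agg(['mean', 'max', 'min'])
print(summary)
```

On the real data, the same call would be `ecom['Purchase Price'].agg(['mean', 'max', 'min'])`.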
Number of people who chose English ('en') as their preferred website language
In [14]:
ecom[ecom['Language'] == 'en'].count()
Out[14]:
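Note that `count()` on a filtered DataFrame returns the number of non-null values per column, not a single number. If only the number of matching rows is wanted, summing the boolean mask is one alternative. A minimal sketch on invented data (`toy` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical toy data; one 'en' row has a missing Job on purpose
toy = pd.DataFrame({
    'Language': ['en', 'fr', 'en', 'de'],
    'Job': ['Lawyer', 'Doctor', None, 'Lawyer'],
})

mask = toy['Language'] == 'en'

# count() reports non-null values per column, so columns can disagree
per_column = toy[mask].count()

# True counts as 1, so the sum of the mask is the number of matching rows
n_rows = int(mask.sum())
print(per_column)
print(n_rows)
```

Here the `Job` count is lower than the `Language` count because of the missing value, while `mask.sum()` reports the row count regardless of nulls.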
Number of people with the job title 'Lawyer'
In [17]:
ecom[ecom['Job'] == 'Lawyer'].count()
Out[17]:
Number of people who made a purchase during AM and during PM
In [22]:
ecom['AM or PM'].value_counts()
Out[22]:
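If shares are more useful than raw counts, `value_counts` accepts `normalize=True`. A small sketch on invented data (`toy` is a hypothetical stand-in for `ecom`):

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'ecom' DataFrame
toy = pd.DataFrame({'AM or PM': ['AM', 'PM', 'PM', 'PM']})

# Raw counts per category, most frequent first
counts = toy['AM or PM'].value_counts()

# Same breakdown expressed as fractions of the total
shares = toy['AM or PM'].value_counts(normalize=True)
print(counts)
print(shares)
```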
The 5 most common Job Titles
In [24]:
ecom['Job'].value_counts().head(5)
Out[24]:
Getting data from a specific column: the purchase price for Lot '90 WT'
In [29]:
ecom[ecom['Lot'] == '90 WT']['Purchase Price']
Out[29]:
Email of the person with the following Credit Card Number: 4926535242672853
In [32]:
ecom[ecom['Credit Card'] == 4926535242672853]['Email']
Out[32]:
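The same row-then-column lookup can be written as a single `.loc` call, which avoids chained indexing. The sketch below assumes the card numbers are stored as integers, as the comparison above implies; the DataFrame, card numbers, and addresses are all invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; numbers and emails are made up
toy = pd.DataFrame({
    'Credit Card': [1111222233334444, 5555666677778888],
    'Email': ['first@example.com', 'second@example.com'],
})

# Select rows by mask and the column by label in one step
match = toy.loc[toy['Credit Card'] == 5555666677778888, 'Email']

# item() extracts the scalar and raises if there is not exactly one match
email = match.item()
print(email)
```

If the column were read as strings instead, the comparison value would need to be a string too.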
Number of people who have American Express as their credit card provider and made a purchase above $95
In [36]:
ecom[(ecom['CC Provider'] == 'American Express') & (ecom['Purchase Price'] > 95)].count()
Out[36]:
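Each condition is wrapped in parentheses because `&` binds more tightly than `==` and `>` in Python. A minimal sketch of combining two masks on invented data (`toy` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical toy data standing in for the real 'ecom' DataFrame
toy = pd.DataFrame({
    'CC Provider': ['American Express', 'VISA 16 digit',
                    'American Express', 'American Express'],
    'Purchase Price': [96.5, 99.0, 20.0, 95.0],
})

is_amex = toy['CC Provider'] == 'American Express'
above_95 = toy['Purchase Price'] > 95

# Element-wise AND of the two boolean Series
both = is_amex & above_95

# Number of rows satisfying both conditions
n_matches = int(both.sum())
print(n_matches)
```

Naming the masks first also makes it easier to reuse or inspect each condition separately.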
Number of people whose credit card expires in 2025
In [47]:
ecom[ecom['CC Exp Date'].str.contains('25')].count().iloc[0]
Out[47]:
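Matching '25' anywhere in the string relies on the date format. Assuming the dates are stored as `MM/YY` strings (an assumption, since the format is not shown here), anchoring the match to the year part is a slightly stricter alternative. The data below is invented for illustration.

```python
import pandas as pd

# Hypothetical toy data, assuming an 'MM/YY' expiry format
toy = pd.DataFrame({'CC Exp Date': ['02/25', '11/20', '12/25', '05/26']})

# Match only the year part after the slash
expires_2025 = toy['CC Exp Date'].str.endswith('/25')

# Summing the mask gives the number of matching rows
n_2025 = int(expires_2025.sum())
print(n_2025)
```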
Find the top 5 most popular email providers/hosts (e.g. gmail.com, yahoo.com, etc.)
In [57]:
def format_email(email):
    # Return the part after '@', e.g. 'user@gmail.com' -> 'gmail.com'
    return email.split('@')[1]
ecom['hosts'] = ecom['Email'].apply(format_email)
ecom['hosts'].value_counts().head(5)
Out[57]:
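As an alternative to `apply` with a helper function, the pandas string accessor can split every address at once. This is a sketch on invented addresses (`toy` is a hypothetical stand-in for `ecom`), and it assumes every value contains exactly one '@'.

```python
import pandas as pd

# Hypothetical toy data; addresses are made up
toy = pd.DataFrame({'Email': [
    'a@gmail.com', 'b@yahoo.com', 'c@gmail.com', 'd@hotmail.com',
]})

# Split each address on '@' and keep the second piece (the host)
hosts = toy['Email'].str.split('@').str[1]

# Most common hosts first, limited to the top 5
top_hosts = hosts.value_counts().head(5)
print(top_hosts)
```

This version also avoids adding a helper column to the DataFrame when the hosts are only needed for counting.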