Check Homework HW05

Use this notebook to check your solutions. This notebook will not be graded.


In [ ]:
import pandas as pd
import numpy as np

Now, import your solutions from hw5_answers.py. The following code looks a bit redundant. However, we do this to allow reloading the hw5_answers.py in case you made some changes. Normally, Python assumes that modules don't change and therefore does not try to import them again.


In [ ]:
import hw5_answers
reload(hw5_answers)
from hw5_answers import *

The Employees, Territory, Customers, and Orders tables are the same as those we used in class.


In [ ]:
Employees = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
Territory = pd.read_excel('/home/data/AdventureWorks/SalesTerritory.xls')
Customers = pd.read_excel('/home/data/AdventureWorks/Customers.xls')
Orders = pd.read_excel('/home/data/AdventureWorks/ItemsOrdered.xls')

Problem 1

Write a function called get_manager that takes as its one argument the Pandas DataFrame "Employees" and returns a DataFrame containing list of all employees (EmployeeID, first name, middle name, last name), and their manager's first and last name. The columns in the output DataFrame should be: EmployeeID, FirstName, MiddleName, LastName, ManagerFirstName, ManagerLastName.


In [ ]:
df1 = get_manager(Employees)
print "Shape of resulting table: ", df1.shape
print "Columns: ", ', '.join(df1.columns)
df1.head()
Shape of resulting table:  (291, 6)
Columns:  EmployeeID, FirstName, MiddleName, LastName, ManagerFirstName, ManagerLastName

EmployeeID FirstName MiddleName LastName ManagerFirstName ManagerLastName
0 259 Ben T Miller Sheela Word
1 278 Garrett R Vargas Stephen Jiang
2 204 Gabe B Mares Peter Krebs
3 78 Reuben H D'sa Peter Krebs
4 255 Gordon L Hee Sheela Word

Problem 2

Write a functon called get_spend_by_order that takes as its two arguments the Pandas DataFrames "Orders" and "Customers", and returns a DataFrame with the following columns: "FirstName", "LastName", "Item", "TotalSpent", listing all cutomer names, their purchased items, and the total amount spend on that item (remember that the "Price" listed in "Orders" is the price per item).


In [ ]:
df2 = get_spend_by_order(Orders, Customers)
print "Shape of resulting table: ", df2.shape
print "Columns: ", ', '.join(df2.columns)
df2.head()
Shape of resulting table:  (32, 4)
Columns:  FirstName, LastName, Item, TotalSpent

FirstName LastName Item TotalSpent
0 Anthony Sanchez Umbrella 4.5
1 Conrad Giles Ski Poles 51.0
2 Conrad Giles Tent 88.0
3 Donald Davids Lawnchair 128.0
4 Elroy Keller Inflatable Mattress 38.0

Problem 3

Write a function called get_order_location that takes three arguments: "Orders", "Customers", and "Territory", and returns a DataFrame containing the following columns: "CustomerID", "Name", and "TotalItems", that gives, for each order, the CustomerID, the name of the territory where the order was placed, and the total number of items ordered (yes, 2 ski poles counts as 2 items).


In [ ]:
df3 = get_order_location(Orders, Customers, Territory)
print "Shape of resulting table: ", df3.shape
print "Columns: ", ', '.join(df3.columns)
df3.head()
Shape of resulting table:  (11, 3)
Columns:  CustomerID, Name, TotalItems


CustomerID Name TotalItems
0 10315 Central 1
1 10438 Central 3
2 10439 Central 2
3 10101 Northwest 6
4 10299 Northwest 2

Problem 4

Write a function called employee_info that takes one argument: "Employees", and returns a DataFrame containing the following columns: JobTitle, NumberOfEmployees, and MeanVacationHours, containing all job titles, the number of employees with that job title, and the mean number of vacation days for employees with that job title.


In [ ]:
df4 = employee_info(Employees)
print "Shape of resulting table: ", df4.shape
print "Columns: ", ', '.join(df4.columns)
df4.head()
Shape of resulting table:  (68, 3)
Columns:  JobTitle, NumberOfEmployees, MeanVacationHours

JobTitle NumberOfEmployees MeanVacationHours
0 Accountant 2 58.5
1 Accounts Manager 1 57.0
2 Accounts Payable Specialist 2 63.5
3 Accounts Receivable Specialist 3 61.0
4 Application Specialist 4 72.5

In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]:


In [ ]: