In [ ]:
import pandas as pd
import numpy as np
Now, import your solutions from hw5_answers.py
. The following code looks a bit redundant. However, we do this to allow reloading the hw5_answers.py
in case you made some changes. Normally, Python assumes that modules don't change and therefore does not try to import them again.
In [ ]:
import hw5_answers
reload(hw5_answers)
from hw5_answers import *
The Employees, Territory, Customers, and Orders tables are the same as those we used in class.
In [ ]:
Employees = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
Territory = pd.read_excel('/home/data/AdventureWorks/SalesTerritory.xls')
Customers = pd.read_excel('/home/data/AdventureWorks/Customers.xls')
Orders = pd.read_excel('/home/data/AdventureWorks/ItemsOrdered.xls')
Write a function called get_manager
that takes as its one argument the Pandas DataFrame "Employees" and returns a DataFrame containing list of all employees (EmployeeID, first name, middle name, last name), and their manager's first and last name. The columns in the output DataFrame should be: EmployeeID, FirstName, MiddleName, LastName, ManagerFirstName, ManagerLastName.
In [ ]:
df1 = get_manager(Employees)
print "Shape of resulting table: ", df1.shape
print "Columns: ", ', '.join(df1.columns)
df1.head()
Shape of resulting table: (291, 6)
Columns: EmployeeID, FirstName, MiddleName, LastName, ManagerFirstName, ManagerLastName
EmployeeID | FirstName | MiddleName | LastName | ManagerFirstName | ManagerLastName | |
---|---|---|---|---|---|---|
0 | 259 | Ben | T | Miller | Sheela | Word |
1 | 278 | Garrett | R | Vargas | Stephen | Jiang |
2 | 204 | Gabe | B | Mares | Peter | Krebs |
3 | 78 | Reuben | H | D'sa | Peter | Krebs |
4 | 255 | Gordon | L | Hee | Sheela | Word |
Write a functon called get_spend_by_order
that takes as its two arguments the Pandas DataFrames "Orders" and "Customers", and returns a DataFrame with the following columns: "FirstName", "LastName", "Item", "TotalSpent", listing all cutomer names, their purchased items, and the total amount spend on that item (remember that the "Price" listed in "Orders" is the price per item).
In [ ]:
df2 = get_spend_by_order(Orders, Customers)
print "Shape of resulting table: ", df2.shape
print "Columns: ", ', '.join(df2.columns)
df2.head()
Shape of resulting table: (32, 4)
Columns: FirstName, LastName, Item, TotalSpent
FirstName | LastName | Item | TotalSpent | |
---|---|---|---|---|
0 | Anthony | Sanchez | Umbrella | 4.5 |
1 | Conrad | Giles | Ski Poles | 51.0 |
2 | Conrad | Giles | Tent | 88.0 |
3 | Donald | Davids | Lawnchair | 128.0 |
4 | Elroy | Keller | Inflatable Mattress | 38.0 |
Write a function called get_order_location
that takes three arguments: "Orders", "Customers", and "Territory", and returns a DataFrame containing the following columns: "CustomerID", "Name", and "TotalItems", that gives, for each order, the CustomerID, the name of the territory where the order was placed, and the total number of items ordered (yes, 2 ski poles counts as 2 items).
In [ ]:
df3 = get_order_location(Orders, Customers, Territory)
print "Shape of resulting table: ", df3.shape
print "Columns: ", ', '.join(df3.columns)
df3.head()
Shape of resulting table: (11, 3)
Columns: CustomerID, Name, TotalItems
CustomerID | Name | TotalItems | |
---|---|---|---|
0 | 10315 | Central | 1 |
1 | 10438 | Central | 3 |
2 | 10439 | Central | 2 |
3 | 10101 | Northwest | 6 |
4 | 10299 | Northwest | 2 |
Write a function called employee_info
that takes one argument: "Employees", and returns a DataFrame containing the following columns: JobTitle, NumberOfEmployees, and MeanVacationHours, containing all job titles, the number of employees with that job title, and the mean number of vacation days for employees with that job title.
In [ ]:
df4 = employee_info(Employees)
print "Shape of resulting table: ", df4.shape
print "Columns: ", ', '.join(df4.columns)
df4.head()
Shape of resulting table: (68, 3)
Columns: JobTitle, NumberOfEmployees, MeanVacationHours
JobTitle | NumberOfEmployees | MeanVacationHours | |
---|---|---|---|
0 | Accountant | 2 | 58.5 |
1 | Accounts Manager | 1 | 57.0 |
2 | Accounts Payable Specialist | 2 | 63.5 |
3 | Accounts Receivable Specialist | 3 | 61.0 |
4 | Application Specialist | 4 | 72.5 |
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]: