Rename this file before you work on it!
Exercise on Joins :: Part 2
Use the Adventure Works dataset to create the following reports. The dataset is availablt for download in it's original format at https://msdn.microsoft.com/en-us/library/hh403424.aspx (follow instructions to download).
Task
- write the Python Pandas expression to produce a table as described in the problem statements.
- The SQL expression may give you a hint. It also allows you to see both systems side-by-side.
- If you don't know SQL just ignore the SQL code.
1a. Provide a count of the number of salespeople in the company
SELECT COUNT(*)
FROM dbo.Employees AS e
INNER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
;
1b. Instead of a count, list the salespeople along with their territories
Show columns: EmployeeID, FirstName, LastName, TerritoryName, CountryCode, Region
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region
FROM dbo.Employees AS e
INNER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
;
1c. Now include in the result set all other non-sales employees
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region
FROM dbo.Employees AS e
LEFT OUTER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
;
1d. Sort the results with the salespeople at the top
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region
FROM dbo.Employees AS e
LEFT OUTER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
ORDER BY COALESCE(st.Region, 'ZZZZZZZZZZZZZZ')
;
Only show information for territories outside the U.S.
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region
FROM dbo.Employees AS e
LEFT OUTER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID AND st.CountryCode != 'US'
ORDER BY COALESCE(st.Region, 'ZZZZZZZZZZZZZZ')
;
Show additional columns: ManagerFirstName, ManagerLastName, ManagerJobTitle
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region, mgr.FirstName, mgr.LastName, mgr.JobTitle
FROM dbo.Employees AS e
INNER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
LEFT OUTER JOIN dbo.Employees as mgr
on e.ManagerID = mgr.EmployeeID
;
2b. Only show salespeople who report to Amy Alberts
SELECT e.EmployeeID, e.FirstName, e.LastName, st.Name, st.CountryCode, st.Region, mgr.FirstName, mgr.LastName, mgr.JobTitle
FROM dbo.Employees AS e
INNER JOIN dbo.SalesTerritory AS st
ON e.TerritoryID = st.TerritoryID
LEFT OUTER JOIN dbo.Employees as mgr
ON e.ManagerID = mgr.EmployeeID
WHERE CONCAT(mgr.FirstName, mgr.LastName) = 'AmyAlberts'
;
3a. Give a list of all customers, and the average price of items ordered for each one
Show columns: CustomerID, FirstName, LastName, City, StateName, AvgPrice
SELECT c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
,AVG(io.Price) AS AvgPrice
FROM dbo.Customers AS c
LEFT OUTER JOIN dbo.ItemsOrdered AS io
ON c.CustomerID = io.CustomerID
GROUP BY c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
;
3b. For those customers who have never ordered anything, put 0 as the average price
SELECT c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
,ISNULL(AVG(io.Price),0) AS AvgPrice
FROM dbo.Customers AS c
LEFT OUTER JOIN dbo.ItemsOrdered AS io
ON c.CustomerID = io.CustomerID
GROUP BY c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
;
3c. Show the same results, but instead of average price per customer, tell me the average "Total Price" per customer
Make sure you change the alias of the AvgPrice column accordingly
SELECT c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
,ISNULL(AVG(io.Price*io.Quantity),0) AS AvgTotalPrice
FROM dbo.Customers AS c
LEFT OUTER JOIN dbo.ItemsOrdered AS io
ON c.CustomerID = io.CustomerID
GROUP BY c.CustomerID
,c.FirstName
,c.LastName
,c.City
,c.StateName
;
4a. Someone has requested a report of all employees and their managers (if they don't have a manager, still display them)
Show columns: EmployeeID, EmployeeFullName, EmployeeTitle, BossFullName, BossTitle
SELECT emp.EmployeeID
,emp.Firstname + ' ' + emp.LastName AS EmployeeFullNameAlt
,CONCAT(emp.FirstName,' ',emp.LastName) AS EmployeeFullName
,emp.JobTitle AS EmployeeJobTitle
,CONCAT(mgr.FirstName,' ',mgr.LastName) AS BossFullName
,mgr.JobTitle AS BossJobTitle
FROM dbo.Employees AS emp
LEFT OUTER JOIN dbo.Employees AS mgr
ON emp.ManagerID = mgr.EmployeeID
;
4b. Another request from someone else has just come in, asking for a list of managers and their employees
Columns: BossFullName, BossTitle, EmployeeFullName, EmployeeTitle
Sort by BossTitle then EmployeeTitle
SELECT CONCAT(mgr.FirstName,' ',mgr.LastName) AS BossFullName
,mgr.JobTitle AS BossJobTitle
,CONCAT(emp.FirstName,' ',emp.LastName) AS EmployeeFullName
,emp.JobTitle AS EmployeeJobTitle
FROM dbo.Employees AS mgr
INNER JOIN dbo.Employees AS emp
ON emp.ManagerID = mgr.EmployeeID
ORDER BY BossFullName
;
4c. Provide a list of managers in the company
SELECT DISTINCT CONCAT(mgr.FirstName,' ',mgr.LastName) AS BossFullName
,mgr.JobTitle AS BossJobTitle
FROM dbo.Employees AS mgr
INNER JOIN dbo.Employees AS emp
ON mgr.EmployeeID = emp.ManagerID
ORDER BY BossFullName
;
5a. Show me only the salespeople who live in the U.S. and their hire dates
Columns: FullName, HireDate
SELECT e.FirstName + ' ' + e.LastName AS FullName
,e.HireDate
FROM dbo.Employees e
INNER JOIN dbo.SalesTerritory s
ON e.TerritoryID = s.TerritoryID
WHERE e.CountryName = 'United States'
5b. Show me only the salespeople who sell to U.S. customers and their hire dates, plus how much revenue their territory has generated this year
Columns: FullName, HireDate, TerritoryName, Revenue
SELECT e.FirstName + ' ' + e.LastName AS FullName
,e.HireDate
,s.Name AS TerritoryName
,s.SalesYTD AS Revenue
FROM dbo.Employees e
INNER JOIN dbo.SalesTerritory s
ON e.TerritoryID = s.TerritoryID
WHERE s.CountryCode = 'US'
6. I need a list of all the salespeople and their territories, but also show the territories that don't have any salespeople assigned.
Columns: EmployeeID, FirstName, LastName, TerritoryName, TerritoryCountry, TerritoryRegion
SELECT e.EmployeeID
,e.FirstName
,e.LastName
,s.Name AS TerritoryName
,s.CountryCode AS TerritoryCountry
,s.Region AS TerritoryRegion
FROM dbo.SalesTerritory AS s
LEFT OUTER JOIN dbo.Employees AS e
ON s.TerritoryID = e.TerritoryID
;
7. Give me a list of all the customers, and if they ordered anything less than $30 show me those orders too
SELECT c.CustomerID
,c.FirstName
,c.LastName
,io.Item
,io.Price
FROM dbo.Customers AS c
LEFT OUTER JOIN dbo.ItemsOrdered AS io
ON c.CustomerID = io.CustomerID
AND io.Price < 30
;
9a. We need a list of each employee's boss's boss (i.e., 2 levels up)
Columns: EmployeeID, EmployeeFullName, EmployeeTitle, BossBossFullName, BossBossTitle
Sort by BossBossTitle and then EmployeeTitle
SELECT emp.EmployeeID
,emp.FirstName + ' ' + emp.LastName AS EmployeeFullName
,emp.JobTitle AS EmployeeTitle
,mgr2.FirstName + ' ' + mgr2.LastName AS BossBossFullName
,mgr2.JobTitle AS BossBossTitle
FROM dbo.Employees emp
INNER JOIN dbo.Employees mgr
ON emp.ManagerID = mgr.EmployeeID
INNER JOIN dbo.Employees mgr2
ON mgr.ManagerID = mgr2.EmployeeID
ORDER BY BossBossTitle, EmployeeTitle
;
9b. Same as above, except show all employees in the company
SELECT emp.EmployeeID
,emp.FirstName + ' ' + emp.LastName AS EmployeeFullName
,emp.JobTitle AS EmployeeTitle
,mgr2.FirstName + ' ' + mgr2.LastName AS BossBossFullName
,mgr2.JobTitle AS BossBossTitle
FROM dbo.Employees emp
LEFT OUTER JOIN dbo.Employees mgr
ON emp.ManagerID = mgr.EmployeeID
LEFT OUTER JOIN dbo.Employees mgr2
ON mgr.ManagerID = mgr2.EmployeeID
ORDER BY BossBossTitle, EmployeeTitle
;
9c. For those employees who don't have a boss's boss, display "n/a" for BossBossFullName and BossBossTitle
SELECT emp.EmployeeID
,emp.FirstName + ' ' + emp.LastName AS EmployeeFullName
,emp.JobTitle AS EmployeeTitle
,COALESCE(mgr2.FirstName + ' ' + mgr2.LastName, 'n/a') AS BossBossFullName
,COALESCE(mgr2.JobTitle, 'n/a') AS BossBossTitle
FROM dbo.Employees emp
LEFT OUTER JOIN dbo.Employees mgr
ON emp.ManagerID = mgr.EmployeeID
LEFT OUTER JOIN dbo.Employees mgr2
ON mgr.ManagerID = mgr2.EmployeeID
ORDER BY BossBossTitle, EmployeeTitle
;
9d. Change the sort so that the ones with "n/a" are at the top of the list
SELECT emp.EmployeeID
,emp.FirstName + ' ' + emp.LastName AS EmployeeFullName
,emp.JobTitle AS EmployeeTitle
,COALESCE(mgr2.FirstName + ' ' + mgr2.LastName, 'n/a') AS BossBossFullName
,COALESCE(mgr2.JobTitle, 'n/a') AS BossBossTitle
FROM dbo.Employees emp
LEFT OUTER JOIN dbo.Employees mgr
ON emp.ManagerID = mgr.EmployeeID
LEFT OUTER JOIN dbo.Employees mgr2
ON mgr.ManagerID = mgr2.EmployeeID
ORDER BY mgr2.JobTitle, EmployeeTitle
;
10a. We need a list of each manager's indirect reports (i.e., 2 levels down)
Columns: ManagerFullName, ManagerTitle, IndirectReportFullName, IndirectReportTitle
Sort by ManagerTitle and IndirectReportTitle
SELECT mgr.FirstName + ' ' + mgr.LastName AS ManagerFullName
,mgr.JobTitle AS ManagerTitle
,COUNT(*) AS NumIndirectReports
FROM dbo.Employees mgr
INNER JOIN dbo.Employees emp
ON mgr.EmployeeID = emp.ManagerID
INNER JOIN dbo.Employees emp2
ON emp.EmployeeID = emp2.ManagerID
GROUP BY mgr.EmployeeID
,mgr.FirstName
,mgr.LastName
,mgr.JobTitle
ORDER BY ManagerTitle
;
10b. Please tell us how many indirect reports each manager has
Columns: ManagerFullName, ManagerTitle, NumIndirectReports
SELECT mgr.FirstName + ' ' + mgr.LastName AS ManagerFullName
,mgr.JobTitle AS ManagerTitle
,COUNT(*) AS NumIndirectReports
FROM dbo.Employees mgr
INNER JOIN dbo.Employees emp
ON mgr.EmployeeID = emp.ManagerID
INNER JOIN dbo.Employees emp2
ON emp.EmployeeID = emp2.ManagerID
GROUP BY mgr.EmployeeID
,mgr.FirstName
,mgr.LastName
,mgr.JobTitle
ORDER BY ManagerTitle
;
11. Start with the query from #4c and change it so it returns just one number in the result: the count of managers in the company
SELECT COUNT(DISTINCT mgr.EmployeeID) AS NumManagers
FROM dbo.Employees emp
INNER JOIN dbo.Employees mgr
ON emp.ManagerID = mgr.EmployeeID
BONUS 1a. The HR people have come to you with a problem, they realized that some of the employee names were entered into the system with 2 spaces instead of 1
Find anyone with a FirstName, MiddleName, or LastName that has 2 spaces
Columns: EmployeeID, FirstName, MiddleName, LastName
SELECT e.EmployeeID
,e.FirstName
,e.MiddleName
,e.LastName
FROM dbo.Employees e
WHERE e.FirstName LIKE '% %'
OR e.MiddleName LIKE '% %'
OR e.LastName LIKE '% %'
BONUS 1b. Display the same list but correct the names to display what they should look like
SELECT e.EmailAddress
,SUBSTRING(e.EmailAddress, CHARINDEX('@', e.EmailAddress)+1, 1000)
FROM dbo.Employees as e
;