Pandas Exercise

Use the Adventure Works dataset to create the following reports. The data files can be directlty accessed by all users on ARC (our Hadoop Cluster).

No need to download or copy into your home directory!

Reference to the original data set https://msdn.microsoft.com/en-us/library/hh403424.aspx

Task

write the Python Pandas expression to produce a table as described in the problem statements.
The SQL expression may give you a hint. It also allows you to see both systems side-by-side.
If you don't know SQL just ignore the SQL code.



In [1]:

    
import pandas as pd
import numpy as np



In [2]:

    
%%time
Employees = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
print("shape:", Employees.shape)









    



shape: (291, 26)
CPU times: user 74.6 ms, sys: 2.6 ms, total: 77.2 ms
Wall time: 75.7 ms



In [6]:

    
%%time
Territory = pd.read_excel('/home/data/AdventureWorks/SalesTerritory.xls')
print("shape:", Territory.shape)









    



(12, 6)
CPU times: user 10.9 ms, sys: 340 µs, total: 11.3 ms
Wall time: 10.2 ms



In [11]:

    
%%time
Customers = pd.read_excel('/home/data/AdventureWorks/Customers.xls')
print("shape:", Customers.shape)









    



shape: (17, 6)
CPU times: user 8.18 ms, sys: 1.54 ms, total: 9.72 ms
Wall time: 8.64 ms



In [10]:

    
%%time
Orders = pd.read_excel('/home/data/AdventureWorks/ItemsOrdered.xls')
print("shape:", Orders.shape)









    



shape: (32, 5)
CPU times: user 11.2 ms, sys: 499 µs, total: 11.7 ms
Wall time: 10.9 ms

Filtering (with)

1. Provide a list of employees that are married.

SELECT e.EmployeeID, e.FirstName, e.LastName FROM dbo.Employees AS e WHERE e.MaritalStatus = 'M';



In [23]:

    
Employees.MaritalStatus == 'M'









    Out[23]:





0       True
1       True
2       True
3       True
4       True
5      False
6      False
7      False
8      False
9      False
10      True
11     False
12      True
13     False
14      True
15      True
16     False
17     False
18      True
19      True
20     False
21     False
22     False
23     False
24      True
25      True
26      True
27      True
28      True
29      True
       ...  
261     True
262    False
263    False
264     True
265    False
266     True
267     True
268    False
269     True
270     True
271    False
272     True
273    False
274    False
275     True
276    False
277    False
278     True
279     True
280     True
281     True
282    False
283     True
284     True
285    False
286    False
287    False
288     True
289    False
290     True
Name: MaritalStatus, dtype: bool



In [29]:

    
Employees.loc[Employees.MaritalStatus == 'M', ['EmployeeID', 'FirstName', 'LastName', 'MaritalStatus']]









    Out[29]:






  
    
      
      EmployeeID
      FirstName
      LastName
      MaritalStatus
    
  
  
    
      0
      259
      Ben
      Miller
      M
    
    
      1
      278
      Garrett
      Vargas
      M
    
    
      2
      204
      Gabe
      Mares
      M
    
    
      3
      78
      Reuben
      D'sa
      M
    
    
      4
      255
      Gordon
      Hee
      M
    
    
      10
      10
      Michael
      Raheem
      M
    
    
      12
      175
      Reed
      Koch
      M
    
    
      14
      144
      Paul
      Singh
      M
    
    
      15
      192
      Brenda
      Diaz
      M
    
    
      18
      203
      Ken
      Myer
      M
    
    
      19
      245
      Barbara
      Moreland
      M
    
    
      24
      163
      Alex
      Nayberg
      M
    
    
      25
      118
      Don
      Hall
      M
    
    
      26
      176
      David
      Lawrence
      M
    
    
      27
      147
      Sandra
      Reátegui Alayo
      M
    
    
      28
      138
      Samantha
      Smith
      M
    
    
      29
      95
      Jim
      Scardelis
      M
    
    
      30
      231
      Jo
      Berry
      M
    
    
      31
      86
      Ryan
      Cornelsen
      M
    
    
      33
      214
      Andreas
      Berglund
      M
    
    
      34
      72
      Steven
      Selikoff
      M
    
    
      36
      107
      Christopher
      Hill
      M
    
    
      37
      269
      Dan
      Bacon
      M
    
    
      38
      187
      Yvonne
      McKay
      M
    
    
      40
      12
      Thierry
      D'Hers
      M
    
    
      42
      30
      Britta
      Simon
      M
    
    
      43
      198
      Lorraine
      Nay
      M
    
    
      45
      31
      Margie
      Shoop
      M
    
    
      48
      3
      Roberto
      Tamburello
      M
    
    
      49
      199
      Paula
      Nartker
      M
    
    
      ...
      ...
      ...
      ...
      ...
    
    
      227
      136
      Sylvester
      Valdez
      M
    
    
      231
      35
      Brandon
      Heidepriem
      M
    
    
      232
      228
      Christian
      Kleinerman
      M
    
    
      233
      223
      Sairaj
      Uddin
      M
    
    
      234
      100
      Lolan
      Song
      M
    
    
      239
      200
      Frank
      Lee
      M
    
    
      240
      279
      Tsvi
      Reiter
      M
    
    
      243
      73
      Carole
      Poland
      M
    
    
      249
      21
      Terry
      Eminhizer
      M
    
    
      252
      254
      Fukiko
      Ogisu
      M
    
    
      255
      53
      Diane
      Glimp
      M
    
    
      256
      282
      José
      Saraiva
      M
    
    
      258
      209
      Kathie
      Flood
      M
    
    
      259
      50
      Sidney
      Higa
      M
    
    
      261
      70
      David
      Ortiz
      M
    
    
      264
      140
      Prasanna
      Samarawickrama
      M
    
    
      266
      5
      Gail
      Erickson
      M
    
    
      267
      252
      Arvind
      Rao
      M
    
    
      269
      233
      Magnus
      Hedlund
      M
    
    
      270
      83
      Patrick
      Cook
      M
    
    
      272
      241
      David
      Liu
      M
    
    
      275
      239
      Mindy
      Martin
      M
    
    
      278
      38
      Kim
      Abercrombie
      M
    
    
      279
      154
      Raymond
      Sam
      M
    
    
      280
      114
      Mindaugas
      Krapauskas
      M
    
    
      281
      123
      Vamsi
      Kuppa
      M
    
    
      283
      101
      Houman
      Pournasseh
      M
    
    
      284
      221
      Chris
      Norred
      M
    
    
      288
      289
      Jae
      Pak
      M
    
    
      290
      999
      Chadwick
      Smith
      M
    
  

147 rows × 4 columns

2a. Show me a list of employees that have a lastname that begins with "R".

SELECT e.EmployeeID, e.FirstName, e.LastName FROM dbo.Employees AS e WHERE e.LastName LIKE 'R%';



In [4]:

    
filter = Employees.LastName.map(lambda x: x[0] == 'R')



In [5]:

    
Employees[filter]









    Out[5]:






  
    
      
      EmployeeID
      ManagerID
      TerritoryID
      Title
      FirstName
      MiddleName
      LastName
      Suffix
      JobTitle
      NationalIDNumber
      ...
      SickLeaveHours
      PhoneNumber
      PhoneNumberType
      EmailAddress
      AddressLine1
      AddressLine2
      City
      StateProvinceName
      PostalCode
      CountryName
    
  
  
    
      9
      124
      250.0
      NaN
      NaN
      Kim
      T
      Ralls
      NaN
      Stocker
      420776180
      ...
      69
      309-555-0129
      Work
      kim0@adventure-works.com
      1226 Shoe St.
      NaN
      Bothell
      Washington
      98011
      United States
    
    
      10
      10
      25.0
      NaN
      NaN
      Michael
      NaN
      Raheem
      NaN
      Research and Development Manager
      879342154
      ...
      64
      330-555-2568
      Work
      michael6@adventure-works.com
      1234 Seaside Way
      NaN
      San Francisco
      California
      94109
      United States
    
    
      16
      166
      26.0
      NaN
      NaN
      Jack
      S
      Richins
      NaN
      Production Supervisor - WC30
      60114406
      ...
      54
      552-555-0111
      Work
      jack0@adventure-works.com
      1356 Grove Way
      NaN
      Monroe
      Washington
      98272
      United States
    
    
      27
      147
      26.0
      NaN
      NaN
      Sandra
      NaN
      Reátegui Alayo
      NaN
      Production Technician - WC30
      604664374
      ...
      38
      896-555-0168
      Work
      sandra0@adventure-works.com
      1619 Stillman Court
      NaN
      Seattle
      Washington
      98104
      United States
    
    
      44
      133
      26.0
      NaN
      NaN
      Michael
      L
      Rothkugel
      NaN
      Production Technician - WC40
      830150469
      ...
      55
      454-555-0119
      Work
      michael1@adventure-works.com
      207 Berry Court
      NaN
      Edmonds
      Washington
      98020
      United States
    
    
      95
      44
      26.0
      NaN
      NaN
      Simon
      D
      Rapier
      NaN
      Production Technician - WC60
      276751903
      ...
      39
      963-555-0134
      Work
      simon0@adventure-works.com
      3421 Bouncing Road
      NaN
      Duvall
      Washington
      98019
      United States
    
    
      99
      65
      26.0
      NaN
      NaN
      Randy
      T
      Reeves
      NaN
      Production Technician - WC60
      8066363
      ...
      34
      961-555-0122
      Cell
      randy0@adventure-works.com
      3632 Bank Way
      NaN
      Edmonds
      Washington
      98020
      United States
    
    
      128
      145
      26.0
      NaN
      NaN
      Cynthia
      S
      Randall
      NaN
      Production Supervisor - WC30
      386315192
      ...
      54
      352-555-0138
      Cell
      cynthia0@adventure-works.com
      463 H Stagecoach Rd.
      NaN
      Kenmore
      Washington
      98028
      United States
    
    
      131
      149
      26.0
      NaN
      NaN
      Andy
      M
      Ruth
      NaN
      Production Technician - WC30
      764853868
      ...
      39
      118-555-0110
      Cell
      andy0@adventure-works.com
      4777 Rockne Drive
      NaN
      Bellevue
      Washington
      98004
      United States
    
    
      166
      74
      26.0
      NaN
      NaN
      Bjorn
      M
      Rettig
      NaN
      Production Technician - WC30
      420023788
      ...
      41
      199-555-0117
      Work
      bjorn0@adventure-works.com
      5802 Ampersand Drive
      NaN
      Seattle
      Washington
      98104
      United States
    
    
      180
      71
      26.0
      NaN
      NaN
      Michael
      Sean
      Ray
      NaN
      Production Supervisor - WC30
      578935259
      ...
      55
      156-555-0199
      Cell
      michael3@adventure-works.com
      6498 Mining Rd.
      NaN
      Seattle
      Washington
      98104
      United States
    
    
      188
      190
      26.0
      NaN
      NaN
      Robert
      J
      Rounthwaite
      NaN
      Production Technician - WC45
      456839592
      ...
      61
      589-555-0147
      Cell
      robert0@adventure-works.com
      6843 San Simeon Dr.
      NaN
      Seattle
      Washington
      98104
      United States
    
    
      219
      157
      26.0
      NaN
      NaN
      Linda
      A
      Randall
      NaN
      Production Technician - WC20
      812797414
      ...
      26
      696-555-0157
      Work
      linda1@adventure-works.com
      77 Birchwood
      NaN
      Seattle
      Washington
      98104
      United States
    
    
      240
      279
      274.0
      5.0
      NaN
      Tsvi
      Michael
      Reiter
      NaN
      Sales Representative
      716374314
      ...
      34
      664-555-0112
      Work
      tsvi0@adventure-works.com
      8291 Crossbow Way
      NaN
      Memphis
      Tennessee
      38103
      United States
    
    
      267
      252
      250.0
      NaN
      NaN
      Arvind
      B
      Rao
      NaN
      Buyer
      792847334
      ...
      50
      848-555-0163
      Cell
      arvind0@adventure-works.com
      9495 Limewood Place
      NaN
      Renton
      Washington
      98055
      United States
    
  

15 rows × 26 columns



In [12]:

    
df = Employees[Employees.LastName.map(lambda x: x[0] == 'R')]
df[['EmployeeID', 'FirstName', 'LastName']]









    Out[12]:






  
    
      
      EmployeeID
      FirstName
      LastName
    
  
  
    
      9
      124
      Kim
      Ralls
    
    
      10
      10
      Michael
      Raheem
    
    
      16
      166
      Jack
      Richins
    
    
      27
      147
      Sandra
      Reátegui Alayo
    
    
      44
      133
      Michael
      Rothkugel
    
    
      95
      44
      Simon
      Rapier
    
    
      99
      65
      Randy
      Reeves
    
    
      128
      145
      Cynthia
      Randall
    
    
      131
      149
      Andy
      Ruth
    
    
      166
      74
      Bjorn
      Rettig
    
    
      180
      71
      Michael
      Ray
    
    
      188
      190
      Robert
      Rounthwaite
    
    
      219
      157
      Linda
      Randall
    
    
      240
      279
      Tsvi
      Reiter
    
    
      267
      252
      Arvind
      Rao

2b. Show me a list of employees that have a lastname that ends with "r"

SELECT e.EmployeeID, e.FirstName, e.LastName FROM dbo.Employees AS e WHERE e.LastName LIKE '%r';



In [35]:

    
Employees.loc[Employees.LastName.map(lambda x: x[-1] == 'r'), ['EmployeeID', 'FirstName', 'LastName']]









    Out[35]:






  
    
      
      EmployeeID
      FirstName
      LastName
    
  
  
    
      0
      259
      Ben
      Miller
    
    
      8
      161
      Kirk
      Koenigsbauer
    
    
      18
      203
      Ken
      Myer
    
    
      49
      199
      Paula
      Nartker
    
    
      53
      41
      Bryan
      Baker
    
    
      56
      104
      Mary
      Baker
    
    
      64
      225
      Alan
      Brewer
    
    
      75
      156
      Lane
      Sacksteder
    
    
      95
      44
      Simon
      Rapier
    
    
      97
      96
      Elizabeth
      Keyser
    
    
      98
      229
      Lori
      Penor
    
    
      127
      39
      Ed
      Dudenhoefer
    
    
      129
      42
      James
      Kramer
    
    
      165
      117
      Chad
      Niswonger
    
    
      168
      57
      Frank
      Miller
    
    
      172
      105
      Kevin
      Homer
    
    
      173
      253
      Linda
      Meisner
    
    
      196
      7
      Dylan
      Miller
    
    
      200
      183
      Barbara
      Decker
    
    
      203
      142
      Olinda
      Turner
    
    
      216
      94
      Russell
      Hunter
    
    
      222
      273
      Brian
      Welcker
    
    
      229
      216
      Sean
      Alexander
    
    
      240
      279
      Tsvi
      Reiter
    
    
      249
      21
      Terry
      Eminhizer
    
    
      254
      262
      David
      Barber
    
    
      282
      29
      Mark
      McArthur

2c. Provide a list of employees that have a hyphenated lastname.

SELECT e.EmployeeID, e.FirstName, e.LastName FROM dbo.Employees AS e WHERE e.LastName LIKE '%-%';



In [16]:

    
help(pd.Series.between)









    



Help on function between in module pandas.core.series:

between(self, left, right, inclusive=True)
    Return boolean Series equivalent to left <= series <= right. NA values
    will be treated as False
    
    Parameters
    ----------
    left : scalar
        Left boundary
    right : scalar
        Right boundary
    
    Returns
    -------
    is_between : Series



In [14]:

    
type(Employees.LastName)
dir(type(Employees.LastName))









    Out[14]:





['T',
 '_AXIS_ALIASES',
 '_AXIS_IALIASES',
 '_AXIS_LEN',
 '_AXIS_NAMES',
 '_AXIS_NUMBERS',
 '_AXIS_ORDERS',
 '_AXIS_REVERSED',
 '_AXIS_SLICEMAP',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_prepare__',
 '__array_priority__',
 '__array_wrap__',
 '__bool__',
 '__bytes__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dict__',
 '__dir__',
 '__div__',
 '__doc__',
 '__eq__',
 '__finalize__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__int__',
 '__invert__',
 '__ipow__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__le__',
 '__len__',
 '__long__',
 '__lt__',
 '__mod__',
 '__module__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__nonzero__',
 '__or__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdiv__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__round__',
 '__rpow__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__setitem__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__unicode__',
 '__weakref__',
 '__xor__',
 '_accessors',
 '_add_numeric_operations',
 '_add_series_only_operations',
 '_add_series_or_dataframe_operations',
 '_agg_by_level',
 '_align_frame',
 '_align_series',
 '_allow_index_ops',
 '_at',
 '_binop',
 '_box_item_values',
 '_can_hold_na',
 '_check_inplace_setting',
 '_check_is_chained_assignment_possible',
 '_check_percentile',
 '_check_setitem_copy',
 '_clear_item_cache',
 '_consolidate_inplace',
 '_construct_axes_dict',
 '_construct_axes_dict_for_slice',
 '_construct_axes_dict_from',
 '_construct_axes_from_arguments',
 '_constructor',
 '_constructor_expanddim',
 '_constructor_sliced',
 '_convert',
 '_create_indexer',
 '_dir_additions',
 '_dir_deletions',
 '_expand_axes',
 '_from_axes',
 '_get_axis',
 '_get_axis_name',
 '_get_axis_number',
 '_get_axis_resolvers',
 '_get_block_manager_axis',
 '_get_bool_data',
 '_get_cacher',
 '_get_index_resolvers',
 '_get_item_cache',
 '_get_numeric_data',
 '_get_repr',
 '_get_values',
 '_get_values_tuple',
 '_get_with',
 '_iat',
 '_iget_item_cache',
 '_iloc',
 '_index',
 '_indexed_same',
 '_info_axis',
 '_info_axis_name',
 '_info_axis_number',
 '_init_mgr',
 '_internal_names',
 '_internal_names_set',
 '_is_cached',
 '_is_datelike_mixed_type',
 '_is_mixed_type',
 '_is_numeric_mixed_type',
 '_is_view',
 '_ix',
 '_ixs',
 '_loc',
 '_make_cat_accessor',
 '_make_dt_accessor',
 '_make_str_accessor',
 '_maybe_cache_changed',
 '_maybe_update_cacher',
 '_metadata',
 '_needs_reindex_multi',
 '_protect_consolidate',
 '_reduce',
 '_reindex_axes',
 '_reindex_axis',
 '_reindex_indexer',
 '_reindex_multi',
 '_reindex_with_indexers',
 '_reset_cache',
 '_reset_cacher',
 '_set_as_cached',
 '_set_axis',
 '_set_axis_name',
 '_set_is_copy',
 '_set_item',
 '_set_labels',
 '_set_name',
 '_set_subtyp',
 '_set_values',
 '_set_with',
 '_set_with_engine',
 '_setup_axes',
 '_slice',
 '_stat_axis',
 '_stat_axis_name',
 '_stat_axis_number',
 '_typ',
 '_unpickle_series_compat',
 '_update_inplace',
 '_validate_dtype',
 '_values',
 '_xs',
 'abs',
 'add',
 'add_prefix',
 'add_suffix',
 'align',
 'all',
 'any',
 'append',
 'apply',
 'argmax',
 'argmin',
 'argsort',
 'as_blocks',
 'as_matrix',
 'asfreq',
 'asobject',
 'asof',
 'astype',
 'at',
 'at_time',
 'autocorr',
 'axes',
 'base',
 'between',
 'between_time',
 'bfill',
 'blocks',
 'bool',
 'cat',
 'clip',
 'clip_lower',
 'clip_upper',
 'combine',
 'combine_first',
 'compound',
 'compress',
 'consolidate',
 'convert_objects',
 'copy',
 'corr',
 'count',
 'cov',
 'cummax',
 'cummin',
 'cumprod',
 'cumsum',
 'data',
 'describe',
 'diff',
 'div',
 'divide',
 'dot',
 'drop',
 'drop_duplicates',
 'dropna',
 'dt',
 'dtype',
 'dtypes',
 'duplicated',
 'empty',
 'eq',
 'equals',
 'ewm',
 'expanding',
 'factorize',
 'ffill',
 'fillna',
 'filter',
 'first',
 'first_valid_index',
 'flags',
 'floordiv',
 'from_array',
 'from_csv',
 'ftype',
 'ftypes',
 'ge',
 'get',
 'get_dtype_counts',
 'get_ftype_counts',
 'get_value',
 'get_values',
 'groupby',
 'gt',
 'hasnans',
 'head',
 'hist',
 'iat',
 'idxmax',
 'idxmin',
 'iget',
 'iget_value',
 'iloc',
 'imag',
 'index',
 'interpolate',
 'irow',
 'is_copy',
 'is_time_series',
 'is_unique',
 'isin',
 'isnull',
 'item',
 'items',
 'itemsize',
 'iteritems',
 'iterkv',
 'ix',
 'keys',
 'kurt',
 'kurtosis',
 'last',
 'last_valid_index',
 'le',
 'loc',
 'lt',
 'mad',
 'map',
 'mask',
 'max',
 'mean',
 'median',
 'memory_usage',
 'min',
 'mod',
 'mode',
 'mul',
 'multiply',
 'name',
 'nbytes',
 'ndim',
 'ne',
 'nlargest',
 'nonzero',
 'notnull',
 'nsmallest',
 'nunique',
 'order',
 'pct_change',
 'pipe',
 'plot',
 'pop',
 'pow',
 'prod',
 'product',
 'ptp',
 'put',
 'quantile',
 'radd',
 'rank',
 'ravel',
 'rdiv',
 'real',
 'reindex',
 'reindex_axis',
 'reindex_like',
 'rename',
 'rename_axis',
 'reorder_levels',
 'repeat',
 'replace',
 'resample',
 'reset_index',
 'reshape',
 'rfloordiv',
 'rmod',
 'rmul',
 'rolling',
 'round',
 'rpow',
 'rsub',
 'rtruediv',
 'sample',
 'searchsorted',
 'select',
 'sem',
 'set_axis',
 'set_value',
 'shape',
 'shift',
 'size',
 'skew',
 'slice_shift',
 'sort',
 'sort_index',
 'sort_values',
 'sortlevel',
 'squeeze',
 'std',
 'str',
 'strides',
 'sub',
 'subtract',
 'sum',
 'swapaxes',
 'swaplevel',
 'tail',
 'take',
 'to_clipboard',
 'to_csv',
 'to_dense',
 'to_dict',
 'to_frame',
 'to_hdf',
 'to_json',
 'to_msgpack',
 'to_period',
 'to_pickle',
 'to_sparse',
 'to_sql',
 'to_string',
 'to_timestamp',
 'to_xarray',
 'tolist',
 'transpose',
 'truediv',
 'truncate',
 'tshift',
 'tz_convert',
 'tz_localize',
 'unique',
 'unstack',
 'update',
 'valid',
 'value_counts',
 'values',
 'var',
 'view',
 'where',
 'xs']



In [36]:

    
Employees.loc[Employees.LastName.map(lambda x: '-' in x), ['EmployeeID', 'FirstName', 'LastName']]









    Out[36]:






  
    
      
      EmployeeID
      FirstName
      LastName
    
  
  
    
      114
      284
      Tete
      Mensa-Annan
    
    
      134
      180
      Katie
      McAskill-White
    
    
      176
      280
      Pamela
      Ansman-Wolfe

3a. Provide a list of employees that are on salary and have more than 35 vacation hours left.

SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.VacationHours ,e.SalariedFlag FROM dbo.Employees AS e WHERE (e.SalariedFlag = 1) AND (e.VacationHours > 35);



In [38]:

    
Employees.loc[(Employees.SalariedFlag==1) & (Employees.VacationHours>35), ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours']]









    Out[38]:






  
    
      
      EmployeeID
      FirstName
      LastName
      SalariedFlag
      VacationHours
    
  
  
    
      6
      270
      François
      Ajenstat
      1
      67
    
    
      11
      248
      Mike
      Seamans
      1
      59
    
    
      19
      245
      Barbara
      Moreland
      1
      58
    
    
      37
      269
      Dan
      Bacon
      1
      72
    
    
      57
      263
      Jean
      Trenary
      1
      65
    
    
      65
      212
      Peng
      Wu
      1
      81
    
    
      66
      227
      Gary
      Altman
      1
      86
    
    
      93
      286
      Lynn
      Tsoflias
      1
      36
    
    
      101
      26
      Peter
      Krebs
      1
      43
    
    
      106
      16
      David
      Bradley
      1
      40
    
    
      108
      268
      Ramesh
      Meyyappan
      1
      73
    
    
      114
      284
      Tete
      Mensa-Annan
      1
      39
    
    
      121
      235
      Paula
      Barreto  de Mattos
      1
      54
    
    
      123
      1
      Ken
      Sánchez
      1
      99
    
    
      125
      249
      Wendy
      Kahn
      1
      55
    
    
      130
      8
      Diane
      Margheim
      1
      62
    
    
      140
      211
      Hazem
      Abolrous
      1
      80
    
    
      145
      267
      Karen
      Berg
      1
      74
    
    
      147
      250
      Sheela
      Word
      1
      49
    
    
      151
      121
      Pilar
      Ackerman
      1
      93
    
    
      167
      271
      Dan
      Wilson
      1
      66
    
    
      169
      272
      Janaina
      Bueno
      1
      71
    
    
      196
      7
      Dylan
      Miller
      1
      61
    
    
      210
      264
      Stephanie
      Conroy
      1
      68
    
    
      223
      9
      Gigi
      Matthew
      1
      63
    
    
      232
      228
      Christian
      Kleinerman
      1
      92
    
    
      235
      275
      Michael
      Blythe
      1
      38
    
    
      272
      241
      David
      Liu
      1
      57
    
    
      273
      25
      James
      Hamilton
      1
      64
    
    
      288
      289
      Jae
      Pak
      1
      37

3b. Show the same as above but limit it to American employees.

SELECT DISTINCT CountryName FROM dbo.Employees; SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.VacationHours ,e.SalariedFlag ,e.CountryName FROM dbo.Employees AS e WHERE (e.SalariedFlag = 1) AND (e.VacationHours > 35) AND e.CountryName = 'United States';



In [ ]:

    
Employees.loc[(Employees.SalariedFlag==1) & (Employees.VacationHours>35), ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours']]

3c. Show the same as above but limit it to non-American employees.

SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.VacationHours ,e.SalariedFlag ,e.CountryName FROM dbo.Employees AS e WHERE (e.SalariedFlag = 1) AND (e.VacationHours > 5) AND e.CountryName != 'United States';



In [39]:

    
Employees.loc[
    (Employees.SalariedFlag==1) & (Employees.VacationHours>35) & (Employees.CountryName != 'United States')
    , ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours']
]









    Out[39]:






  
    
      
      EmployeeID
      FirstName
      LastName
      SalariedFlag
      VacationHours
    
  
  
    
      93
      286
      Lynn
      Tsoflias
      1
      36
    
    
      288
      289
      Jae
      Pak
      1
      37

4a. List the married employees with more than 35 vacation hours, only ones living in Washington state.

SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.VacationHours ,e.SalariedFlag ,e.StateProvinceName ,e.CountryName FROM dbo.Employees AS e WHERE e.MaritalStatus = 'M' AND e.VacationHours > 35 AND e.StateProvinceName = 'Washington' ;



In [40]:

    
Employees.loc[
    (Employees.SalariedFlag==1) & (Employees.VacationHours>35) & (Employees.StateProvinceName == 'Washington')
    , ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours', 'StateProvinceName']
]









    Out[40]:






  
    
      
      EmployeeID
      FirstName
      LastName
      SalariedFlag
      VacationHours
      StateProvinceName
    
  
  
    
      6
      270
      François
      Ajenstat
      1
      67
      Washington
    
    
      11
      248
      Mike
      Seamans
      1
      59
      Washington
    
    
      19
      245
      Barbara
      Moreland
      1
      58
      Washington
    
    
      37
      269
      Dan
      Bacon
      1
      72
      Washington
    
    
      57
      263
      Jean
      Trenary
      1
      65
      Washington
    
    
      65
      212
      Peng
      Wu
      1
      81
      Washington
    
    
      66
      227
      Gary
      Altman
      1
      86
      Washington
    
    
      101
      26
      Peter
      Krebs
      1
      43
      Washington
    
    
      106
      16
      David
      Bradley
      1
      40
      Washington
    
    
      108
      268
      Ramesh
      Meyyappan
      1
      73
      Washington
    
    
      121
      235
      Paula
      Barreto  de Mattos
      1
      54
      Washington
    
    
      123
      1
      Ken
      Sánchez
      1
      99
      Washington
    
    
      125
      249
      Wendy
      Kahn
      1
      55
      Washington
    
    
      130
      8
      Diane
      Margheim
      1
      62
      Washington
    
    
      140
      211
      Hazem
      Abolrous
      1
      80
      Washington
    
    
      145
      267
      Karen
      Berg
      1
      74
      Washington
    
    
      147
      250
      Sheela
      Word
      1
      49
      Washington
    
    
      151
      121
      Pilar
      Ackerman
      1
      93
      Washington
    
    
      167
      271
      Dan
      Wilson
      1
      66
      Washington
    
    
      169
      272
      Janaina
      Bueno
      1
      71
      Washington
    
    
      196
      7
      Dylan
      Miller
      1
      61
      Washington
    
    
      210
      264
      Stephanie
      Conroy
      1
      68
      Washington
    
    
      223
      9
      Gigi
      Matthew
      1
      63
      Washington
    
    
      232
      228
      Christian
      Kleinerman
      1
      92
      Washington
    
    
      272
      241
      David
      Liu
      1
      57
      Washington
    
    
      273
      25
      James
      Hamilton
      1
      64
      Washington

4b. Change the logic to include anyone who meets any of the 3 conditions (i.e., people who are either married, live in Washington state, or have more than 35 vacation hours left)

SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.MaritalStatus ,e.VacationHours ,e.SalariedFlag ,e.StateProvinceName ,e.CountryName FROM dbo.Employees AS e WHERE e.MaritalStatus = 'M' OR e.VacationHours > 35 OR e.StateProvinceName = 'Washington' ;



In [41]:

    
Employees.loc[
    (Employees.SalariedFlag==1) | (Employees.VacationHours>35) | (Employees.StateProvinceName == 'Washington')
    , ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours', 'StateProvinceName']
]









    Out[41]:






  
    
      
      EmployeeID
      FirstName
      LastName
      SalariedFlag
      VacationHours
      StateProvinceName
    
  
  
    
      0
      259
      Ben
      Miller
      0
      55
      Washington
    
    
      1
      278
      Garrett
      Vargas
      1
      33
      Alberta
    
    
      2
      204
      Gabe
      Mares
      0
      57
      Washington
    
    
      3
      78
      Reuben
      D'sa
      0
      72
      Washington
    
    
      4
      255
      Gordon
      Hee
      0
      52
      Washington
    
    
      5
      66
      Karan
      Khanna
      0
      28
      Washington
    
    
      6
      270
      François
      Ajenstat
      1
      67
      Washington
    
    
      7
      22
      Sariya
      Harnpadoungsataya
      0
      45
      Washington
    
    
      8
      161
      Kirk
      Koenigsbauer
      0
      74
      Washington
    
    
      9
      124
      Kim
      Ralls
      0
      98
      Washington
    
    
      10
      10
      Michael
      Raheem
      1
      16
      California
    
    
      11
      248
      Mike
      Seamans
      1
      59
      Washington
    
    
      12
      175
      Reed
      Koch
      0
      23
      Washington
    
    
      13
      155
      Fadi
      Fakhouri
      0
      16
      Washington
    
    
      14
      144
      Paul
      Singh
      0
      6
      Washington
    
    
      15
      192
      Brenda
      Diaz
      0
      71
      Washington
    
    
      16
      166
      Jack
      Richins
      0
      68
      Washington
    
    
      17
      112
      John
      Evans
      0
      91
      Washington
    
    
      18
      203
      Ken
      Myer
      0
      51
      Alberta
    
    
      19
      245
      Barbara
      Moreland
      1
      58
      Washington
    
    
      20
      257
      Eric
      Kurjan
      0
      54
      Washington
    
    
      21
      244
      Bryan
      Walton
      0
      62
      Alberta
    
    
      22
      69
      Steve
      Masters
      0
      31
      Washington
    
    
      23
      210
      Belinda
      Newman
      0
      83
      Washington
    
    
      24
      163
      Alex
      Nayberg
      0
      77
      Washington
    
    
      25
      118
      Don
      Hall
      0
      88
      Washington
    
    
      26
      176
      David
      Lawrence
      0
      29
      Washington
    
    
      27
      147
      Sandra
      Reátegui Alayo
      0
      37
      Washington
    
    
      28
      138
      Samantha
      Smith
      0
      7
      Washington
    
    
      29
      95
      Jim
      Scardelis
      0
      1
      Washington
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      261
      70
      David
      Ortiz
      0
      33
      Washington
    
    
      262
      219
      Sean
      Chai
      0
      78
      Washington
    
    
      263
      148
      Jason
      Watters
      0
      38
      Washington
    
    
      264
      140
      Prasanna
      Samarawickrama
      0
      5
      Washington
    
    
      265
      290
      Ranjit
      Varkey Chudukatil
      1
      34
      Gironde
    
    
      266
      5
      Gail
      Erickson
      1
      5
      Washington
    
    
      267
      252
      Arvind
      Rao
      0
      60
      Washington
    
    
      268
      266
      Peter
      Connelly
      0
      69
      Washington
    
    
      269
      233
      Magnus
      Hedlund
      0
      87
      Washington
    
    
      270
      83
      Patrick
      Cook
      0
      61
      Washington
    
    
      271
      251
      Mikael
      Sandberg
      0
      59
      Washington
    
    
      272
      241
      David
      Liu
      1
      57
      Washington
    
    
      273
      25
      James
      Hamilton
      1
      64
      Washington
    
    
      274
      108
      Jinghao
      Liu
      0
      77
      Washington
    
    
      275
      239
      Mindy
      Martin
      0
      51
      Washington
    
    
      276
      27
      Jo
      Brown
      0
      80
      Washington
    
    
      277
      186
      Shane
      Kim
      0
      75
      Washington
    
    
      278
      38
      Kim
      Abercrombie
      0
      24
      Washington
    
    
      279
      154
      Raymond
      Sam
      0
      10
      Washington
    
    
      280
      114
      Mindaugas
      Krapauskas
      0
      97
      Washington
    
    
      281
      123
      Vamsi
      Kuppa
      0
      95
      Washington
    
    
      282
      29
      Mark
      McArthur
      0
      19
      Washington
    
    
      283
      101
      Houman
      Pournasseh
      0
      3
      Washington
    
    
      284
      221
      Chris
      Norred
      0
      75
      Washington
    
    
      285
      18
      John
      Wood
      0
      48
      Washington
    
    
      286
      46
      Eugene
      Kogan
      0
      36
      Washington
    
    
      287
      202
      Tawana
      Nusbaum
      0
      50
      Washington
    
    
      288
      289
      Jae
      Pak
      1
      37
      England
    
    
      289
      288
      Rachel
      Valdez
      1
      35
      Hamburg
    
    
      290
      999
      Chadwick
      Smith
      0
      55
      Georgia
    
  

289 rows × 6 columns

4c. Show the same as above, but only for Production Technicians

SELECT DISTINCT JobTitle FROM dbo.Employees; --- look at job titles SELECT e.EmployeeID ,e.FirstName ,e.LastName ,e.MaritalStatus ,e.JobTitle ,e.VacationHours ,e.SalariedFlag ,e.StateProvinceName ,e.CountryName FROM dbo.Employees AS e WHERE e.JobTitle LIKE 'Production Technician%' AND (e.MaritalStatus = 'M' OR e.VacationHours > 35 OR e.StateProvinceName = 'Washington') ;



In [47]:

    
'abc'.startswith('ab')









    Out[47]:





True



In [49]:

    
Employees.loc[ Employees.JobTitle.map(lambda x: x.startswith('Production Technician'))
    & (
        (Employees.SalariedFlag==1) | (Employees.VacationHours>35) | (Employees.StateProvinceName == 'Washington')
    ), ['EmployeeID', 'FirstName', 'LastName', 'SalariedFlag', 'VacationHours', 'StateProvinceName', 'JobTitle']
]









    Out[49]:






  
    
      
      EmployeeID
      FirstName
      LastName
      SalariedFlag
      VacationHours
      StateProvinceName
      JobTitle
    
  
  
    
      2
      204
      Gabe
      Mares
      0
      57
      Washington
      Production Technician - WC40
    
    
      5
      66
      Karan
      Khanna
      0
      28
      Washington
      Production Technician - WC60
    
    
      8
      161
      Kirk
      Koenigsbauer
      0
      74
      Washington
      Production Technician - WC45
    
    
      12
      175
      Reed
      Koch
      0
      23
      Washington
      Production Technician - WC30
    
    
      13
      155
      Fadi
      Fakhouri
      0
      16
      Washington
      Production Technician - WC20
    
    
      14
      144
      Paul
      Singh
      0
      6
      Washington
      Production Technician - WC20
    
    
      17
      112
      John
      Evans
      0
      91
      Washington
      Production Technician - WC50
    
    
      18
      203
      Ken
      Myer
      0
      51
      Alberta
      Production Technician - WC40
    
    
      22
      69
      Steve
      Masters
      0
      31
      Washington
      Production Technician - WC60
    
    
      23
      210
      Belinda
      Newman
      0
      83
      Washington
      Production Technician - WC45
    
    
      24
      163
      Alex
      Nayberg
      0
      77
      Washington
      Production Technician - WC45
    
    
      25
      118
      Don
      Hall
      0
      88
      Washington
      Production Technician - WC50
    
    
      26
      176
      David
      Lawrence
      0
      29
      Washington
      Production Technician - WC30
    
    
      27
      147
      Sandra
      Reátegui Alayo
      0
      37
      Washington
      Production Technician - WC30
    
    
      28
      138
      Samantha
      Smith
      0
      7
      Washington
      Production Technician - WC20
    
    
      29
      95
      Jim
      Scardelis
      0
      1
      Washington
      Production Technician - WC50
    
    
      31
      86
      Ryan
      Cornelsen
      0
      59
      Washington
      Production Technician - WC40
    
    
      34
      72
      Steven
      Selikoff
      0
      41
      Washington
      Production Technician - WC30
    
    
      35
      162
      Laura
      Steele
      0
      75
      Washington
      Production Technician - WC45
    
    
      36
      107
      Christopher
      Hill
      0
      91
      Washington
      Production Technician - WC10
    
    
      38
      187
      Yvonne
      McKay
      0
      79
      Washington
      Production Technician - WC45
    
    
      39
      75
      Michiko
      Osada
      0
      44
      Washington
      Production Technician - WC30
    
    
      41
      116
      Michael
      Patten
      0
      98
      Washington
      Production Technician - WC50
    
    
      42
      30
      Britta
      Simon
      0
      14
      Washington
      Production Technician - WC60
    
    
      43
      198
      Lorraine
      Nay
      0
      49
      Washington
      Production Technician - WC40
    
    
      44
      133
      Michael
      Rothkugel
      0
      70
      Washington
      Production Technician - WC40
    
    
      45
      31
      Margie
      Shoop
      0
      18
      Washington
      Production Technician - WC60
    
    
      46
      168
      Garrett
      Young
      0
      34
      Washington
      Production Technician - WC30
    
    
      49
      199
      Paula
      Nartker
      0
      54
      Washington
      Production Technician - WC40
    
    
      50
      48
      Ruth
      Ellerbrock
      0
      83
      Washington
      Production Technician - WC10
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      234
      100
      Lolan
      Song
      0
      7
      Washington
      Production Technician - WC50
    
    
      236
      85
      Brian
      Goldstein
      0
      63
      Washington
      Production Technician - WC40
    
    
      237
      61
      Diane
      Tibbott
      0
      8
      Washington
      Production Technician - WC50
    
    
      238
      106
      John
      Kane
      0
      92
      Washington
      Production Technician - WC10
    
    
      239
      200
      Frank
      Lee
      0
      56
      Washington
      Production Technician - WC40
    
    
      241
      146
      Jian Shuo
      Wang
      0
      36
      Washington
      Production Technician - WC30
    
    
      242
      103
      Ebru
      Ersan
      0
      93
      Washington
      Production Technician - WC10
    
    
      243
      73
      Carole
      Poland
      0
      42
      Washington
      Production Technician - WC30
    
    
      244
      185
      Stefen
      Hesse
      0
      19
      Washington
      Production Technician - WC20
    
    
      245
      37
      Chris
      Okelberry
      0
      16
      Washington
      Production Technician - WC60
    
    
      247
      179
      Jan
      Miksovsky
      0
      30
      Washington
      Production Technician - WC30
    
    
      248
      91
      Kimberly
      Zimmerman
      0
      95
      Washington
      Production Technician - WC10
    
    
      250
      207
      Greg
      Alderson
      0
      85
      Washington
      Production Technician - WC45
    
    
      251
      82
      Jack
      Creasey
      0
      62
      Washington
      Production Technician - WC40
    
    
      253
      67
      Jay
      Adams
      0
      32
      Washington
      Production Technician - WC60
    
    
      255
      53
      Diane
      Glimp
      0
      87
      Washington
      Production Technician - WC10
    
    
      257
      169
      Susan
      Metters
      0
      31
      Washington
      Production Technician - WC30
    
    
      258
      209
      Kathie
      Flood
      0
      87
      Washington
      Production Technician - WC45
    
    
      259
      50
      Sidney
      Higa
      0
      84
      Washington
      Production Technician - WC10
    
    
      261
      70
      David
      Ortiz
      0
      33
      Washington
      Production Technician - WC60
    
    
      263
      148
      Jason
      Watters
      0
      38
      Washington
      Production Technician - WC30
    
    
      264
      140
      Prasanna
      Samarawickrama
      0
      5
      Washington
      Production Technician - WC20
    
    
      270
      83
      Patrick
      Cook
      0
      61
      Washington
      Production Technician - WC40
    
    
      278
      38
      Kim
      Abercrombie
      0
      24
      Washington
      Production Technician - WC60
    
    
      279
      154
      Raymond
      Sam
      0
      10
      Washington
      Production Technician - WC20
    
    
      280
      114
      Mindaugas
      Krapauskas
      0
      97
      Washington
      Production Technician - WC50
    
    
      282
      29
      Mark
      McArthur
      0
      19
      Washington
      Production Technician - WC60
    
    
      283
      101
      Houman
      Pournasseh
      0
      3
      Washington
      Production Technician - WC50
    
    
      286
      46
      Eugene
      Kogan
      0
      36
      Washington
      Production Technician - WC60
    
    
      287
      202
      Tawana
      Nusbaum
      0
      50
      Washington
      Production Technician - WC40
    
  

155 rows × 7 columns

5a. List all employees living in Redmond, Seattle, and Bellevue, showing EmployeeID, FirstName, LastName, and City. Sort the list alphabetically by city.

SELECT e.EmployeeID, e.FirstName, e.LastName, e.City FROM dbo.Employees AS e WHERE e.City in ('Redmond', 'Seattle', 'Bellevue') ORDER BY e.City ;



In [53]:

    
Employees.loc[Employees.City.isin(['Redmond', 'Seattle', 'Bellevue']),
              ['EmployeeID', 'FirstName', 'LastName', 'City']].sort_values('City')









    Out[53]:






  
    
      
      EmployeeID
      FirstName
      LastName
      City
    
  
  
    
      203
      142
      Olinda
      Turner
      Bellevue
    
    
      78
      89
      Patrick
      Wedge
      Bellevue
    
    
      83
      32
      Rebecca
      Laszlo
      Bellevue
    
    
      249
      21
      Terry
      Eminhizer
      Bellevue
    
    
      88
      113
      Linda
      Moschell
      Bellevue
    
    
      89
      196
      Shammi
      Mohamed
      Bellevue
    
    
      160
      34
      Suchitra
      Mohan
      Bellevue
    
    
      96
      99
      Nuan
      Yu
      Bellevue
    
    
      250
      207
      Greg
      Alderson
      Bellevue
    
    
      101
      26
      Peter
      Krebs
      Bellevue
    
    
      104
      151
      Rostislav
      Shabalin
      Bellevue
    
    
      205
      128
      Paul
      Komosinski
      Bellevue
    
    
      111
      177
      Russell
      King
      Bellevue
    
    
      223
      9
      Gigi
      Matthew
      Bellevue
    
    
      121
      235
      Paula
      Barreto  de Mattos
      Bellevue
    
    
      227
      136
      Sylvester
      Valdez
      Bellevue
    
    
      131
      149
      Andy
      Ruth
      Bellevue
    
    
      102
      93
      Kok-Ho
      Loh
      Bellevue
    
    
      73
      258
      Erin
      Hagens
      Bellevue
    
    
      167
      271
      Dan
      Wilson
      Bellevue
    
    
      263
      148
      Jason
      Watters
      Bellevue
    
    
      14
      144
      Paul
      Singh
      Bellevue
    
    
      208
      141
      Min
      Su
      Bellevue
    
    
      266
      5
      Gail
      Erickson
      Bellevue
    
    
      187
      224
      William
      Vong
      Bellevue
    
    
      172
      105
      Kevin
      Homer
      Bellevue
    
    
      179
      206
      Stuart
      Munson
      Bellevue
    
    
      195
      159
      Terrence
      Earls
      Bellevue
    
    
      28
      138
      Samantha
      Smith
      Bellevue
    
    
      175
      19
      Mary
      Dempsey
      Bellevue
    
    
      ...
      ...
      ...
      ...
      ...
    
    
      3
      78
      Reuben
      D'sa
      Seattle
    
    
      5
      66
      Karan
      Khanna
      Seattle
    
    
      8
      161
      Kirk
      Koenigsbauer
      Seattle
    
    
      15
      192
      Brenda
      Diaz
      Seattle
    
    
      17
      112
      John
      Evans
      Seattle
    
    
      19
      245
      Barbara
      Moreland
      Seattle
    
    
      22
      69
      Steve
      Masters
      Seattle
    
    
      27
      147
      Sandra
      Reátegui Alayo
      Seattle
    
    
      49
      199
      Paula
      Nartker
      Seattle
    
    
      56
      104
      Mary
      Baker
      Seattle
    
    
      69
      62
      John
      Campbell
      Seattle
    
    
      75
      156
      Lane
      Sacksteder
      Seattle
    
    
      85
      132
      Nicole
      Holliday
      Seattle
    
    
      202
      87
      Cristian
      Petculescu
      Seattle
    
    
      119
      131
      Baris
      Cetinok
      Seattle
    
    
      134
      180
      Katie
      McAskill-White
      Seattle
    
    
      136
      152
      Yuhong
      Li
      Seattle
    
    
      137
      171
      David
      Yalovsky
      Seattle
    
    
      142
      36
      Jose
      Lugo
      Seattle
    
    
      148
      60
      Pete
      Male
      Seattle
    
    
      149
      120
      Kitti
      Lertpiriyasuwat
      Seattle
    
    
      151
      121
      Pilar
      Ackerman
      Seattle
    
    
      156
      77
      Merav
      Netz
      Seattle
    
    
      180
      71
      Michael
      Ray
      Seattle
    
    
      182
      102
      Zheng
      Mu
      Seattle
    
    
      185
      165
      Chris
      Preston
      Seattle
    
    
      188
      190
      Robert
      Rounthwaite
      Seattle
    
    
      198
      40
      JoLynn
      Dobney
      Seattle
    
    
      127
      39
      Ed
      Dudenhoefer
      Seattle
    
    
      287
      202
      Tawana
      Nusbaum
      Seattle
    
  

101 rows × 4 columns

5b. For the list above, make sure these are only in the state of Washington, just to be careful.

SELECT e.EmployeeID, e.FirstName, e.LastName, e.City, e.StateProvinceName FROM dbo.Employees AS e WHERE e.City in ('Redmond', 'Seattle', 'Bellevue') AND e.StateProvinceName = 'Washington' ORDER BY e.City ;



In [54]:

    
Employees.loc[Employees.City.isin(['Redmond', 'Seattle', 'Bellevue'])
              & (Employees.StateProvinceName=='Washington'),
              ['EmployeeID', 'FirstName', 'LastName', 'City']].sort_values('City')









    Out[54]:






  
    
      
      EmployeeID
      FirstName
      LastName
      City
    
  
  
    
      203
      142
      Olinda
      Turner
      Bellevue
    
    
      78
      89
      Patrick
      Wedge
      Bellevue
    
    
      83
      32
      Rebecca
      Laszlo
      Bellevue
    
    
      249
      21
      Terry
      Eminhizer
      Bellevue
    
    
      88
      113
      Linda
      Moschell
      Bellevue
    
    
      89
      196
      Shammi
      Mohamed
      Bellevue
    
    
      160
      34
      Suchitra
      Mohan
      Bellevue
    
    
      96
      99
      Nuan
      Yu
      Bellevue
    
    
      250
      207
      Greg
      Alderson
      Bellevue
    
    
      101
      26
      Peter
      Krebs
      Bellevue
    
    
      104
      151
      Rostislav
      Shabalin
      Bellevue
    
    
      205
      128
      Paul
      Komosinski
      Bellevue
    
    
      111
      177
      Russell
      King
      Bellevue
    
    
      223
      9
      Gigi
      Matthew
      Bellevue
    
    
      121
      235
      Paula
      Barreto  de Mattos
      Bellevue
    
    
      227
      136
      Sylvester
      Valdez
      Bellevue
    
    
      131
      149
      Andy
      Ruth
      Bellevue
    
    
      102
      93
      Kok-Ho
      Loh
      Bellevue
    
    
      73
      258
      Erin
      Hagens
      Bellevue
    
    
      167
      271
      Dan
      Wilson
      Bellevue
    
    
      263
      148
      Jason
      Watters
      Bellevue
    
    
      14
      144
      Paul
      Singh
      Bellevue
    
    
      208
      141
      Min
      Su
      Bellevue
    
    
      266
      5
      Gail
      Erickson
      Bellevue
    
    
      187
      224
      William
      Vong
      Bellevue
    
    
      172
      105
      Kevin
      Homer
      Bellevue
    
    
      179
      206
      Stuart
      Munson
      Bellevue
    
    
      195
      159
      Terrence
      Earls
      Bellevue
    
    
      28
      138
      Samantha
      Smith
      Bellevue
    
    
      175
      19
      Mary
      Dempsey
      Bellevue
    
    
      ...
      ...
      ...
      ...
      ...
    
    
      3
      78
      Reuben
      D'sa
      Seattle
    
    
      5
      66
      Karan
      Khanna
      Seattle
    
    
      8
      161
      Kirk
      Koenigsbauer
      Seattle
    
    
      15
      192
      Brenda
      Diaz
      Seattle
    
    
      17
      112
      John
      Evans
      Seattle
    
    
      19
      245
      Barbara
      Moreland
      Seattle
    
    
      22
      69
      Steve
      Masters
      Seattle
    
    
      27
      147
      Sandra
      Reátegui Alayo
      Seattle
    
    
      49
      199
      Paula
      Nartker
      Seattle
    
    
      56
      104
      Mary
      Baker
      Seattle
    
    
      69
      62
      John
      Campbell
      Seattle
    
    
      75
      156
      Lane
      Sacksteder
      Seattle
    
    
      85
      132
      Nicole
      Holliday
      Seattle
    
    
      202
      87
      Cristian
      Petculescu
      Seattle
    
    
      119
      131
      Baris
      Cetinok
      Seattle
    
    
      134
      180
      Katie
      McAskill-White
      Seattle
    
    
      136
      152
      Yuhong
      Li
      Seattle
    
    
      137
      171
      David
      Yalovsky
      Seattle
    
    
      142
      36
      Jose
      Lugo
      Seattle
    
    
      148
      60
      Pete
      Male
      Seattle
    
    
      149
      120
      Kitti
      Lertpiriyasuwat
      Seattle
    
    
      151
      121
      Pilar
      Ackerman
      Seattle
    
    
      156
      77
      Merav
      Netz
      Seattle
    
    
      180
      71
      Michael
      Ray
      Seattle
    
    
      182
      102
      Zheng
      Mu
      Seattle
    
    
      185
      165
      Chris
      Preston
      Seattle
    
    
      188
      190
      Robert
      Rounthwaite
      Seattle
    
    
      198
      40
      JoLynn
      Dobney
      Seattle
    
    
      127
      39
      Ed
      Dudenhoefer
      Seattle
    
    
      287
      202
      Tawana
      Nusbaum
      Seattle
    
  

101 rows × 4 columns

6. Provide a list of employees who have no title, whether it's a NULL or empty string.

SELECT e.EmployeeID, e.Title, e.FirstName, e.LastName FROM dbo.Employees AS e WHERE e.Title IS NULL OR e.Title = '' ;



In [93]:

    
### Note: when exporting the SQL table to Excel the one row with a ' ' in the Title was already removed...Sorry

Employees.loc[pd.isnull(Employees.Title) | Employees.Title.map(lambda x: len(str(x).strip())==0)
              ,['EmployeeID', 'Title', 'FirstName', 'LastName', 'City']]









    Out[93]:






  
    
      
      EmployeeID
      Title
      FirstName
      LastName
      City
    
  
  
    
      0
      259
      NaN
      Ben
      Miller
      Redmond
    
    
      1
      278
      NaN
      Garrett
      Vargas
      Calgary
    
    
      2
      204
      NaN
      Gabe
      Mares
      Edmonds
    
    
      3
      78
      NaN
      Reuben
      D'sa
      Seattle
    
    
      4
      255
      NaN
      Gordon
      Hee
      Bellevue
    
    
      5
      66
      NaN
      Karan
      Khanna
      Seattle
    
    
      6
      270
      NaN
      François
      Ajenstat
      Issaquah
    
    
      7
      22
      NaN
      Sariya
      Harnpadoungsataya
      Everett
    
    
      8
      161
      NaN
      Kirk
      Koenigsbauer
      Seattle
    
    
      9
      124
      NaN
      Kim
      Ralls
      Bothell
    
    
      10
      10
      NaN
      Michael
      Raheem
      San Francisco
    
    
      11
      248
      NaN
      Mike
      Seamans
      Index
    
    
      12
      175
      NaN
      Reed
      Koch
      Redmond
    
    
      13
      155
      NaN
      Fadi
      Fakhouri
      Snohomish
    
    
      14
      144
      NaN
      Paul
      Singh
      Bellevue
    
    
      15
      192
      NaN
      Brenda
      Diaz
      Seattle
    
    
      16
      166
      NaN
      Jack
      Richins
      Monroe
    
    
      17
      112
      NaN
      John
      Evans
      Seattle
    
    
      18
      203
      NaN
      Ken
      Myer
      Calgary
    
    
      19
      245
      NaN
      Barbara
      Moreland
      Seattle
    
    
      20
      257
      NaN
      Eric
      Kurjan
      Renton
    
    
      21
      244
      NaN
      Bryan
      Walton
      Calgary
    
    
      22
      69
      NaN
      Steve
      Masters
      Seattle
    
    
      23
      210
      NaN
      Belinda
      Newman
      Bothell
    
    
      24
      163
      NaN
      Alex
      Nayberg
      Newport Hills
    
    
      25
      118
      NaN
      Don
      Hall
      Carnation
    
    
      26
      176
      NaN
      David
      Lawrence
      Monroe
    
    
      27
      147
      NaN
      Sandra
      Reátegui Alayo
      Seattle
    
    
      28
      138
      NaN
      Samantha
      Smith
      Bellevue
    
    
      29
      95
      NaN
      Jim
      Scardelis
      Edmonds
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      260
      222
      NaN
      A.  Scott
      Wright
      Newport Hills
    
    
      261
      70
      NaN
      David
      Ortiz
      Seattle
    
    
      262
      219
      NaN
      Sean
      Chai
      Issaquah
    
    
      263
      148
      NaN
      Jason
      Watters
      Bellevue
    
    
      264
      140
      NaN
      Prasanna
      Samarawickrama
      Redmond
    
    
      265
      290
      NaN
      Ranjit
      Varkey Chudukatil
      Bordeaux
    
    
      267
      252
      NaN
      Arvind
      Rao
      Renton
    
    
      268
      266
      NaN
      Peter
      Connelly
      Issaquah
    
    
      269
      233
      NaN
      Magnus
      Hedlund
      Renton
    
    
      270
      83
      NaN
      Patrick
      Cook
      Seattle
    
    
      271
      251
      NaN
      Mikael
      Sandberg
      Bothell
    
    
      272
      241
      NaN
      David
      Liu
      Gold Bar
    
    
      273
      25
      NaN
      James
      Hamilton
      Monroe
    
    
      274
      108
      NaN
      Jinghao
      Liu
      Carnation
    
    
      275
      239
      NaN
      Mindy
      Martin
      Newport Hills
    
    
      276
      27
      NaN
      Jo
      Brown
      Duvall
    
    
      277
      186
      NaN
      Shane
      Kim
      Bellevue
    
    
      278
      38
      NaN
      Kim
      Abercrombie
      Carnation
    
    
      279
      154
      NaN
      Raymond
      Sam
      Edmonds
    
    
      280
      114
      NaN
      Mindaugas
      Krapauskas
      Edmonds
    
    
      281
      123
      NaN
      Vamsi
      Kuppa
      Bothell
    
    
      282
      29
      NaN
      Mark
      McArthur
      Redmond
    
    
      283
      101
      NaN
      Houman
      Pournasseh
      Redmond
    
    
      284
      221
      NaN
      Chris
      Norred
      Issaquah
    
    
      285
      18
      NaN
      John
      Wood
      Redmond
    
    
      286
      46
      NaN
      Eugene
      Kogan
      Duvall
    
    
      287
      202
      NaN
      Tawana
      Nusbaum
      Seattle
    
    
      288
      289
      NaN
      Jae
      Pak
      Cambridge
    
    
      289
      288
      NaN
      Rachel
      Valdez
      Berlin
    
    
      290
      999
      NaN
      Chadwick
      Smith
      Atlanta
    
  

283 rows × 5 columns

7a. Provide a list of employees who have at least 60 vacation hours left.

SELECT e.EmployeeID, e.Title, e.FirstName, e.LastName, e.VacationHours, e.SickLeaveHours FROM dbo.Employees AS e WHERE e.VacationHours >= 60 ;



In [111]:

    
Employees.loc[Employees.VacationHours >= 60
              , ['EmployeeID', 'FirstName', 'LastName', 'VacationHours', 'SickLeaveHours']]









    Out[111]:






  
    
      
      EmployeeID
      FirstName
      LastName
      VacationHours
      SickLeaveHours
    
  
  
    
      3
      78
      Reuben
      D'sa
      72
      56
    
    
      6
      270
      François
      Ajenstat
      67
      53
    
    
      8
      161
      Kirk
      Koenigsbauer
      74
      57
    
    
      9
      124
      Kim
      Ralls
      98
      69
    
    
      15
      192
      Brenda
      Diaz
      71
      55
    
    
      16
      166
      Jack
      Richins
      68
      54
    
    
      17
      112
      John
      Evans
      91
      65
    
    
      21
      244
      Bryan
      Walton
      62
      51
    
    
      23
      210
      Belinda
      Newman
      83
      61
    
    
      24
      163
      Alex
      Nayberg
      77
      58
    
    
      25
      118
      Don
      Hall
      88
      64
    
    
      30
      231
      Jo
      Berry
      91
      65
    
    
      33
      214
      Andreas
      Berglund
      84
      62
    
    
      35
      162
      Laura
      Steele
      75
      57
    
    
      36
      107
      Christopher
      Hill
      91
      65
    
    
      37
      269
      Dan
      Bacon
      72
      56
    
    
      38
      187
      Yvonne
      McKay
      79
      59
    
    
      41
      116
      Michael
      Patten
      98
      69
    
    
      44
      133
      Michael
      Rothkugel
      70
      55
    
    
      50
      48
      Ruth
      Ellerbrock
      83
      61
    
    
      51
      126
      Jimmy
      Bischoff
      96
      68
    
    
      52
      230
      Stuart
      Macrae
      88
      64
    
    
      56
      104
      Mary
      Baker
      94
      67
    
    
      57
      263
      Jean
      Trenary
      65
      52
    
    
      58
      232
      Pat
      Coleman
      89
      64
    
    
      60
      81
      Mihail
      Frintu
      64
      52
    
    
      62
      119
      Michael
      Entin
      94
      67
    
    
      65
      212
      Peng
      Wu
      81
      60
    
    
      66
      227
      Gary
      Altman
      86
      63
    
    
      69
      62
      John
      Campbell
      81
      60
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      205
      128
      Paul
      Komosinski
      68
      54
    
    
      209
      208
      Scott
      Gode
      86
      63
    
    
      210
      264
      Stephanie
      Conroy
      68
      54
    
    
      217
      242
      Deborah
      Poe
      60
      50
    
    
      221
      191
      Lionel
      Penuchot
      78
      59
    
    
      223
      9
      Gigi
      Matthew
      63
      51
    
    
      229
      216
      Sean
      Alexander
      82
      61
    
    
      232
      228
      Christian
      Kleinerman
      92
      66
    
    
      236
      85
      Brian
      Goldstein
      63
      51
    
    
      238
      106
      John
      Kane
      92
      66
    
    
      242
      103
      Ebru
      Ersan
      93
      66
    
    
      246
      215
      Mark
      Harrington
      83
      61
    
    
      248
      91
      Kimberly
      Zimmerman
      95
      67
    
    
      250
      207
      Greg
      Alderson
      85
      62
    
    
      251
      82
      Jack
      Creasey
      62
      51
    
    
      255
      53
      Diane
      Glimp
      87
      63
    
    
      258
      209
      Kathie
      Flood
      87
      63
    
    
      259
      50
      Sidney
      Higa
      84
      62
    
    
      262
      219
      Sean
      Chai
      78
      59
    
    
      267
      252
      Arvind
      Rao
      60
      50
    
    
      268
      266
      Peter
      Connelly
      69
      54
    
    
      269
      233
      Magnus
      Hedlund
      87
      63
    
    
      270
      83
      Patrick
      Cook
      61
      50
    
    
      273
      25
      James
      Hamilton
      64
      52
    
    
      274
      108
      Jinghao
      Liu
      77
      58
    
    
      276
      27
      Jo
      Brown
      80
      60
    
    
      277
      186
      Shane
      Kim
      75
      57
    
    
      280
      114
      Mindaugas
      Krapauskas
      97
      68
    
    
      281
      123
      Vamsi
      Kuppa
      95
      67
    
    
      284
      221
      Chris
      Norred
      75
      57
    
  

120 rows × 5 columns

7b. Provide a list of employees who have less than 60 vacation hours left.

SELECT e.EmployeeID, e.Title, e.FirstName, e.LastName, e.VacationHours, e.SickLeaveHours FROM dbo.Employees AS e WHERE e.VacationHours < 60 ;



In [106]:

    
Employees.loc[Employees.VacationHours < 60
              , ['EmployeeID', 'FirstName', 'LastName', 'VacationHours', 'SickLeaveHours']]









    Out[106]:






  
    
      
      EmployeeID
      Title
      FirstName
      LastName
      VacationHours
      SickLeaveHours
    
  
  
    
      0
      259
      NaN
      Ben
      Miller
      55
      47
    
    
      1
      278
      NaN
      Garrett
      Vargas
      33
      36
    
    
      2
      204
      NaN
      Gabe
      Mares
      57
      48
    
    
      4
      255
      NaN
      Gordon
      Hee
      52
      46
    
    
      5
      66
      NaN
      Karan
      Khanna
      28
      34
    
    
      7
      22
      NaN
      Sariya
      Harnpadoungsataya
      45
      42
    
    
      10
      10
      NaN
      Michael
      Raheem
      16
      64
    
    
      11
      248
      NaN
      Mike
      Seamans
      59
      49
    
    
      12
      175
      NaN
      Reed
      Koch
      23
      31
    
    
      13
      155
      NaN
      Fadi
      Fakhouri
      16
      28
    
    
      14
      144
      NaN
      Paul
      Singh
      6
      23
    
    
      18
      203
      NaN
      Ken
      Myer
      51
      45
    
    
      19
      245
      NaN
      Barbara
      Moreland
      58
      49
    
    
      20
      257
      NaN
      Eric
      Kurjan
      54
      47
    
    
      22
      69
      NaN
      Steve
      Masters
      31
      35
    
    
      26
      176
      NaN
      David
      Lawrence
      29
      34
    
    
      27
      147
      NaN
      Sandra
      Reátegui Alayo
      37
      38
    
    
      28
      138
      NaN
      Samantha
      Smith
      7
      23
    
    
      29
      95
      NaN
      Jim
      Scardelis
      1
      20
    
    
      31
      86
      NaN
      Ryan
      Cornelsen
      59
      49
    
    
      32
      181
      NaN
      Michael
      Hines
      20
      30
    
    
      34
      72
      NaN
      Steven
      Selikoff
      41
      40
    
    
      39
      75
      NaN
      Michiko
      Osada
      44
      42
    
    
      40
      12
      NaN
      Thierry
      D'Hers
      9
      24
    
    
      42
      30
      NaN
      Britta
      Simon
      14
      27
    
    
      43
      198
      NaN
      Lorraine
      Nay
      49
      44
    
    
      45
      31
      NaN
      Margie
      Shoop
      18
      29
    
    
      46
      168
      NaN
      Garrett
      Young
      34
      37
    
    
      47
      236
      NaN
      Grant
      Culbertson
      53
      46
    
    
      48
      3
      NaN
      Roberto
      Tamburello
      2
      21
    
    
      ...
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      241
      146
      NaN
      Jian Shuo
      Wang
      36
      38
    
    
      243
      73
      NaN
      Carole
      Poland
      42
      41
    
    
      244
      185
      NaN
      Stefen
      Hesse
      19
      29
    
    
      245
      37
      NaN
      Chris
      Okelberry
      16
      28
    
    
      247
      179
      NaN
      Jan
      Miksovsky
      30
      35
    
    
      249
      21
      NaN
      Terry
      Eminhizer
      44
      42
    
    
      252
      254
      NaN
      Fukiko
      Ogisu
      57
      48
    
    
      253
      67
      NaN
      Jay
      Adams
      32
      36
    
    
      254
      262
      NaN
      David
      Barber
      56
      48
    
    
      256
      282
      NaN
      José
      Saraiva
      31
      35
    
    
      257
      169
      NaN
      Susan
      Metters
      31
      35
    
    
      260
      222
      NaN
      A.  Scott
      Wright
      44
      42
    
    
      261
      70
      NaN
      David
      Ortiz
      33
      36
    
    
      263
      148
      NaN
      Jason
      Watters
      38
      39
    
    
      264
      140
      NaN
      Prasanna
      Samarawickrama
      5
      22
    
    
      265
      290
      NaN
      Ranjit
      Varkey Chudukatil
      34
      37
    
    
      266
      5
      Ms.
      Gail
      Erickson
      5
      22
    
    
      271
      251
      NaN
      Mikael
      Sandberg
      59
      49
    
    
      272
      241
      NaN
      David
      Liu
      57
      48
    
    
      275
      239
      NaN
      Mindy
      Martin
      51
      45
    
    
      278
      38
      NaN
      Kim
      Abercrombie
      24
      32
    
    
      279
      154
      NaN
      Raymond
      Sam
      10
      25
    
    
      282
      29
      NaN
      Mark
      McArthur
      19
      29
    
    
      283
      101
      NaN
      Houman
      Pournasseh
      3
      21
    
    
      285
      18
      NaN
      John
      Wood
      48
      44
    
    
      286
      46
      NaN
      Eugene
      Kogan
      36
      38
    
    
      287
      202
      NaN
      Tawana
      Nusbaum
      50
      45
    
    
      288
      289
      NaN
      Jae
      Pak
      37
      38
    
    
      289
      288
      NaN
      Rachel
      Valdez
      35
      37
    
    
      290
      999
      NaN
      Chadwick
      Smith
      55
      47
    
  

171 rows × 6 columns

7c. Show me employees who have more than 20 and less than 60 vacation hours left.

SELECT e.EmployeeID, e.Title, e.FirstName, e.LastName, e.VacationHours, e.SickLeaveHours FROM dbo.Employees AS e WHERE e.VacationHours BETWEEN 21 AND 59 ;



In [112]:

    
Employees.loc[(Employees.VacationHours > 20) & (Employees.VacationHours < 60)
              , ['EmployeeID', 'FirstName', 'LastName', 'VacationHours', 'SickLeaveHours']]









    Out[112]:






  
    
      
      EmployeeID
      FirstName
      LastName
      VacationHours
      SickLeaveHours
    
  
  
    
      0
      259
      Ben
      Miller
      55
      47
    
    
      1
      278
      Garrett
      Vargas
      33
      36
    
    
      2
      204
      Gabe
      Mares
      57
      48
    
    
      4
      255
      Gordon
      Hee
      52
      46
    
    
      5
      66
      Karan
      Khanna
      28
      34
    
    
      7
      22
      Sariya
      Harnpadoungsataya
      45
      42
    
    
      11
      248
      Mike
      Seamans
      59
      49
    
    
      12
      175
      Reed
      Koch
      23
      31
    
    
      18
      203
      Ken
      Myer
      51
      45
    
    
      19
      245
      Barbara
      Moreland
      58
      49
    
    
      20
      257
      Eric
      Kurjan
      54
      47
    
    
      22
      69
      Steve
      Masters
      31
      35
    
    
      26
      176
      David
      Lawrence
      29
      34
    
    
      27
      147
      Sandra
      Reátegui Alayo
      37
      38
    
    
      31
      86
      Ryan
      Cornelsen
      59
      49
    
    
      34
      72
      Steven
      Selikoff
      41
      40
    
    
      39
      75
      Michiko
      Osada
      44
      42
    
    
      43
      198
      Lorraine
      Nay
      49
      44
    
    
      46
      168
      Garrett
      Young
      34
      37
    
    
      47
      236
      Grant
      Culbertson
      53
      46
    
    
      49
      199
      Paula
      Nartker
      54
      47
    
    
      53
      41
      Bryan
      Baker
      35
      37
    
    
      54
      283
      David
      Campbell
      23
      31
    
    
      55
      226
      Brian
      LaMee
      48
      44
    
    
      61
      172
      Marc
      Ingle
      22
      31
    
    
      63
      276
      Linda
      Mitchell
      27
      33
    
    
      64
      225
      Alan
      Brewer
      47
      43
    
    
      68
      194
      Fred
      Northup
      47
      43
    
    
      72
      150
      Michael
      Vanderhyde
      35
      37
    
    
      73
      258
      Erin
      Hagens
      53
      46
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      228
      173
      Eugene
      Zabokritski
      33
      36
    
    
      230
      277
      Jillian
      Carson
      24
      32
    
    
      231
      35
      Brandon
      Heidepriem
      22
      31
    
    
      233
      223
      Sairaj
      Uddin
      46
      43
    
    
      235
      275
      Michael
      Blythe
      38
      39
    
    
      239
      200
      Frank
      Lee
      56
      48
    
    
      240
      279
      Tsvi
      Reiter
      29
      34
    
    
      241
      146
      Jian Shuo
      Wang
      36
      38
    
    
      243
      73
      Carole
      Poland
      42
      41
    
    
      247
      179
      Jan
      Miksovsky
      30
      35
    
    
      249
      21
      Terry
      Eminhizer
      44
      42
    
    
      252
      254
      Fukiko
      Ogisu
      57
      48
    
    
      253
      67
      Jay
      Adams
      32
      36
    
    
      254
      262
      David
      Barber
      56
      48
    
    
      256
      282
      José
      Saraiva
      31
      35
    
    
      257
      169
      Susan
      Metters
      31
      35
    
    
      260
      222
      A.  Scott
      Wright
      44
      42
    
    
      261
      70
      David
      Ortiz
      33
      36
    
    
      263
      148
      Jason
      Watters
      38
      39
    
    
      265
      290
      Ranjit
      Varkey Chudukatil
      34
      37
    
    
      271
      251
      Mikael
      Sandberg
      59
      49
    
    
      272
      241
      David
      Liu
      57
      48
    
    
      275
      239
      Mindy
      Martin
      51
      45
    
    
      278
      38
      Kim
      Abercrombie
      24
      32
    
    
      285
      18
      John
      Wood
      48
      44
    
    
      286
      46
      Eugene
      Kogan
      36
      38
    
    
      287
      202
      Tawana
      Nusbaum
      50
      45
    
    
      288
      289
      Jae
      Pak
      37
      38
    
    
      289
      288
      Rachel
      Valdez
      35
      37
    
    
      290
      999
      Chadwick
      Smith
      55
      47
    
  

115 rows × 5 columns

7d. If you did not use BETWEEN for 7c, do the same but use BETWEEN. If you did use BETWEEN for 7c, do it another way.

SELECT e.EmployeeID, e.Title, e.FirstName, e.LastName, e.VacationHours, e.SickLeaveHours FROM dbo.Employees AS e WHERE e.VacationHours > 20 AND e.VacationHours < 60 ;



In [114]:

    
Employees.loc[Employees.VacationHours.between(20, 60, inclusive=False)
              , ['EmployeeID', 'FirstName', 'LastName', 'VacationHours', 'SickLeaveHours']]









    Out[114]:






  
    
      
      EmployeeID
      FirstName
      LastName
      VacationHours
      SickLeaveHours
    
  
  
    
      0
      259
      Ben
      Miller
      55
      47
    
    
      1
      278
      Garrett
      Vargas
      33
      36
    
    
      2
      204
      Gabe
      Mares
      57
      48
    
    
      4
      255
      Gordon
      Hee
      52
      46
    
    
      5
      66
      Karan
      Khanna
      28
      34
    
    
      7
      22
      Sariya
      Harnpadoungsataya
      45
      42
    
    
      11
      248
      Mike
      Seamans
      59
      49
    
    
      12
      175
      Reed
      Koch
      23
      31
    
    
      18
      203
      Ken
      Myer
      51
      45
    
    
      19
      245
      Barbara
      Moreland
      58
      49
    
    
      20
      257
      Eric
      Kurjan
      54
      47
    
    
      22
      69
      Steve
      Masters
      31
      35
    
    
      26
      176
      David
      Lawrence
      29
      34
    
    
      27
      147
      Sandra
      Reátegui Alayo
      37
      38
    
    
      31
      86
      Ryan
      Cornelsen
      59
      49
    
    
      34
      72
      Steven
      Selikoff
      41
      40
    
    
      39
      75
      Michiko
      Osada
      44
      42
    
    
      43
      198
      Lorraine
      Nay
      49
      44
    
    
      46
      168
      Garrett
      Young
      34
      37
    
    
      47
      236
      Grant
      Culbertson
      53
      46
    
    
      49
      199
      Paula
      Nartker
      54
      47
    
    
      53
      41
      Bryan
      Baker
      35
      37
    
    
      54
      283
      David
      Campbell
      23
      31
    
    
      55
      226
      Brian
      LaMee
      48
      44
    
    
      61
      172
      Marc
      Ingle
      22
      31
    
    
      63
      276
      Linda
      Mitchell
      27
      33
    
    
      64
      225
      Alan
      Brewer
      47
      43
    
    
      68
      194
      Fred
      Northup
      47
      43
    
    
      72
      150
      Michael
      Vanderhyde
      35
      37
    
    
      73
      258
      Erin
      Hagens
      53
      46
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      228
      173
      Eugene
      Zabokritski
      33
      36
    
    
      230
      277
      Jillian
      Carson
      24
      32
    
    
      231
      35
      Brandon
      Heidepriem
      22
      31
    
    
      233
      223
      Sairaj
      Uddin
      46
      43
    
    
      235
      275
      Michael
      Blythe
      38
      39
    
    
      239
      200
      Frank
      Lee
      56
      48
    
    
      240
      279
      Tsvi
      Reiter
      29
      34
    
    
      241
      146
      Jian Shuo
      Wang
      36
      38
    
    
      243
      73
      Carole
      Poland
      42
      41
    
    
      247
      179
      Jan
      Miksovsky
      30
      35
    
    
      249
      21
      Terry
      Eminhizer
      44
      42
    
    
      252
      254
      Fukiko
      Ogisu
      57
      48
    
    
      253
      67
      Jay
      Adams
      32
      36
    
    
      254
      262
      David
      Barber
      56
      48
    
    
      256
      282
      José
      Saraiva
      31
      35
    
    
      257
      169
      Susan
      Metters
      31
      35
    
    
      260
      222
      A.  Scott
      Wright
      44
      42
    
    
      261
      70
      David
      Ortiz
      33
      36
    
    
      263
      148
      Jason
      Watters
      38
      39
    
    
      265
      290
      Ranjit
      Varkey Chudukatil
      34
      37
    
    
      271
      251
      Mikael
      Sandberg
      59
      49
    
    
      272
      241
      David
      Liu
      57
      48
    
    
      275
      239
      Mindy
      Martin
      51
      45
    
    
      278
      38
      Kim
      Abercrombie
      24
      32
    
    
      285
      18
      John
      Wood
      48
      44
    
    
      286
      46
      Eugene
      Kogan
      36
      38
    
    
      287
      202
      Tawana
      Nusbaum
      50
      45
    
    
      288
      289
      Jae
      Pak
      37
      38
    
    
      289
      288
      Rachel
      Valdez
      35
      37
    
    
      290
      999
      Chadwick
      Smith
      55
      47
    
  

115 rows × 5 columns

Grouping

1a. What is the earliest birthdate for all employees?

SELECT MIN(e.BirthDate) FROM dbo.Employees AS e;



In [115]:

    
Employees.BirthDate.min()









    Out[115]:





'1945-11-17'

1b. Add to the above, the most recent birthdate for all employees

SELECT MIN(e.BirthDate) AS 'Earliest Birthday', MAX(e.BirthDate) AS 'Most Reecent Birthday' FROM dbo.Employees AS e;



In [117]:

    
(Employees.BirthDate.min(), Employees.BirthDate.max())









    Out[117]:





('1945-11-17', '1985-07-01')

1c. Show the above results broken down by gender

SELECT e.Gender, MIN(e.BirthDate) AS 'Earliest Birthday', MAX(e.BirthDate) AS 'Most Reecent Birthday' FROM dbo.Employees AS e GROUP BY e.Gender ;



In [131]:

    
Employees.groupby(by=['Gender']).BirthDate.aggregate([min, max])

1d. Show the above results broken down by gender, and salaried/hourly

SELECT e.Gender, e.SalariedFlag, MIN(e.BirthDate) AS 'Earliest Birthday', MAX(e.BirthDate) AS 'Most Reecent Birthday' FROM dbo.Employees AS e GROUP BY e.Gender, e.SalariedFlag ;



In [134]:

    
Employees.groupby(by=['Gender', 'SalariedFlag']).BirthDate.aggregate([min, max])









    Out[134]:






  
    
      
      
      min
      max
    
    
      Gender
      SalariedFlag
      
      
    
  
  
    
      F
      0
      1948-05-25
      1985-07-01
    
    
      1
      1946-10-29
      1980-07-06
    
    
      M
      0
      1946-04-03
      1985-05-07
    
    
      1
      1945-11-17
      1982-04-14

2a. What are the average vacation hours for all employees?

SELECT AVG(e.VacationHours) FROM dbo.Employees AS e ;



In [135]:

    
Employees.VacationHours.mean()









    Out[135]:





50.628865979381445

2b. Add to the above, the minimum vacation hours for all employees

SELECT AVG(e.VacationHours) AS 'Average Vacation', MIN(e.VacationHours) AS 'Minumum Vacation' FROM dbo.Employees AS e ;



In [136]:

    
(Employees.VacationHours.mean(), Employees.VacationHours.min())









    Out[136]:





(50.628865979381445, 0)

2c. Show the above results broken down and ordered by job title

SELECT e.JobTitle, AVG(e.VacationHours) AS 'Average Vacation', MIN(e.VacationHours) AS 'Minimum Vacation' FROM dbo.Employees AS e GROUP BY e.JobTitle ;



In [157]:

    
Employees.groupby(by=['JobTitle']).VacationHours.aggregate([np.mean, np.min])









    Out[157]:






  
    
      
      mean
      amin
    
    
      JobTitle
      
      
    
  
  
    
      Accountant
      58.500000
      58
    
    
      Accounts Manager
      57.000000
      57
    
    
      Accounts Payable Specialist
      63.500000
      63
    
    
      Accounts Receivable Specialist
      61.000000
      60
    
    
      Application Specialist
      72.500000
      71
    
    
      Assistant to the Chief Financial Officer
      56.000000
      56
    
    
      BI Professor
      55.000000
      55
    
    
      Benefits Specialist
      51.000000
      51
    
    
      Buyer
      56.000000
      52
    
    
      Chief Executive Officer
      99.000000
      99
    
    
      Chief Financial Officer
      0.000000
      0
    
    
      Control Specialist
      75.500000
      75
    
    
      Database Administrator
      66.500000
      66
    
    
      Design Engineer
      5.000000
      4
    
    
      Document Control Assistant
      78.500000
      78
    
    
      Document Control Manager
      77.000000
      77
    
    
      Engineering Manager
      2.000000
      2
    
    
      European Sales Manager
      21.000000
      21
    
    
      Facilities Administrative Assistant
      87.000000
      87
    
    
      Facilities Manager
      86.000000
      86
    
    
      Finance Manager
      55.000000
      55
    
    
      Human Resources Administrative Assistant
      52.500000
      52
    
    
      Human Resources Manager
      54.000000
      54
    
    
      Information Services Manager
      65.000000
      65
    
    
      Janitor
      89.500000
      88
    
    
      Maintenance Supervisor
      92.000000
      92
    
    
      Marketing Assistant
      42.000000
      41
    
    
      Marketing Manager
      40.000000
      40
    
    
      Marketing Specialist
      46.000000
      44
    
    
      Master Scheduler
      44.000000
      44
    
    
      ...
      ...
      ...
    
    
      Production Supervisor - WC40
      72.000000
      71
    
    
      Production Supervisor - WC45
      75.000000
      74
    
    
      Production Supervisor - WC50
      78.000000
      77
    
    
      Production Supervisor - WC60
      81.000000
      80
    
    
      Production Technician - WC10
      91.000000
      83
    
    
      Production Technician - WC20
      10.500000
      0
    
    
      Production Technician - WC30
      34.000000
      22
    
    
      Production Technician - WC40
      59.500000
      47
    
    
      Production Technician - WC45
      80.000000
      73
    
    
      Production Technician - WC50
      46.653846
      0
    
    
      Production Technician - WC60
      26.500000
      14
    
    
      Purchasing Assistant
      50.500000
      50
    
    
      Purchasing Manager
      49.000000
      49
    
    
      Quality Assurance Manager
      80.000000
      80
    
    
      Quality Assurance Supervisor
      81.000000
      81
    
    
      Quality Assurance Technician
      83.500000
      82
    
    
      Recruiter
      49.500000
      49
    
    
      Research and Development Engineer
      62.500000
      62
    
    
      Research and Development Manager
      38.500000
      16
    
    
      Sales Representative
      31.000000
      22
    
    
      Scheduling Assistant
      46.500000
      45
    
    
      Senior Design Engineer
      3.000000
      3
    
    
      Senior Tool Designer
      27.500000
      7
    
    
      Shipping and Receiving Clerk
      94.500000
      94
    
    
      Shipping and Receiving Supervisor
      93.000000
      93
    
    
      Stocker
      97.000000
      96
    
    
      Tool Designer
      8.500000
      8
    
    
      Vice President of Engineering
      1.000000
      1
    
    
      Vice President of Production
      64.000000
      64
    
    
      Vice President of Sales
      10.000000
      10
    
  

68 rows × 2 columns

2d. Show the above results broken down by job title, and married/single employees

SELECT e.JobTitle, e.MaritalStatus, AVG(e.VacationHours) AS 'Average Vacation', MIN(e.VacationHours) AS 'Minimum Vacation' FROM dbo.Employees AS e GROUP BY e.JobTitle, e.MaritalStatus ;



In [158]:

    
Employees.groupby(by=['JobTitle', 'MaritalStatus']).VacationHours.aggregate([np.mean, np.min])









    Out[158]:






  
    
      
      
      mean
      amin
    
    
      JobTitle
      MaritalStatus
      
      
    
  
  
    
      Accountant
      M
      58.000000
      58
    
    
      S
      59.000000
      59
    
    
      Accounts Manager
      M
      57.000000
      57
    
    
      Accounts Payable Specialist
      M
      63.500000
      63
    
    
      Accounts Receivable Specialist
      M
      60.000000
      60
    
    
      S
      61.500000
      61
    
    
      Application Specialist
      M
      71.500000
      71
    
    
      S
      73.500000
      73
    
    
      Assistant to the Chief Financial Officer
      S
      56.000000
      56
    
    
      BI Professor
      M
      55.000000
      55
    
    
      Benefits Specialist
      M
      51.000000
      51
    
    
      Buyer
      M
      56.333333
      52
    
    
      S
      55.333333
      53
    
    
      Chief Executive Officer
      S
      99.000000
      99
    
    
      Chief Financial Officer
      M
      0.000000
      0
    
    
      Control Specialist
      M
      75.000000
      75
    
    
      S
      76.000000
      76
    
    
      Database Administrator
      M
      66.000000
      66
    
    
      S
      67.000000
      67
    
    
      Design Engineer
      M
      5.000000
      4
    
    
      Document Control Assistant
      M
      79.000000
      79
    
    
      S
      78.000000
      78
    
    
      Document Control Manager
      M
      77.000000
      77
    
    
      Engineering Manager
      M
      2.000000
      2
    
    
      European Sales Manager
      M
      21.000000
      21
    
    
      Facilities Administrative Assistant
      M
      87.000000
      87
    
    
      Facilities Manager
      M
      86.000000
      86
    
    
      Finance Manager
      S
      55.000000
      55
    
    
      Human Resources Administrative Assistant
      S
      52.500000
      52
    
    
      Human Resources Manager
      M
      54.000000
      54
    
    
      ...
      ...
      ...
      ...
    
    
      Production Technician - WC40
      S
      61.857143
      47
    
    
      Production Technician - WC45
      M
      80.833333
      76
    
    
      S
      79.444444
      73
    
    
      Production Technician - WC50
      M
      40.076923
      1
    
    
      S
      53.230769
      0
    
    
      Production Technician - WC60
      M
      24.714286
      14
    
    
      S
      28.583333
      16
    
    
      Purchasing Assistant
      M
      50.500000
      50
    
    
      Purchasing Manager
      S
      49.000000
      49
    
    
      Quality Assurance Manager
      S
      80.000000
      80
    
    
      Quality Assurance Supervisor
      M
      81.000000
      81
    
    
      Quality Assurance Technician
      M
      84.500000
      84
    
    
      S
      82.500000
      82
    
    
      Recruiter
      S
      49.500000
      49
    
    
      Research and Development Engineer
      M
      63.000000
      63
    
    
      S
      62.000000
      62
    
    
      Research and Development Manager
      M
      38.500000
      16
    
    
      Sales Representative
      M
      31.714286
      26
    
    
      S
      30.285714
      22
    
    
      Scheduling Assistant
      M
      46.500000
      45
    
    
      Senior Design Engineer
      S
      3.000000
      3
    
    
      Senior Tool Designer
      S
      27.500000
      7
    
    
      Shipping and Receiving Clerk
      M
      94.500000
      94
    
    
      Shipping and Receiving Supervisor
      S
      93.000000
      93
    
    
      Stocker
      M
      96.000000
      96
    
    
      S
      97.500000
      97
    
    
      Tool Designer
      M
      8.500000
      8
    
    
      Vice President of Engineering
      S
      1.000000
      1
    
    
      Vice President of Production
      S
      64.000000
      64
    
    
      Vice President of Sales
      S
      10.000000
      10
    
  

95 rows × 2 columns

2e. Add to the above, the maximum vacation hours per group

SELECT e.JobTitle, e.MaritalStatus , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e GROUP BY e.JobTitle, e.MaritalStatus ;



In [159]:

    
Employees.groupby(by=['JobTitle', 'MaritalStatus']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[159]:






  
    
      
      
      mean
      amin
      amax
    
    
      JobTitle
      MaritalStatus
      
      
      
    
  
  
    
      Accountant
      M
      58.000000
      58
      58
    
    
      S
      59.000000
      59
      59
    
    
      Accounts Manager
      M
      57.000000
      57
      57
    
    
      Accounts Payable Specialist
      M
      63.500000
      63
      64
    
    
      Accounts Receivable Specialist
      M
      60.000000
      60
      60
    
    
      S
      61.500000
      61
      62
    
    
      Application Specialist
      M
      71.500000
      71
      72
    
    
      S
      73.500000
      73
      74
    
    
      Assistant to the Chief Financial Officer
      S
      56.000000
      56
      56
    
    
      BI Professor
      M
      55.000000
      55
      55
    
    
      Benefits Specialist
      M
      51.000000
      51
      51
    
    
      Buyer
      M
      56.333333
      52
      60
    
    
      S
      55.333333
      53
      59
    
    
      Chief Executive Officer
      S
      99.000000
      99
      99
    
    
      Chief Financial Officer
      M
      0.000000
      0
      0
    
    
      Control Specialist
      M
      75.000000
      75
      75
    
    
      S
      76.000000
      76
      76
    
    
      Database Administrator
      M
      66.000000
      66
      66
    
    
      S
      67.000000
      67
      67
    
    
      Design Engineer
      M
      5.000000
      4
      6
    
    
      Document Control Assistant
      M
      79.000000
      79
      79
    
    
      S
      78.000000
      78
      78
    
    
      Document Control Manager
      M
      77.000000
      77
      77
    
    
      Engineering Manager
      M
      2.000000
      2
      2
    
    
      European Sales Manager
      M
      21.000000
      21
      21
    
    
      Facilities Administrative Assistant
      M
      87.000000
      87
      87
    
    
      Facilities Manager
      M
      86.000000
      86
      86
    
    
      Finance Manager
      S
      55.000000
      55
      55
    
    
      Human Resources Administrative Assistant
      S
      52.500000
      52
      53
    
    
      Human Resources Manager
      M
      54.000000
      54
      54
    
    
      ...
      ...
      ...
      ...
      ...
    
    
      Production Technician - WC40
      S
      61.857143
      47
      72
    
    
      Production Technician - WC45
      M
      80.833333
      76
      87
    
    
      S
      79.444444
      73
      85
    
    
      Production Technician - WC50
      M
      40.076923
      1
      99
    
    
      S
      53.230769
      0
      98
    
    
      Production Technician - WC60
      M
      24.714286
      14
      39
    
    
      S
      28.583333
      16
      38
    
    
      Purchasing Assistant
      M
      50.500000
      50
      51
    
    
      Purchasing Manager
      S
      49.000000
      49
      49
    
    
      Quality Assurance Manager
      S
      80.000000
      80
      80
    
    
      Quality Assurance Supervisor
      M
      81.000000
      81
      81
    
    
      Quality Assurance Technician
      M
      84.500000
      84
      85
    
    
      S
      82.500000
      82
      83
    
    
      Recruiter
      S
      49.500000
      49
      50
    
    
      Research and Development Engineer
      M
      63.000000
      63
      63
    
    
      S
      62.000000
      62
      62
    
    
      Research and Development Manager
      M
      38.500000
      16
      61
    
    
      Sales Representative
      M
      31.714286
      26
      39
    
    
      S
      30.285714
      22
      38
    
    
      Scheduling Assistant
      M
      46.500000
      45
      48
    
    
      Senior Design Engineer
      S
      3.000000
      3
      3
    
    
      Senior Tool Designer
      S
      27.500000
      7
      48
    
    
      Shipping and Receiving Clerk
      M
      94.500000
      94
      95
    
    
      Shipping and Receiving Supervisor
      S
      93.000000
      93
      93
    
    
      Stocker
      M
      96.000000
      96
      96
    
    
      S
      97.500000
      97
      98
    
    
      Tool Designer
      M
      8.500000
      8
      9
    
    
      Vice President of Engineering
      S
      1.000000
      1
      1
    
    
      Vice President of Production
      S
      64.000000
      64
      64
    
    
      Vice President of Sales
      S
      10.000000
      10
      10
    
  

95 rows × 3 columns

2f. Show the above results broken down by job title, married/single employees, and State

SELECT e.JobTitle, e.MaritalStatus, e.StateProvinceName , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e GROUP BY e.JobTitle, e.MaritalStatus, e.StateProvinceName ;



In [160]:

    
Employees.groupby(by=['JobTitle', 'MaritalStatus', 'StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[160]:






  
    
      
      
      
      mean
      amin
      amax
    
    
      JobTitle
      MaritalStatus
      StateProvinceName
      
      
      
    
  
  
    
      Accountant
      M
      Washington
      58.000000
      58
      58
    
    
      S
      Washington
      59.000000
      59
      59
    
    
      Accounts Manager
      M
      Washington
      57.000000
      57
      57
    
    
      Accounts Payable Specialist
      M
      Washington
      63.500000
      63
      64
    
    
      Accounts Receivable Specialist
      M
      Washington
      60.000000
      60
      60
    
    
      S
      Alberta
      62.000000
      62
      62
    
    
      Washington
      61.000000
      61
      61
    
    
      Application Specialist
      M
      Washington
      71.500000
      71
      72
    
    
      S
      Washington
      73.500000
      73
      74
    
    
      Assistant to the Chief Financial Officer
      S
      Washington
      56.000000
      56
      56
    
    
      BI Professor
      M
      Georgia
      55.000000
      55
      55
    
    
      Benefits Specialist
      M
      Washington
      51.000000
      51
      51
    
    
      Buyer
      M
      Washington
      56.333333
      52
      60
    
    
      S
      Washington
      55.333333
      53
      59
    
    
      Chief Executive Officer
      S
      Washington
      99.000000
      99
      99
    
    
      Chief Financial Officer
      M
      Washington
      0.000000
      0
      0
    
    
      Control Specialist
      M
      Washington
      75.000000
      75
      75
    
    
      S
      Washington
      76.000000
      76
      76
    
    
      Database Administrator
      M
      Washington
      66.000000
      66
      66
    
    
      S
      Washington
      67.000000
      67
      67
    
    
      Design Engineer
      M
      Washington
      5.000000
      4
      6
    
    
      Document Control Assistant
      M
      Washington
      79.000000
      79
      79
    
    
      S
      Washington
      78.000000
      78
      78
    
    
      Document Control Manager
      M
      Washington
      77.000000
      77
      77
    
    
      Engineering Manager
      M
      Washington
      2.000000
      2
      2
    
    
      European Sales Manager
      M
      Washington
      21.000000
      21
      21
    
    
      Facilities Administrative Assistant
      M
      Washington
      87.000000
      87
      87
    
    
      Facilities Manager
      M
      Washington
      86.000000
      86
      86
    
    
      Finance Manager
      S
      Washington
      55.000000
      55
      55
    
    
      Human Resources Administrative Assistant
      S
      Washington
      52.500000
      52
      53
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      Research and Development Engineer
      M
      Washington
      63.000000
      63
      63
    
    
      S
      Washington
      62.000000
      62
      62
    
    
      Research and Development Manager
      M
      California
      16.000000
      16
      16
    
    
      Washington
      61.000000
      61
      61
    
    
      Sales Representative
      M
      Alberta
      33.000000
      33
      33
    
    
      California
      26.000000
      26
      26
    
    
      England
      37.000000
      37
      37
    
    
      Massachusetts
      39.000000
      39
      39
    
    
      Ontario
      31.000000
      31
      31
    
    
      Tennessee
      29.000000
      29
      29
    
    
      Utah
      27.000000
      27
      27
    
    
      S
      Gironde
      34.000000
      34
      34
    
    
      Hamburg
      35.000000
      35
      35
    
    
      Michigan
      38.000000
      38
      38
    
    
      Minnesota
      24.000000
      24
      24
    
    
      Oregon
      22.000000
      22
      22
    
    
      Victoria
      36.000000
      36
      36
    
    
      Washington
      23.000000
      23
      23
    
    
      Scheduling Assistant
      M
      Washington
      46.500000
      45
      48
    
    
      Senior Design Engineer
      S
      Washington
      3.000000
      3
      3
    
    
      Senior Tool Designer
      S
      Minnesota
      48.000000
      48
      48
    
    
      Washington
      7.000000
      7
      7
    
    
      Shipping and Receiving Clerk
      M
      Washington
      94.500000
      94
      95
    
    
      Shipping and Receiving Supervisor
      S
      Washington
      93.000000
      93
      93
    
    
      Stocker
      M
      Washington
      96.000000
      96
      96
    
    
      S
      Washington
      97.500000
      97
      98
    
    
      Tool Designer
      M
      Washington
      8.500000
      8
      9
    
    
      Vice President of Engineering
      S
      Washington
      1.000000
      1
      1
    
    
      Vice President of Production
      S
      Washington
      64.000000
      64
      64
    
    
      Vice President of Sales
      S
      Washington
      10.000000
      10
      10
    
  

113 rows × 3 columns

2g. Show the above results but only for American employees

SELECT e.JobTitle, e.MaritalStatus, e.StateProvinceName , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e WHERE e.CountryName = 'United States' GROUP BY e.JobTitle, e.MaritalStatus, e.StateProvinceName ;



In [164]:

    
Employees[Employees.CountryName=='United States'] \
.groupby(by=['JobTitle', 'MaritalStatus', 'StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[164]:






  
    
      
      
      
      mean
      amin
      amax
    
    
      JobTitle
      MaritalStatus
      StateProvinceName
      
      
      
    
  
  
    
      Accountant
      M
      Washington
      58.000000
      58
      58
    
    
      S
      Washington
      59.000000
      59
      59
    
    
      Accounts Manager
      M
      Washington
      57.000000
      57
      57
    
    
      Accounts Payable Specialist
      M
      Washington
      63.500000
      63
      64
    
    
      Accounts Receivable Specialist
      M
      Washington
      60.000000
      60
      60
    
    
      S
      Washington
      61.000000
      61
      61
    
    
      Application Specialist
      M
      Washington
      71.500000
      71
      72
    
    
      S
      Washington
      73.500000
      73
      74
    
    
      Assistant to the Chief Financial Officer
      S
      Washington
      56.000000
      56
      56
    
    
      BI Professor
      M
      Georgia
      55.000000
      55
      55
    
    
      Benefits Specialist
      M
      Washington
      51.000000
      51
      51
    
    
      Buyer
      M
      Washington
      56.333333
      52
      60
    
    
      S
      Washington
      55.333333
      53
      59
    
    
      Chief Executive Officer
      S
      Washington
      99.000000
      99
      99
    
    
      Chief Financial Officer
      M
      Washington
      0.000000
      0
      0
    
    
      Control Specialist
      M
      Washington
      75.000000
      75
      75
    
    
      S
      Washington
      76.000000
      76
      76
    
    
      Database Administrator
      M
      Washington
      66.000000
      66
      66
    
    
      S
      Washington
      67.000000
      67
      67
    
    
      Design Engineer
      M
      Washington
      5.000000
      4
      6
    
    
      Document Control Assistant
      M
      Washington
      79.000000
      79
      79
    
    
      S
      Washington
      78.000000
      78
      78
    
    
      Document Control Manager
      M
      Washington
      77.000000
      77
      77
    
    
      Engineering Manager
      M
      Washington
      2.000000
      2
      2
    
    
      European Sales Manager
      M
      Washington
      21.000000
      21
      21
    
    
      Facilities Administrative Assistant
      M
      Washington
      87.000000
      87
      87
    
    
      Facilities Manager
      M
      Washington
      86.000000
      86
      86
    
    
      Finance Manager
      S
      Washington
      55.000000
      55
      55
    
    
      Human Resources Administrative Assistant
      S
      Washington
      52.500000
      52
      53
    
    
      Human Resources Manager
      M
      Washington
      54.000000
      54
      54
    
    
      ...
      ...
      ...
      ...
      ...
      ...
    
    
      Purchasing Manager
      S
      Washington
      49.000000
      49
      49
    
    
      Quality Assurance Manager
      S
      Washington
      80.000000
      80
      80
    
    
      Quality Assurance Supervisor
      M
      Washington
      81.000000
      81
      81
    
    
      Quality Assurance Technician
      M
      Washington
      84.500000
      84
      85
    
    
      S
      Washington
      82.500000
      82
      83
    
    
      Recruiter
      S
      Washington
      49.500000
      49
      50
    
    
      Research and Development Engineer
      M
      Washington
      63.000000
      63
      63
    
    
      S
      Washington
      62.000000
      62
      62
    
    
      Research and Development Manager
      M
      California
      16.000000
      16
      16
    
    
      Washington
      61.000000
      61
      61
    
    
      Sales Representative
      M
      California
      26.000000
      26
      26
    
    
      Massachusetts
      39.000000
      39
      39
    
    
      Tennessee
      29.000000
      29
      29
    
    
      Utah
      27.000000
      27
      27
    
    
      S
      Michigan
      38.000000
      38
      38
    
    
      Minnesota
      24.000000
      24
      24
    
    
      Oregon
      22.000000
      22
      22
    
    
      Washington
      23.000000
      23
      23
    
    
      Scheduling Assistant
      M
      Washington
      46.500000
      45
      48
    
    
      Senior Design Engineer
      S
      Washington
      3.000000
      3
      3
    
    
      Senior Tool Designer
      S
      Minnesota
      48.000000
      48
      48
    
    
      Washington
      7.000000
      7
      7
    
    
      Shipping and Receiving Clerk
      M
      Washington
      94.500000
      94
      95
    
    
      Shipping and Receiving Supervisor
      S
      Washington
      93.000000
      93
      93
    
    
      Stocker
      M
      Washington
      96.000000
      96
      96
    
    
      S
      Washington
      97.500000
      97
      98
    
    
      Tool Designer
      M
      Washington
      8.500000
      8
      9
    
    
      Vice President of Engineering
      S
      Washington
      1.000000
      1
      1
    
    
      Vice President of Production
      S
      Washington
      64.000000
      64
      64
    
    
      Vice President of Sales
      S
      Washington
      10.000000
      10
      10
    
  

103 rows × 3 columns

2h. Change the grouping above so it's broken down by married/single and State, no more job title

SELECT e.MaritalStatus, e.StateProvinceName , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e WHERE e.CountryName = 'United States' GROUP BY e.MaritalStatus, e.StateProvinceName ;



In [165]:

    
Employees[Employees.CountryName=='United States'] \
.groupby(by=['MaritalStatus', 'StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[165]:






  
    
      
      
      mean
      amin
      amax
    
    
      MaritalStatus
      StateProvinceName
      
      
      
    
  
  
    
      M
      California
      21.000000
      16
      26
    
    
      Georgia
      55.000000
      55
      55
    
    
      Massachusetts
      39.000000
      39
      39
    
    
      Tennessee
      29.000000
      29
      29
    
    
      Utah
      27.000000
      27
      27
    
    
      Washington
      49.426471
      0
      99
    
    
      S
      Michigan
      38.000000
      38
      38
    
    
      Minnesota
      36.000000
      24
      48
    
    
      Oregon
      22.000000
      22
      22
    
    
      Washington
      54.362963
      0
      99

2i. Limit the results above to States where the average vacation hours is greater than 30

SELECT e.StateProvinceName , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e WHERE e.CountryName = 'United States' GROUP BY e.StateProvinceName HAVING AVG(e.VacationHours) > 30 ;



In [180]:

    
Employees[Employees.CountryName=='United States'] \
.groupby(by=['StateProvinceName']).filter(lambda x: x.VacationHours.mean()>30) \
.groupby(by=['StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[180]:






  
    
      
      mean
      amin
      amax
    
    
      StateProvinceName
      
      
      
    
  
  
    
      Georgia
      55.000000
      55
      55
    
    
      Massachusetts
      39.000000
      39
      39
    
    
      Michigan
      38.000000
      38
      38
    
    
      Minnesota
      36.000000
      24
      48
    
    
      Washington
      51.885609
      0
      99

2j. Limit the results above to States where the average vacation hours is greater than 30 and the maximum vacation hours is less than 50



In [182]:

    
Employees[Employees.CountryName=='United States'] \
.groupby(by=['StateProvinceName']).filter(lambda x: (x.VacationHours.mean() >30) and (x.VacationHours.max()<50)) \
.groupby(by=['StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[182]:






  
    
      
      mean
      amin
      amax
    
    
      StateProvinceName
      
      
      
    
  
  
    
      Massachusetts
      39
      39
      39
    
    
      Michigan
      38
      38
      38
    
    
      Minnesota
      36
      24
      48

2k. Show the same results but only for non-American employees

SELECT e.StateProvinceName , AVG(e.VacationHours) AS 'Average Vacation' , MIN(e.VacationHours) AS 'Minimum Vacation' , MAX(e.VacationHours) AS 'Maximum Vacation' FROM dbo.Employees AS e WHERE e.CountryName != 'United States' GROUP BY e.StateProvinceName HAVING AVG(e.VacationHours) BETWEEN 31 AND 49 ;



In [183]:

    
Employees[Employees.CountryName!='United States'] \
.groupby(by=['StateProvinceName']).filter(lambda x: (x.VacationHours.mean() >30) and (x.VacationHours.max()<50)) \
.groupby(by=['StateProvinceName']).VacationHours.aggregate([np.mean, np.min, np.max])









    Out[183]:






  
    
      
      mean
      amin
      amax
    
    
      StateProvinceName
      
      
      
    
  
  
    
      England
      37
      37
      37
    
    
      Gironde
      34
      34
      34
    
    
      Hamburg
      35
      35
      35
    
    
      Ontario
      31
      31
      31
    
    
      Victoria
      36
      36
      36

3a. Report how many employees are in the company

SELECT COUNT(1) FROM dbo.Employees;



In [185]:

    
Employees.shape[0]









    Out[185]:





291

3b. For the above report, show the number of employees per manager (hint: use ManagerID)

SELECT e.ManagerID, COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e GROUP BY e.ManagerID ;



In [186]:

    
Employees.groupby(by=['ManagerID']).EmployeeID.aggregate(lambda x: x.shape[0])









    Out[186]:





ManagerID
1.0        8
16.0       8
25.0      30
26.0     178
227.0      6
234.0      3
235.0      5
249.0      9
250.0     17
263.0      9
273.0      3
274.0     10
285.0      1
287.0      3
Name: EmployeeID, dtype: int64

3c. Remove any manager ID's that are NULL from the results above

SELECT e.ManagerID, COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e WHERE e.ManagerID IS NOT NULL GROUP BY e.ManagerID ;



In [193]:

    
Employees.loc[pd.isnull(Employees.EmployeeID)==False] \
.groupby(by=['ManagerID']).EmployeeID.aggregate(lambda x: x.shape[0])









    Out[193]:





ManagerID
1.0        8
16.0       8
25.0      30
26.0     178
227.0      6
234.0      3
235.0      5
249.0      9
250.0     17
263.0      9
273.0      3
274.0     10
285.0      1
287.0      3
Name: EmployeeID, dtype: int64

3d. Show the same results as above, but only for managers who have at least 5 employees

SELECT e.ManagerID, COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e WHERE e.ManagerID IS NOT NULL GROUP BY e.ManagerID HAVING COUNT(1) >= 5 ;



In [195]:

    
Employees.loc[pd.isnull(Employees.EmployeeID)==False] \
.groupby(by=['ManagerID']).filter(lambda x: x.shape[0]>=5) \
.groupby(by=['ManagerID']).EmployeeID.aggregate(lambda x: x.shape[0])









    Out[195]:





ManagerID
1.0        8
16.0       8
25.0      30
26.0     178
227.0      6
235.0      5
249.0      9
250.0     17
263.0      9
274.0     10
Name: EmployeeID, dtype: int64

4a. List the average vacation hours of all employees

SELECT AVG(e.VacationHours) FROM dbo.Employees AS e ;



In [196]:

    
Employees.VacationHours.mean()









    Out[196]:





50.628865979381445

4b. Break down the results by State

SELECT e.StateProvinceName, AVG(e.VacationHours) AS 'Average Vacation Hours' FROM dbo.Employees AS e GROUP BY e.StateProvinceName ;



In [197]:

    
Employees.groupby(by=['StateProvinceName']).VacationHours.mean()









    Out[197]:





StateProvinceName
Alberta          35.000000
California       21.000000
England          37.000000
Georgia          55.000000
Gironde          34.000000
Hamburg          35.000000
Massachusetts    39.000000
Michigan         38.000000
Minnesota        36.000000
Ontario          31.000000
Oregon           22.000000
Tennessee        29.000000
Utah             27.000000
Victoria         36.000000
Washington       51.885609
Name: VacationHours, dtype: float64

4c. Break down the results by city and State

SELECT e.StateProvinceName, e.City, AVG(e.VacationHours) AS 'Average Vacation Hours' FROM dbo.Employees AS e GROUP BY e.StateProvinceName, e.City ;



In [198]:

    
Employees.groupby(by=['StateProvinceName', 'City']).VacationHours.mean()









    Out[198]:





StateProvinceName  City         
Alberta            Calgary          35.000000
California         San Francisco    21.000000
England            Cambridge        37.000000
Georgia            Atlanta          55.000000
Gironde            Bordeaux         34.000000
Hamburg            Berlin           35.000000
Massachusetts      Cambridge        39.000000
Michigan           Detroit          38.000000
Minnesota          Duluth           24.000000
                   Minneapolis      48.000000
Ontario            Ottawa           31.000000
Oregon             Portland         22.000000
Tennessee          Memphis          29.000000
Utah               Nevada           27.000000
Victoria           Melbourne        36.000000
Washington         Bellevue         44.305556
                   Bothell          65.692308
                   Carnation        59.000000
                   Duvall           42.200000
                   Edmonds          47.640000
                   Everett          57.823529
                   Gold Bar         70.800000
                   Index            76.000000
                   Issaquah         67.666667
                   Kenmore          53.916667
                   Kent              5.000000
                   Monroe           47.916667
                   Newport Hills    62.833333
                   Redmond          37.428571
                   Renton           54.176471
                   Sammamish        42.823529
                   Seattle          54.659091
                   Snohomish        52.800000
Name: VacationHours, dtype: float64

4d. Add something that shows the number of employees per city

SELECT e.StateProvinceName, e.City , AVG(e.VacationHours) AS 'Average Vacation Hours' , COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e GROUP BY e.StateProvinceName, e.City ;



In [17]:

    
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean'
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    }
}
Employees.groupby(by=['StateProvinceName', 'City'])\
.aggregate(aggregations)









    Out[17]:






  
    
      
      
      VacationHours
      EmployeeID
    
    
      
      
      AverageVacationHours
      NumEmployees
    
    
      StateProvinceName
      City
      
      
    
  
  
    
      Alberta
      Calgary
      35.000000
      5
    
    
      California
      San Francisco
      21.000000
      2
    
    
      England
      Cambridge
      37.000000
      1
    
    
      Georgia
      Atlanta
      55.000000
      1
    
    
      Gironde
      Bordeaux
      34.000000
      1
    
    
      Hamburg
      Berlin
      35.000000
      1
    
    
      Massachusetts
      Cambridge
      39.000000
      1
    
    
      Michigan
      Detroit
      38.000000
      1
    
    
      Minnesota
      Duluth
      24.000000
      1
    
    
      Minneapolis
      48.000000
      1
    
    
      Ontario
      Ottawa
      31.000000
      1
    
    
      Oregon
      Portland
      22.000000
      1
    
    
      Tennessee
      Memphis
      29.000000
      1
    
    
      Utah
      Nevada
      27.000000
      1
    
    
      Victoria
      Melbourne
      36.000000
      1
    
    
      Washington
      Bellevue
      44.305556
      36
    
    
      Bothell
      65.692308
      13
    
    
      Carnation
      59.000000
      5
    
    
      Duvall
      42.200000
      10
    
    
      Edmonds
      47.640000
      25
    
    
      Everett
      57.823529
      17
    
    
      Gold Bar
      70.800000
      5
    
    
      Index
      76.000000
      5
    
    
      Issaquah
      67.666667
      15
    
    
      Kenmore
      53.916667
      12
    
    
      Kent
      5.000000
      1
    
    
      Monroe
      47.916667
      12
    
    
      Newport Hills
      62.833333
      6
    
    
      Redmond
      37.428571
      21
    
    
      Renton
      54.176471
      17
    
    
      Sammamish
      42.823529
      17
    
    
      Seattle
      54.659091
      44
    
    
      Snohomish
      52.800000
      10

4e. Sort the results by the city and state

SELECT e.StateProvinceName, e.City , AVG(e.VacationHours) AS 'Average Vacation Hours' , COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e GROUP BY e.StateProvinceName, e.City ORDER BY e.StateProvinceName, e.City ;



In [217]:

    
### Since Pandas already sorts the results, we show here how show resutls in DESCENDING order
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean'
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    }
}
Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations) \
.sort_index(ascending=False, axis=0)









    Out[217]:






  
    
      
      
      EmployeeID
      VacationHours
    
    
      
      
      NumEmployees
      AverageVacationHours
    
    
      StateProvinceName
      City
      
      
    
  
  
    
      Washington
      Snohomish
      10
      52.800000
    
    
      Seattle
      44
      54.659091
    
    
      Sammamish
      17
      42.823529
    
    
      Renton
      17
      54.176471
    
    
      Redmond
      21
      37.428571
    
    
      Newport Hills
      6
      62.833333
    
    
      Monroe
      12
      47.916667
    
    
      Kent
      1
      5.000000
    
    
      Kenmore
      12
      53.916667
    
    
      Issaquah
      15
      67.666667
    
    
      Index
      5
      76.000000
    
    
      Gold Bar
      5
      70.800000
    
    
      Everett
      17
      57.823529
    
    
      Edmonds
      25
      47.640000
    
    
      Duvall
      10
      42.200000
    
    
      Carnation
      5
      59.000000
    
    
      Bothell
      13
      65.692308
    
    
      Bellevue
      36
      44.305556
    
    
      Victoria
      Melbourne
      1
      36.000000
    
    
      Utah
      Nevada
      1
      27.000000
    
    
      Tennessee
      Memphis
      1
      29.000000
    
    
      Oregon
      Portland
      1
      22.000000
    
    
      Ontario
      Ottawa
      1
      31.000000
    
    
      Minnesota
      Minneapolis
      1
      48.000000
    
    
      Duluth
      1
      24.000000
    
    
      Michigan
      Detroit
      1
      38.000000
    
    
      Massachusetts
      Cambridge
      1
      39.000000
    
    
      Hamburg
      Berlin
      1
      35.000000
    
    
      Gironde
      Bordeaux
      1
      34.000000
    
    
      Georgia
      Atlanta
      1
      55.000000
    
    
      England
      Cambridge
      1
      37.000000
    
    
      California
      San Francisco
      2
      21.000000
    
    
      Alberta
      Calgary
      5
      35.000000



In [241]:

    
## This adds the row indices back as columns
df = Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations) \
.sort_index(ascending=False, axis=0)

df['SPN'] = df.index.get_level_values(0)
df['C'] = df.index.get_level_values(1)









    Out[241]:






  
    
      
      
      employeeid
      vacationhours
      spn
      c
    
    
      
      
      numemployees
      averagevacationhours
      
      
    
    
      StateProvinceName
      City
      
      
      
      
    
  
  
    
      Washington
      Snohomish
      10
      52.800000
      Washington
      Snohomish
    
    
      Seattle
      44
      54.659091
      Washington
      Seattle
    
    
      Sammamish
      17
      42.823529
      Washington
      Sammamish
    
    
      Renton
      17
      54.176471
      Washington
      Renton
    
    
      Redmond
      21
      37.428571
      Washington
      Redmond
    
    
      Newport Hills
      6
      62.833333
      Washington
      Newport Hills
    
    
      Monroe
      12
      47.916667
      Washington
      Monroe
    
    
      Kent
      1
      5.000000
      Washington
      Kent
    
    
      Kenmore
      12
      53.916667
      Washington
      Kenmore
    
    
      Issaquah
      15
      67.666667
      Washington
      Issaquah
    
    
      Index
      5
      76.000000
      Washington
      Index
    
    
      Gold Bar
      5
      70.800000
      Washington
      Gold Bar
    
    
      Everett
      17
      57.823529
      Washington
      Everett
    
    
      Edmonds
      25
      47.640000
      Washington
      Edmonds
    
    
      Duvall
      10
      42.200000
      Washington
      Duvall
    
    
      Carnation
      5
      59.000000
      Washington
      Carnation
    
    
      Bothell
      13
      65.692308
      Washington
      Bothell
    
    
      Bellevue
      36
      44.305556
      Washington
      Bellevue
    
    
      Victoria
      Melbourne
      1
      36.000000
      Victoria
      Melbourne
    
    
      Utah
      Nevada
      1
      27.000000
      Utah
      Nevada
    
    
      Tennessee
      Memphis
      1
      29.000000
      Tennessee
      Memphis
    
    
      Oregon
      Portland
      1
      22.000000
      Oregon
      Portland
    
    
      Ontario
      Ottawa
      1
      31.000000
      Ontario
      Ottawa
    
    
      Minnesota
      Minneapolis
      1
      48.000000
      Minnesota
      Minneapolis
    
    
      Duluth
      1
      24.000000
      Minnesota
      Duluth
    
    
      Michigan
      Detroit
      1
      38.000000
      Michigan
      Detroit
    
    
      Massachusetts
      Cambridge
      1
      39.000000
      Massachusetts
      Cambridge
    
    
      Hamburg
      Berlin
      1
      35.000000
      Hamburg
      Berlin
    
    
      Gironde
      Bordeaux
      1
      34.000000
      Gironde
      Bordeaux
    
    
      Georgia
      Atlanta
      1
      55.000000
      Georgia
      Atlanta
    
    
      England
      Cambridge
      1
      37.000000
      England
      Cambridge
    
    
      California
      San Francisco
      2
      21.000000
      California
      San Francisco
    
    
      Alberta
      Calgary
      5
      35.000000
      Alberta
      Calgary

4f. Make city and State a single column in the format of "City, State"

SELECT CONCAT(e.StateProvinceName, ', ', e.City) AS 'Location' , AVG(e.VacationHours) AS 'Average Vacation Hours' , COUNT(1) AS 'NumEmployees' FROM dbo.Employees AS e GROUP BY e.StateProvinceName, e.City ORDER BY e.StateProvinceName, e.City ;



In [362]:

    
# Version 1: first create new column 'CityState' ... rest is easy
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean'
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

df = Employees.copy()
df['CityState'] = Employees.City.str.cat(Employees.StateProvinceName, sep=', ')
df.groupby(by = 'CityState').aggregate(aggregations).sort_index(ascending=False, axis=0)









    Out[362]:






  
    
      
      EmployeeID
      VacationHours
    
    
      
      NumEmployees
      AverageVacationHours
    
    
      CityState
      
      
    
  
  
    
      Snohomish, Washington
      10
      52.800000
    
    
      Seattle, Washington
      44
      54.659091
    
    
      San Francisco, California
      2
      21.000000
    
    
      Sammamish, Washington
      17
      42.823529
    
    
      Renton, Washington
      17
      54.176471
    
    
      Redmond, Washington
      21
      37.428571
    
    
      Portland, Oregon
      1
      22.000000
    
    
      Ottawa, Ontario
      1
      31.000000
    
    
      Newport Hills, Washington
      6
      62.833333
    
    
      Nevada, Utah
      1
      27.000000
    
    
      Monroe, Washington
      12
      47.916667
    
    
      Minneapolis, Minnesota
      1
      48.000000
    
    
      Memphis, Tennessee
      1
      29.000000
    
    
      Melbourne, Victoria
      1
      36.000000
    
    
      Kent, Washington
      1
      5.000000
    
    
      Kenmore, Washington
      12
      53.916667
    
    
      Issaquah, Washington
      15
      67.666667
    
    
      Index, Washington
      5
      76.000000
    
    
      Gold Bar, Washington
      5
      70.800000
    
    
      Everett, Washington
      17
      57.823529
    
    
      Edmonds, Washington
      25
      47.640000
    
    
      Duvall, Washington
      10
      42.200000
    
    
      Duluth, Minnesota
      1
      24.000000
    
    
      Detroit, Michigan
      1
      38.000000
    
    
      Carnation, Washington
      5
      59.000000
    
    
      Cambridge, Massachusetts
      1
      39.000000
    
    
      Cambridge, England
      1
      37.000000
    
    
      Calgary, Alberta
      5
      35.000000
    
    
      Bothell, Washington
      13
      65.692308
    
    
      Bordeaux, Gironde
      1
      34.000000
    
    
      Berlin, Hamburg
      1
      35.000000
    
    
      Bellevue, Washington
      36
      44.305556
    
    
      Atlanta, Georgia
      1
      55.000000



In [301]:

    
# Version 2: using the multi-indexed result and adding index values as new column 'CityState'
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean'
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

df = Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations) \
.sort_index(ascending=False, axis=0)
df['CityState'] = [
    r.City+', '+r.State \
        for i, r in pd.DataFrame({
            'City': df.index.get_level_values(1),
            'State': df.index.get_level_values(0)
            }).iterrows()
    ]
df









    Out[301]:






  
    
      
      
      EmployeeID
      VacationHours
      CityState
    
    
      
      
      NumEmployees
      AverageVacationHours
      
    
    
      StateProvinceName
      City
      
      
      
    
  
  
    
      Washington
      Snohomish
      10
      52.800000
      Snohomish, Washington
    
    
      Seattle
      44
      54.659091
      Seattle, Washington
    
    
      Sammamish
      17
      42.823529
      Sammamish, Washington
    
    
      Renton
      17
      54.176471
      Renton, Washington
    
    
      Redmond
      21
      37.428571
      Redmond, Washington
    
    
      Newport Hills
      6
      62.833333
      Newport Hills, Washington
    
    
      Monroe
      12
      47.916667
      Monroe, Washington
    
    
      Kent
      1
      5.000000
      Kent, Washington
    
    
      Kenmore
      12
      53.916667
      Kenmore, Washington
    
    
      Issaquah
      15
      67.666667
      Issaquah, Washington
    
    
      Index
      5
      76.000000
      Index, Washington
    
    
      Gold Bar
      5
      70.800000
      Gold Bar, Washington
    
    
      Everett
      17
      57.823529
      Everett, Washington
    
    
      Edmonds
      25
      47.640000
      Edmonds, Washington
    
    
      Duvall
      10
      42.200000
      Duvall, Washington
    
    
      Carnation
      5
      59.000000
      Carnation, Washington
    
    
      Bothell
      13
      65.692308
      Bothell, Washington
    
    
      Bellevue
      36
      44.305556
      Bellevue, Washington
    
    
      Victoria
      Melbourne
      1
      36.000000
      Melbourne, Victoria
    
    
      Utah
      Nevada
      1
      27.000000
      Nevada, Utah
    
    
      Tennessee
      Memphis
      1
      29.000000
      Memphis, Tennessee
    
    
      Oregon
      Portland
      1
      22.000000
      Portland, Oregon
    
    
      Ontario
      Ottawa
      1
      31.000000
      Ottawa, Ontario
    
    
      Minnesota
      Minneapolis
      1
      48.000000
      Minneapolis, Minnesota
    
    
      Duluth
      1
      24.000000
      Duluth, Minnesota
    
    
      Michigan
      Detroit
      1
      38.000000
      Detroit, Michigan
    
    
      Massachusetts
      Cambridge
      1
      39.000000
      Cambridge, Massachusetts
    
    
      Hamburg
      Berlin
      1
      35.000000
      Berlin, Hamburg
    
    
      Gironde
      Bordeaux
      1
      34.000000
      Bordeaux, Gironde
    
    
      Georgia
      Atlanta
      1
      55.000000
      Atlanta, Georgia
    
    
      England
      Cambridge
      1
      37.000000
      Cambridge, England
    
    
      California
      San Francisco
      2
      21.000000
      San Francisco, California
    
    
      Alberta
      Calgary
      5
      35.000000
      Calgary, Alberta

4g. Add a column that shows the difference between the maximum vacation hours and minimum vacation hours for each city

SELECT CONCAT(e.StateProvinceName, ', ', e.City) AS 'Location' , AVG(e.VacationHours) AS 'Average Vacation Hours' , COUNT(1) AS 'NumEmployees' , MAX(e.VacationHours)-MIN(e.VacationHours) AS 'Difference Vacation Hours' FROM dbo.Employees AS e GROUP BY e.StateProvinceName, e.City ORDER BY e.StateProvinceName, e.City ;



In [309]:

    
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean',
        'MinVacationHours' : min, 
        'MaxVacationHours' : max,
        'DeltaVacationHoues' : lambda x: x.max()-x.min()
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations) \
.sort_index(ascending=False, axis=0)









    Out[309]:






  
    
      
      
      EmployeeID
      VacationHours
    
    
      
      
      NumEmployees
      AverageVacationHours
      DeltaVacationHoues
      MinVacationHours
      MaxVacationHours
    
    
      StateProvinceName
      City
      
      
      
      
      
    
  
  
    
      Washington
      Snohomish
      10
      52.800000
      85
      10
      95
    
    
      Seattle
      44
      54.659091
      89
      6
      95
    
    
      Sammamish
      17
      42.823529
      93
      0
      93
    
    
      Renton
      17
      54.176471
      97
      0
      97
    
    
      Redmond
      21
      37.428571
      92
      2
      94
    
    
      Newport Hills
      6
      62.833333
      55
      44
      99
    
    
      Monroe
      12
      47.916667
      71
      14
      85
    
    
      Kent
      1
      5.000000
      0
      5
      5
    
    
      Kenmore
      12
      53.916667
      90
      7
      97
    
    
      Issaquah
      15
      67.666667
      73
      10
      83
    
    
      Index
      5
      76.000000
      32
      59
      91
    
    
      Gold Bar
      5
      70.800000
      39
      53
      92
    
    
      Everett
      17
      57.823529
      71
      27
      98
    
    
      Edmonds
      25
      47.640000
      97
      1
      98
    
    
      Duvall
      10
      42.200000
      92
      0
      92
    
    
      Carnation
      5
      59.000000
      64
      24
      88
    
    
      Bothell
      13
      65.692308
      90
      9
      99
    
    
      Bellevue
      36
      44.305556
      95
      1
      96
    
    
      Victoria
      Melbourne
      1
      36.000000
      0
      36
      36
    
    
      Utah
      Nevada
      1
      27.000000
      0
      27
      27
    
    
      Tennessee
      Memphis
      1
      29.000000
      0
      29
      29
    
    
      Oregon
      Portland
      1
      22.000000
      0
      22
      22
    
    
      Ontario
      Ottawa
      1
      31.000000
      0
      31
      31
    
    
      Minnesota
      Minneapolis
      1
      48.000000
      0
      48
      48
    
    
      Duluth
      1
      24.000000
      0
      24
      24
    
    
      Michigan
      Detroit
      1
      38.000000
      0
      38
      38
    
    
      Massachusetts
      Cambridge
      1
      39.000000
      0
      39
      39
    
    
      Hamburg
      Berlin
      1
      35.000000
      0
      35
      35
    
    
      Gironde
      Bordeaux
      1
      34.000000
      0
      34
      34
    
    
      Georgia
      Atlanta
      1
      55.000000
      0
      55
      55
    
    
      England
      Cambridge
      1
      37.000000
      0
      37
      37
    
    
      California
      San Francisco
      2
      21.000000
      10
      16
      26
    
    
      Alberta
      Calgary
      5
      35.000000
      53
      9
      62

4h. Now sort the results by the new column created above



In [326]:

    
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean',
        'MinVacationHours' : min, 
        'MaxVacationHours' : max,
        'DeltaVacationHours' : lambda x: x.max()-x.min()
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

df = Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations)
df.iloc[np.argsort(df['VacationHours']['DeltaVacationHours'])]









    Out[326]:






  
    
      
      
      EmployeeID
      VacationHours
    
    
      
      
      NumEmployees
      AverageVacationHours
      DeltaVacationHours
      MinVacationHours
      MaxVacationHours
    
    
      StateProvinceName
      City
      
      
      
      
      
    
  
  
    
      Oregon
      Portland
      1
      22.000000
      0
      22
      22
    
    
      England
      Cambridge
      1
      37.000000
      0
      37
      37
    
    
      Georgia
      Atlanta
      1
      55.000000
      0
      55
      55
    
    
      Gironde
      Bordeaux
      1
      34.000000
      0
      34
      34
    
    
      Hamburg
      Berlin
      1
      35.000000
      0
      35
      35
    
    
      Massachusetts
      Cambridge
      1
      39.000000
      0
      39
      39
    
    
      Michigan
      Detroit
      1
      38.000000
      0
      38
      38
    
    
      Minnesota
      Duluth
      1
      24.000000
      0
      24
      24
    
    
      Minneapolis
      1
      48.000000
      0
      48
      48
    
    
      Ontario
      Ottawa
      1
      31.000000
      0
      31
      31
    
    
      Washington
      Kent
      1
      5.000000
      0
      5
      5
    
    
      Tennessee
      Memphis
      1
      29.000000
      0
      29
      29
    
    
      Utah
      Nevada
      1
      27.000000
      0
      27
      27
    
    
      Victoria
      Melbourne
      1
      36.000000
      0
      36
      36
    
    
      California
      San Francisco
      2
      21.000000
      10
      16
      26
    
    
      Washington
      Index
      5
      76.000000
      32
      59
      91
    
    
      Gold Bar
      5
      70.800000
      39
      53
      92
    
    
      Alberta
      Calgary
      5
      35.000000
      53
      9
      62
    
    
      Washington
      Newport Hills
      6
      62.833333
      55
      44
      99
    
    
      Carnation
      5
      59.000000
      64
      24
      88
    
    
      Monroe
      12
      47.916667
      71
      14
      85
    
    
      Everett
      17
      57.823529
      71
      27
      98
    
    
      Issaquah
      15
      67.666667
      73
      10
      83
    
    
      Snohomish
      10
      52.800000
      85
      10
      95
    
    
      Seattle
      44
      54.659091
      89
      6
      95
    
    
      Kenmore
      12
      53.916667
      90
      7
      97
    
    
      Bothell
      13
      65.692308
      90
      9
      99
    
    
      Duvall
      10
      42.200000
      92
      0
      92
    
    
      Redmond
      21
      37.428571
      92
      2
      94
    
    
      Sammamish
      17
      42.823529
      93
      0
      93
    
    
      Bellevue
      36
      44.305556
      95
      1
      96
    
    
      Renton
      17
      54.176471
      97
      0
      97
    
    
      Edmonds
      25
      47.640000
      97
      1
      98

4i. Limit the results to cities that have more than 1 employee



In [50]:

    
def my_own_delta(x):
    return  x.max()-x.min()

aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean',
        'MinVacationHours' : np.min, 
        'MaxVacationHours' : np.max,
        'DeltaVacationHours' : my_own_delta, ##lambda x: x.max()-x.min(),
        'cnt' : pd.Series.count
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

df = Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations)
df['VacationHours']['DeltaVacationHours']>1
df2 = df[df['VacationHours']['DeltaVacationHours']>1]

df2.columns = df2.columns.droplevel(0)
df2.reset_index()
##df2.iloc[np.argsort(df2['VacationHours']['DeltaVacationHours'])]
##np.argsort(df2['VacationHours']['DeltaVacationHours'])









    Out[50]:






  
    
      
      StateProvinceName
      City
      cnt
      MinVacationHours
      AverageVacationHours
      MaxVacationHours
      DeltaVacationHours
      NumEmployees
    
  
  
    
      0
      Alberta
      Calgary
      5
      9
      35.000000
      62
      53
      5
    
    
      1
      California
      San Francisco
      2
      16
      21.000000
      26
      10
      2
    
    
      2
      Washington
      Bellevue
      36
      1
      44.305556
      96
      95
      36
    
    
      3
      Washington
      Bothell
      13
      9
      65.692308
      99
      90
      13
    
    
      4
      Washington
      Carnation
      5
      24
      59.000000
      88
      64
      5
    
    
      5
      Washington
      Duvall
      10
      0
      42.200000
      92
      92
      10
    
    
      6
      Washington
      Edmonds
      25
      1
      47.640000
      98
      97
      25
    
    
      7
      Washington
      Everett
      17
      27
      57.823529
      98
      71
      17
    
    
      8
      Washington
      Gold Bar
      5
      53
      70.800000
      92
      39
      5
    
    
      9
      Washington
      Index
      5
      59
      76.000000
      91
      32
      5
    
    
      10
      Washington
      Issaquah
      15
      10
      67.666667
      83
      73
      15
    
    
      11
      Washington
      Kenmore
      12
      7
      53.916667
      97
      90
      12
    
    
      12
      Washington
      Monroe
      12
      14
      47.916667
      85
      71
      12
    
    
      13
      Washington
      Newport Hills
      6
      44
      62.833333
      99
      55
      6
    
    
      14
      Washington
      Redmond
      21
      2
      37.428571
      94
      92
      21
    
    
      15
      Washington
      Renton
      17
      0
      54.176471
      97
      97
      17
    
    
      16
      Washington
      Sammamish
      17
      0
      42.823529
      93
      93
      17
    
    
      17
      Washington
      Seattle
      44
      6
      54.659091
      95
      89
      44
    
    
      18
      Washington
      Snohomish
      10
      10
      52.800000
      95
      85
      10

4j. Limit the results to non-U.S. cities

SELECT CONCAT(e.StateProvinceName, ', ', e.City) AS 'Location' , AVG(e.VacationHours) AS 'Average Vacation Hours' , COUNT(1) AS 'NumEmployees' , MAX(e.VacationHours)-MIN(e.VacationHours) AS 'Difference Vacation Hours' FROM dbo.Employees AS e WHERE e.CountryName != 'United States' GROUP BY e.StateProvinceName, e.City HAVING COUNT(1) > 1 ORDER BY 'Difference Vacation Hours' DESC ;



In [336]:

    
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean',
        'MinVacationHours' : min, 
        'MaxVacationHours' : max,
        'DeltaVacationHours' : lambda x: x.max()-x.min()
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

df = Employees[Employees.CountryName != 'United States'].groupby(by=['StateProvinceName', 'City']).aggregate(aggregations)
df2 = df[df['VacationHours']['DeltaVacationHours']>1]
df2.iloc[np.argsort(df2['VacationHours']['DeltaVacationHours'])]









    Out[336]:






  
    
      
      
      EmployeeID
      VacationHours
    
    
      
      
      NumEmployees
      AverageVacationHours
      DeltaVacationHours
      MinVacationHours
      MaxVacationHours
    
    
      StateProvinceName
      City
      
      
      
      
      
    
  
  
    
      Alberta
      Calgary
      5
      35
      53
      9
      62

Extra

Here are two alternatives to create a 'City, State' column.



In [354]:

    
[cs for cs in map(lambda x,y: str(y)+', '+str(x), Employees.StateProvinceName.values, Employees.City.values)]









    Out[354]:





['Redmond, Washington',
 'Calgary, Alberta',
 'Edmonds, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Issaquah, Washington',
 'Everett, Washington',
 'Seattle, Washington',
 'Bothell, Washington',
 'San Francisco, California',
 'Index, Washington',
 'Redmond, Washington',
 'Snohomish, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Monroe, Washington',
 'Seattle, Washington',
 'Calgary, Alberta',
 'Seattle, Washington',
 'Renton, Washington',
 'Calgary, Alberta',
 'Seattle, Washington',
 'Bothell, Washington',
 'Newport Hills, Washington',
 'Carnation, Washington',
 'Monroe, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Edmonds, Washington',
 'Index, Washington',
 'Sammamish, Washington',
 'Calgary, Alberta',
 'Renton, Washington',
 'Everett, Washington',
 'Duvall, Washington',
 'Bothell, Washington',
 'Issaquah, Washington',
 'Everett, Washington',
 'Sammamish, Washington',
 'Bothell, Washington',
 'Edmonds, Washington',
 'Edmonds, Washington',
 'Edmonds, Washington',
 'Edmonds, Washington',
 'Edmonds, Washington',
 'Snohomish, Washington',
 'Gold Bar, Washington',
 'Redmond, Washington',
 'Seattle, Washington',
 'Everett, Washington',
 'Renton, Washington',
 'Renton, Washington',
 'Monroe, Washington',
 'Bellevue, Washington',
 'Everett, Washington',
 'Seattle, Washington',
 'Redmond, Washington',
 'Gold Bar, Washington',
 'Redmond, Washington',
 'Edmonds, Washington',
 'Sammamish, Washington',
 'Redmond, Washington',
 'Nevada, Utah',
 'Kenmore, Washington',
 'Bothell, Washington',
 'Renton, Washington',
 'Duvall, Washington',
 'Snohomish, Washington',
 'Seattle, Washington',
 'Edmonds, Washington',
 'Renton, Washington',
 'Edmonds, Washington',
 'Bellevue, Washington',
 'Edmonds, Washington',
 'Seattle, Washington',
 'Newport Hills, Washington',
 'Redmond, Washington',
 'Bellevue, Washington',
 'Carnation, Washington',
 'Index, Washington',
 'Snohomish, Washington',
 'Index, Washington',
 'Bellevue, Washington',
 'Everett, Washington',
 'Seattle, Washington',
 'Snohomish, Washington',
 'Edmonds, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Issaquah, Washington',
 'Everett, Washington',
 'Redmond, Washington',
 'Melbourne, Victoria',
 'Issaquah, Washington',
 'Duvall, Washington',
 'Bellevue, Washington',
 'Kent, Washington',
 'Index, Washington',
 'Edmonds, Washington',
 'Everett, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Edmonds, Washington',
 'Bellevue, Washington',
 'Monroe, Washington',
 'Redmond, Washington',
 'Calgary, Alberta',
 'Issaquah, Washington',
 'Gold Bar, Washington',
 'Carnation, Washington',
 'Bellevue, Washington',
 'Everett, Washington',
 'Duvall, Washington',
 'Cambridge, Massachusetts',
 'Kenmore, Washington',
 'Sammamish, Washington',
 'Kenmore, Washington',
 'Snohomish, Washington',
 'Seattle, Washington',
 'Sammamish, Washington',
 'Bellevue, Washington',
 'Renton, Washington',
 'Newport Hills, Washington',
 'Sammamish, Washington',
 'Sammamish, Washington',
 'Monroe, Washington',
 'Seattle, Washington',
 'Kenmore, Washington',
 'Monroe, Washington',
 'Everett, Washington',
 'Bellevue, Washington',
 'Snohomish, Washington',
 'Redmond, Washington',
 'Seattle, Washington',
 'Renton, Washington',
 'Seattle, Washington',
 'Seattle, Washington',
 'Monroe, Washington',
 'Edmonds, Washington',
 'Kenmore, Washington',
 'Duvall, Washington',
 'Seattle, Washington',
 'Kenmore, Washington',
 'Everett, Washington',
 'Issaquah, Washington',
 'Sammamish, Washington',
 'Sammamish, Washington',
 'Seattle, Washington',
 'Seattle, Washington',
 'Duvall, Washington',
 'Seattle, Washington',
 'Edmonds, Washington',
 'Everett, Washington',
 'Kenmore, Washington',
 'Kenmore, Washington',
 'Seattle, Washington',
 'Kenmore, Washington',
 'Renton, Washington',
 'Bothell, Washington',
 'Bellevue, Washington',
 'Minneapolis, Minnesota',
 'Snohomish, Washington',
 'San Francisco, California',
 'Monroe, Washington',
 'Bothell, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Snohomish, Washington',
 'Issaquah, Washington',
 'Renton, Washington',
 'Edmonds, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Portland, Oregon',
 'Redmond, Washington',
 'Bothell, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Renton, Washington',
 'Seattle, Washington',
 'Issaquah, Washington',
 'Everett, Washington',
 'Seattle, Washington',
 'Kenmore, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Edmonds, Washington',
 'Sammamish, Washington',
 'Bothell, Washington',
 'Redmond, Washington',
 'Renton, Washington',
 'Sammamish, Washington',
 'Bellevue, Washington',
 'Kenmore, Washington',
 'Sammamish, Washington',
 'Seattle, Washington',
 'Seattle, Washington',
 'Sammamish, Washington',
 'Renton, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Snohomish, Washington',
 'Bellevue, Washington',
 'Redmond, Washington',
 'Edmonds, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Issaquah, Washington',
 'Seattle, Washington',
 'Bothell, Washington',
 'Edmonds, Washington',
 'Renton, Washington',
 'Seattle, Washington',
 'Seattle, Washington',
 'Everett, Washington',
 'Issaquah, Washington',
 'Seattle, Washington',
 'Monroe, Washington',
 'Seattle, Washington',
 'Issaquah, Washington',
 'Bellevue, Washington',
 'Sammamish, Washington',
 'Seattle, Washington',
 'Everett, Washington',
 'Bellevue, Washington',
 'Sammamish, Washington',
 'Renton, Washington',
 'Duluth, Minnesota',
 'Redmond, Washington',
 'Gold Bar, Washington',
 'Redmond, Washington',
 'Edmonds, Washington',
 'Detroit, Michigan',
 'Bothell, Washington',
 'Kenmore, Washington',
 'Duvall, Washington',
 'Seattle, Washington',
 'Memphis, Tennessee',
 'Monroe, Washington',
 'Sammamish, Washington',
 'Edmonds, Washington',
 'Duvall, Washington',
 'Redmond, Washington',
 'Issaquah, Washington',
 'Everett, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Bellevue, Washington',
 'Seattle, Washington',
 'Bellevue, Washington',
 'Monroe, Washington',
 'Newport Hills, Washington',
 'Redmond, Washington',
 'Ottawa, Ontario',
 'Sammamish, Washington',
 'Everett, Washington',
 'Edmonds, Washington',
 'Newport Hills, Washington',
 'Seattle, Washington',
 'Issaquah, Washington',
 'Bellevue, Washington',
 'Redmond, Washington',
 'Bordeaux, Gironde',
 'Bellevue, Washington',
 'Renton, Washington',
 'Issaquah, Washington',
 'Renton, Washington',
 'Seattle, Washington',
 'Bothell, Washington',
 'Gold Bar, Washington',
 'Monroe, Washington',
 'Carnation, Washington',
 'Newport Hills, Washington',
 'Duvall, Washington',
 'Bellevue, Washington',
 'Carnation, Washington',
 'Edmonds, Washington',
 'Edmonds, Washington',
 'Bothell, Washington',
 'Redmond, Washington',
 'Redmond, Washington',
 'Issaquah, Washington',
 'Redmond, Washington',
 'Duvall, Washington',
 'Seattle, Washington',
 'Cambridge, England',
 'Berlin, Hamburg',
 'Atlanta, Georgia']



In [356]:

    
Employees.City.str.cat(Employees.StateProvinceName, sep=', ')









    Out[356]:





0            Redmond, Washington
1               Calgary, Alberta
2            Edmonds, Washington
3            Seattle, Washington
4           Bellevue, Washington
5            Seattle, Washington
6           Issaquah, Washington
7            Everett, Washington
8            Seattle, Washington
9            Bothell, Washington
10     San Francisco, California
11             Index, Washington
12           Redmond, Washington
13         Snohomish, Washington
14          Bellevue, Washington
15           Seattle, Washington
16            Monroe, Washington
17           Seattle, Washington
18              Calgary, Alberta
19           Seattle, Washington
20            Renton, Washington
21              Calgary, Alberta
22           Seattle, Washington
23           Bothell, Washington
24     Newport Hills, Washington
25         Carnation, Washington
26            Monroe, Washington
27           Seattle, Washington
28          Bellevue, Washington
29           Edmonds, Washington
                 ...            
261          Seattle, Washington
262         Issaquah, Washington
263         Bellevue, Washington
264          Redmond, Washington
265            Bordeaux, Gironde
266         Bellevue, Washington
267           Renton, Washington
268         Issaquah, Washington
269           Renton, Washington
270          Seattle, Washington
271          Bothell, Washington
272         Gold Bar, Washington
273           Monroe, Washington
274        Carnation, Washington
275    Newport Hills, Washington
276           Duvall, Washington
277         Bellevue, Washington
278        Carnation, Washington
279          Edmonds, Washington
280          Edmonds, Washington
281          Bothell, Washington
282          Redmond, Washington
283          Redmond, Washington
284         Issaquah, Washington
285          Redmond, Washington
286           Duvall, Washington
287          Seattle, Washington
288           Cambridge, England
289              Berlin, Hamburg
290             Atlanta, Georgia
Name: City, dtype: object

Summary

In the simplest form one can specify either rows or columns



In [ ]:

    
# First ten rows
Employees[0:10]



In [5]:

    
# Rows satisfying a particular condition
Employees[Employees.MaritalStatus=='M']









    Out[5]:






  
    
      
      EmployeeID
      FirstName
    
  
  
    
      0
      259
      Ben
    
    
      1
      278
      Garrett
    
    
      2
      204
      Gabe
    
    
      3
      78
      Reuben
    
    
      4
      255
      Gordon
    
    
      10
      10
      Michael
    
    
      12
      175
      Reed
    
    
      14
      144
      Paul
    
    
      15
      192
      Brenda
    
    
      18
      203
      Ken
    
    
      19
      245
      Barbara
    
    
      24
      163
      Alex
    
    
      25
      118
      Don
    
    
      26
      176
      David
    
    
      27
      147
      Sandra
    
    
      28
      138
      Samantha
    
    
      29
      95
      Jim
    
    
      30
      231
      Jo
    
    
      31
      86
      Ryan
    
    
      33
      214
      Andreas
    
    
      34
      72
      Steven
    
    
      36
      107
      Christopher
    
    
      37
      269
      Dan
    
    
      38
      187
      Yvonne
    
    
      40
      12
      Thierry
    
    
      42
      30
      Britta
    
    
      43
      198
      Lorraine
    
    
      45
      31
      Margie
    
    
      48
      3
      Roberto
    
    
      49
      199
      Paula
    
    
      ...
      ...
      ...
    
    
      227
      136
      Sylvester
    
    
      231
      35
      Brandon
    
    
      232
      228
      Christian
    
    
      233
      223
      Sairaj
    
    
      234
      100
      Lolan
    
    
      239
      200
      Frank
    
    
      240
      279
      Tsvi
    
    
      243
      73
      Carole
    
    
      249
      21
      Terry
    
    
      252
      254
      Fukiko
    
    
      255
      53
      Diane
    
    
      256
      282
      José
    
    
      258
      209
      Kathie
    
    
      259
      50
      Sidney
    
    
      261
      70
      David
    
    
      264
      140
      Prasanna
    
    
      266
      5
      Gail
    
    
      267
      252
      Arvind
    
    
      269
      233
      Magnus
    
    
      270
      83
      Patrick
    
    
      272
      241
      David
    
    
      275
      239
      Mindy
    
    
      278
      38
      Kim
    
    
      279
      154
      Raymond
    
    
      280
      114
      Mindaugas
    
    
      281
      123
      Vamsi
    
    
      283
      101
      Houman
    
    
      284
      221
      Chris
    
    
      288
      289
      Jae
    
    
      290
      999
      Chadwick
    
  

147 rows × 2 columns



In [ ]:

    
# Select sertain columns
Employees[['EmployeeID', 'FirstName']]

Since expressions return DataFrames we can chain method calls and [...] operators

The order matters!



In [9]:

    
Employees[0:10][Employees.MaritalStatus=='M'][['EmployeeID', 'FirstName', 'MaritalStatus']]









    



/usr/lib/python3.4/site-packages/ipykernel/__main__.py:1: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
  if __name__ == '__main__':






    Out[9]:






  
    
      
      EmployeeID
      FirstName
      MaritalStatus
    
  
  
    
      0
      259
      Ben
      M
    
    
      1
      278
      Garrett
      M
    
    
      2
      204
      Gabe
      M
    
    
      3
      78
      Reuben
      M
    
    
      4
      255
      Gordon
      M



In [10]:

    
Employees[Employees.MaritalStatus=='M'][['EmployeeID', 'FirstName', 'MaritalStatus']][0:10]









    Out[10]:






  
    
      
      EmployeeID
      FirstName
      MaritalStatus
    
  
  
    
      0
      259
      Ben
      M
    
    
      1
      278
      Garrett
      M
    
    
      2
      204
      Gabe
      M
    
    
      3
      78
      Reuben
      M
    
    
      4
      255
      Gordon
      M
    
    
      10
      10
      Michael
      M
    
    
      12
      175
      Reed
      M
    
    
      14
      144
      Paul
      M
    
    
      15
      192
      Brenda
      M
    
    
      18
      203
      Ken
      M

Recommended to use .loc .iloc .at methods

Grouping should be performed by on factors (i.e. discrete values), aggregation usually on numeric values.



In [14]:

    
Employees.groupby(by=['MaritalStatus']).VacationHours.sum()









    Out[14]:





MaritalStatus
M    7075
S    7658
Name: VacationHours, dtype: int64

Use 'aggregate' for more complex groupings and aggregations



In [ ]:

    
aggregations = {
    'VacationHours' : {
        'AverageVacationHours' : 'mean',
        'MinVacationHours' : min, 
        'MaxVacationHours' : max,
        'DeltaVacationHoues' : lambda x: x.max()-x.min()
    },
    'EmployeeID' : {
        'NumEmployees' : 'count'
    },
}

Employees.groupby(by=['StateProvinceName', 'City']).aggregate(aggregations) \
.sort_index(ascending=False, axis=0)

Assign intermediate results to new variables, and perform next operation on them

In particular to saticfy filtering ('HAVING') and sorting on aggregated values

Columns are not lists. Methods on columns are defined in Pandas.Series and NumPy



In [18]:

    
help(np.min)
help(pd.Series.between)









    



Help on function amin in module numpy.core.fromnumeric:

amin(a, axis=None, out=None, keepdims=False)
    Return the minimum of an array or minimum along an axis.
    
    Parameters
    ----------
    a : array_like
        Input data.
    axis : None or int or tuple of ints, optional
        Axis or axes along which to operate.  By default, flattened input is
        used.
    
        .. versionadded: 1.7.0
    
        If this is a tuple of ints, the minimum is selected over multiple axes,
        instead of a single axis or all the axes as before.
    out : ndarray, optional
        Alternative output array in which to place the result.  Must
        be of the same shape and buffer length as the expected output.
        See `doc.ufuncs` (Section "Output arguments") for more details.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left
        in the result as dimensions with size one. With this option,
        the result will broadcast correctly against the original `arr`.
    
    Returns
    -------
    amin : ndarray or scalar
        Minimum of `a`. If `axis` is None, the result is a scalar value.
        If `axis` is given, the result is an array of dimension
        ``a.ndim - 1``.
    
    See Also
    --------
    amax :
        The maximum value of an array along a given axis, propagating any NaNs.
    nanmin :
        The minimum value of an array along a given axis, ignoring any NaNs.
    minimum :
        Element-wise minimum of two arrays, propagating any NaNs.
    fmin :
        Element-wise minimum of two arrays, ignoring any NaNs.
    argmin :
        Return the indices of the minimum values.
    
    nanmax, maximum, fmax
    
    Notes
    -----
    NaN values are propagated, that is if at least one item is NaN, the
    corresponding min value will be NaN as well. To ignore NaN values
    (MATLAB behavior), please use nanmin.
    
    Don't use `amin` for element-wise comparison of 2 arrays; when
    ``a.shape[0]`` is 2, ``minimum(a[0], a[1])`` is faster than
    ``amin(a, axis=0)``.
    
    Examples
    --------
    >>> a = np.arange(4).reshape((2,2))
    >>> a
    array([[0, 1],
           [2, 3]])
    >>> np.amin(a)           # Minimum of the flattened array
    0
    >>> np.amin(a, axis=0)   # Minima along the first axis
    array([0, 1])
    >>> np.amin(a, axis=1)   # Minima along the second axis
    array([0, 2])
    
    >>> b = np.arange(5, dtype=np.float)
    >>> b[2] = np.NaN
    >>> np.amin(b)
    nan
    >>> np.nanmin(b)
    0.0

Help on function between in module pandas.core.series:

between(self, left, right, inclusive=True)
    Return boolean Series equivalent to left <= series <= right. NA values
    will be treated as False
    
    Parameters
    ----------
    left : scalar
        Left boundary
    right : scalar
        Right boundary
    
    Returns
    -------
    is_between : Series



In [ ]:



In [ ]:

	EmployeeID	FirstName	LastName	MaritalStatus
0	259	Ben	Miller	M
1	278	Garrett	Vargas	M
2	204	Gabe	Mares	M
3	78	Reuben	D'sa	M
4	255	Gordon	Hee	M
10	10	Michael	Raheem	M
12	175	Reed	Koch	M
14	144	Paul	Singh	M
15	192	Brenda	Diaz	M
18	203	Ken	Myer	M
19	245	Barbara	Moreland	M
24	163	Alex	Nayberg	M
25	118	Don	Hall	M
26	176	David	Lawrence	M
27	147	Sandra	Reátegui Alayo	M
28	138	Samantha	Smith	M
29	95	Jim	Scardelis	M
30	231	Jo	Berry	M
31	86	Ryan	Cornelsen	M
33	214	Andreas	Berglund	M
34	72	Steven	Selikoff	M
36	107	Christopher	Hill	M
37	269	Dan	Bacon	M
38	187	Yvonne	McKay	M
40	12	Thierry	D'Hers	M
42	30	Britta	Simon	M
43	198	Lorraine	Nay	M
45	31	Margie	Shoop	M
48	3	Roberto	Tamburello	M
49	199	Paula	Nartker	M
...	...	...	...	...
227	136	Sylvester	Valdez	M
231	35	Brandon	Heidepriem	M
232	228	Christian	Kleinerman	M
233	223	Sairaj	Uddin	M
234	100	Lolan	Song	M
239	200	Frank	Lee	M
240	279	Tsvi	Reiter	M
243	73	Carole	Poland	M
249	21	Terry	Eminhizer	M
252	254	Fukiko	Ogisu	M
255	53	Diane	Glimp	M
256	282	José	Saraiva	M
258	209	Kathie	Flood	M
259	50	Sidney	Higa	M
261	70	David	Ortiz	M
264	140	Prasanna	Samarawickrama	M
266	5	Gail	Erickson	M
267	252	Arvind	Rao	M
269	233	Magnus	Hedlund	M
270	83	Patrick	Cook	M
272	241	David	Liu	M
275	239	Mindy	Martin	M
278	38	Kim	Abercrombie	M
279	154	Raymond	Sam	M
280	114	Mindaugas	Krapauskas	M
281	123	Vamsi	Kuppa	M
283	101	Houman	Pournasseh	M
284	221	Chris	Norred	M
288	289	Jae	Pak	M
290	999	Chadwick	Smith	M

	EmployeeID	ManagerID	TerritoryID	Title	FirstName	MiddleName	LastName	Suffix	JobTitle	NationalIDNumber	...	SickLeaveHours	PhoneNumber	PhoneNumberType	EmailAddress	AddressLine1	AddressLine2	City	StateProvinceName	PostalCode	CountryName
9	124	250.0	NaN	NaN	Kim	T	Ralls	NaN	Stocker	420776180	...	69	309-555-0129	Work	kim0@adventure-works.com	1226 Shoe St.	NaN	Bothell	Washington	98011	United States
10	10	25.0	NaN	NaN	Michael	NaN	Raheem	NaN	Research and Development Manager	879342154	...	64	330-555-2568	Work	michael6@adventure-works.com	1234 Seaside Way	NaN	San Francisco	California	94109	United States
16	166	26.0	NaN	NaN	Jack	S	Richins	NaN	Production Supervisor - WC30	60114406	...	54	552-555-0111	Work	jack0@adventure-works.com	1356 Grove Way	NaN	Monroe	Washington	98272	United States
27	147	26.0	NaN	NaN	Sandra	NaN	Reátegui Alayo	NaN	Production Technician - WC30	604664374	...	38	896-555-0168	Work	sandra0@adventure-works.com	1619 Stillman Court	NaN	Seattle	Washington	98104	United States
44	133	26.0	NaN	NaN	Michael	L	Rothkugel	NaN	Production Technician - WC40	830150469	...	55	454-555-0119	Work	michael1@adventure-works.com	207 Berry Court	NaN	Edmonds	Washington	98020	United States
95	44	26.0	NaN	NaN	Simon	D	Rapier	NaN	Production Technician - WC60	276751903	...	39	963-555-0134	Work	simon0@adventure-works.com	3421 Bouncing Road	NaN	Duvall	Washington	98019	United States
99	65	26.0	NaN	NaN	Randy	T	Reeves	NaN	Production Technician - WC60	8066363	...	34	961-555-0122	Cell	randy0@adventure-works.com	3632 Bank Way	NaN	Edmonds	Washington	98020	United States
128	145	26.0	NaN	NaN	Cynthia	S	Randall	NaN	Production Supervisor - WC30	386315192	...	54	352-555-0138	Cell	cynthia0@adventure-works.com	463 H Stagecoach Rd.	NaN	Kenmore	Washington	98028	United States
131	149	26.0	NaN	NaN	Andy	M	Ruth	NaN	Production Technician - WC30	764853868	...	39	118-555-0110	Cell	andy0@adventure-works.com	4777 Rockne Drive	NaN	Bellevue	Washington	98004	United States
166	74	26.0	NaN	NaN	Bjorn	M	Rettig	NaN	Production Technician - WC30	420023788	...	41	199-555-0117	Work	bjorn0@adventure-works.com	5802 Ampersand Drive	NaN	Seattle	Washington	98104	United States
180	71	26.0	NaN	NaN	Michael	Sean	Ray	NaN	Production Supervisor - WC30	578935259	...	55	156-555-0199	Cell	michael3@adventure-works.com	6498 Mining Rd.	NaN	Seattle	Washington	98104	United States
188	190	26.0	NaN	NaN	Robert	J	Rounthwaite	NaN	Production Technician - WC45	456839592	...	61	589-555-0147	Cell	robert0@adventure-works.com	6843 San Simeon Dr.	NaN	Seattle	Washington	98104	United States
219	157	26.0	NaN	NaN	Linda	A	Randall	NaN	Production Technician - WC20	812797414	...	26	696-555-0157	Work	linda1@adventure-works.com	77 Birchwood	NaN	Seattle	Washington	98104	United States
240	279	274.0	5.0	NaN	Tsvi	Michael	Reiter	NaN	Sales Representative	716374314	...	34	664-555-0112	Work	tsvi0@adventure-works.com	8291 Crossbow Way	NaN	Memphis	Tennessee	38103	United States
267	252	250.0	NaN	NaN	Arvind	B	Rao	NaN	Buyer	792847334	...	50	848-555-0163	Cell	arvind0@adventure-works.com	9495 Limewood Place	NaN	Renton	Washington	98055	United States

	EmployeeID	FirstName	LastName
0	259	Ben	Miller
8	161	Kirk	Koenigsbauer
18	203	Ken	Myer
49	199	Paula	Nartker
53	41	Bryan	Baker
56	104	Mary	Baker
64	225	Alan	Brewer
75	156	Lane	Sacksteder
95	44	Simon	Rapier
97	96	Elizabeth	Keyser
98	229	Lori	Penor
127	39	Ed	Dudenhoefer
129	42	James	Kramer
165	117	Chad	Niswonger
168	57	Frank	Miller
172	105	Kevin	Homer
173	253	Linda	Meisner
196	7	Dylan	Miller
200	183	Barbara	Decker
203	142	Olinda	Turner
216	94	Russell	Hunter
222	273	Brian	Welcker
229	216	Sean	Alexander
240	279	Tsvi	Reiter
249	21	Terry	Eminhizer
254	262	David	Barber
282	29	Mark	McArthur

	EmployeeID	FirstName	LastName
114	284	Tete	Mensa-Annan
134	180	Katie	McAskill-White
176	280	Pamela	Ansman-Wolfe

	EmployeeID	FirstName	LastName	SalariedFlag	VacationHours
6	270	François	Ajenstat	1	67
11	248	Mike	Seamans	1	59
19	245	Barbara	Moreland	1	58
37	269	Dan	Bacon	1	72
57	263	Jean	Trenary	1	65
65	212	Peng	Wu	1	81
66	227	Gary	Altman	1	86
93	286	Lynn	Tsoflias	1	36
101	26	Peter	Krebs	1	43
106	16	David	Bradley	1	40
108	268	Ramesh	Meyyappan	1	73
114	284	Tete	Mensa-Annan	1	39
121	235	Paula	Barreto de Mattos	1	54
123	1	Ken	Sánchez	1	99
125	249	Wendy	Kahn	1	55
130	8	Diane	Margheim	1	62
140	211	Hazem	Abolrous	1	80
145	267	Karen	Berg	1	74
147	250	Sheela	Word	1	49
151	121	Pilar	Ackerman	1	93
167	271	Dan	Wilson	1	66
169	272	Janaina	Bueno	1	71
196	7	Dylan	Miller	1	61
210	264	Stephanie	Conroy	1	68
223	9	Gigi	Matthew	1	63
232	228	Christian	Kleinerman	1	92
235	275	Michael	Blythe	1	38
272	241	David	Liu	1	57
273	25	James	Hamilton	1	64
288	289	Jae	Pak	1	37

	EmployeeID	Title	FirstName	LastName	City
0	259	NaN	Ben	Miller	Redmond
1	278	NaN	Garrett	Vargas	Calgary
2	204	NaN	Gabe	Mares	Edmonds
3	78	NaN	Reuben	D'sa	Seattle
4	255	NaN	Gordon	Hee	Bellevue
5	66	NaN	Karan	Khanna	Seattle
6	270	NaN	François	Ajenstat	Issaquah
7	22	NaN	Sariya	Harnpadoungsataya	Everett
8	161	NaN	Kirk	Koenigsbauer	Seattle
9	124	NaN	Kim	Ralls	Bothell
10	10	NaN	Michael	Raheem	San Francisco
11	248	NaN	Mike	Seamans	Index
12	175	NaN	Reed	Koch	Redmond
13	155	NaN	Fadi	Fakhouri	Snohomish
14	144	NaN	Paul	Singh	Bellevue
15	192	NaN	Brenda	Diaz	Seattle
16	166	NaN	Jack	Richins	Monroe
17	112	NaN	John	Evans	Seattle
18	203	NaN	Ken	Myer	Calgary
19	245	NaN	Barbara	Moreland	Seattle
20	257	NaN	Eric	Kurjan	Renton
21	244	NaN	Bryan	Walton	Calgary
22	69	NaN	Steve	Masters	Seattle
23	210	NaN	Belinda	Newman	Bothell
24	163	NaN	Alex	Nayberg	Newport Hills
25	118	NaN	Don	Hall	Carnation
26	176	NaN	David	Lawrence	Monroe
27	147	NaN	Sandra	Reátegui Alayo	Seattle
28	138	NaN	Samantha	Smith	Bellevue
29	95	NaN	Jim	Scardelis	Edmonds
...	...	...	...	...	...
260	222	NaN	A. Scott	Wright	Newport Hills
261	70	NaN	David	Ortiz	Seattle
262	219	NaN	Sean	Chai	Issaquah
263	148	NaN	Jason	Watters	Bellevue
264	140	NaN	Prasanna	Samarawickrama	Redmond
265	290	NaN	Ranjit	Varkey Chudukatil	Bordeaux
267	252	NaN	Arvind	Rao	Renton
268	266	NaN	Peter	Connelly	Issaquah
269	233	NaN	Magnus	Hedlund	Renton
270	83	NaN	Patrick	Cook	Seattle
271	251	NaN	Mikael	Sandberg	Bothell
272	241	NaN	David	Liu	Gold Bar
273	25	NaN	James	Hamilton	Monroe
274	108	NaN	Jinghao	Liu	Carnation
275	239	NaN	Mindy	Martin	Newport Hills
276	27	NaN	Jo	Brown	Duvall
277	186	NaN	Shane	Kim	Bellevue
278	38	NaN	Kim	Abercrombie	Carnation
279	154	NaN	Raymond	Sam	Edmonds
280	114	NaN	Mindaugas	Krapauskas	Edmonds
281	123	NaN	Vamsi	Kuppa	Bothell
282	29	NaN	Mark	McArthur	Redmond
283	101	NaN	Houman	Pournasseh	Redmond
284	221	NaN	Chris	Norred	Issaquah
285	18	NaN	John	Wood	Redmond
286	46	NaN	Eugene	Kogan	Duvall
287	202	NaN	Tawana	Nusbaum	Seattle
288	289	NaN	Jae	Pak	Cambridge
289	288	NaN	Rachel	Valdez	Berlin
290	999	NaN	Chadwick	Smith	Atlanta

		min	max
Gender	SalariedFlag
F	0	1948-05-25	1985-07-01
F	1	1946-10-29	1980-07-06
M	0	1946-04-03	1985-05-07
M	1	1945-11-17	1982-04-14

	mean	amin
JobTitle
Accountant	58.500000	58
Accounts Manager	57.000000	57
Accounts Payable Specialist	63.500000	63
Accounts Receivable Specialist	61.000000	60
Application Specialist	72.500000	71
Assistant to the Chief Financial Officer	56.000000	56
BI Professor	55.000000	55
Benefits Specialist	51.000000	51
Buyer	56.000000	52
Chief Executive Officer	99.000000	99
Chief Financial Officer	0.000000	0
Control Specialist	75.500000	75
Database Administrator	66.500000	66
Design Engineer	5.000000	4
Document Control Assistant	78.500000	78
Document Control Manager	77.000000	77
Engineering Manager	2.000000	2
European Sales Manager	21.000000	21
Facilities Administrative Assistant	87.000000	87
Facilities Manager	86.000000	86
Finance Manager	55.000000	55
Human Resources Administrative Assistant	52.500000	52
Human Resources Manager	54.000000	54
Information Services Manager	65.000000	65
Janitor	89.500000	88
Maintenance Supervisor	92.000000	92
Marketing Assistant	42.000000	41
Marketing Manager	40.000000	40
Marketing Specialist	46.000000	44
Master Scheduler	44.000000	44
...	...	...
Production Supervisor - WC40	72.000000	71
Production Supervisor - WC45	75.000000	74
Production Supervisor - WC50	78.000000	77
Production Supervisor - WC60	81.000000	80
Production Technician - WC10	91.000000	83
Production Technician - WC20	10.500000	0
Production Technician - WC30	34.000000	22
Production Technician - WC40	59.500000	47
Production Technician - WC45	80.000000	73
Production Technician - WC50	46.653846	0
Production Technician - WC60	26.500000	14
Purchasing Assistant	50.500000	50
Purchasing Manager	49.000000	49
Quality Assurance Manager	80.000000	80
Quality Assurance Supervisor	81.000000	81
Quality Assurance Technician	83.500000	82
Recruiter	49.500000	49
Research and Development Engineer	62.500000	62
Research and Development Manager	38.500000	16
Sales Representative	31.000000	22
Scheduling Assistant	46.500000	45
Senior Design Engineer	3.000000	3
Senior Tool Designer	27.500000	7
Shipping and Receiving Clerk	94.500000	94
Shipping and Receiving Supervisor	93.000000	93
Stocker	97.000000	96
Tool Designer	8.500000	8
Vice President of Engineering	1.000000	1
Vice President of Production	64.000000	64
Vice President of Sales	10.000000	10

		mean	amin
JobTitle	MaritalStatus
Accountant	M	58.000000	58
Accountant	S	59.000000	59
Accounts Manager	M	57.000000	57
Accounts Payable Specialist	M	63.500000	63
Accounts Receivable Specialist	M	60.000000	60
Accounts Receivable Specialist	S	61.500000	61
Application Specialist	M	71.500000	71
Application Specialist	S	73.500000	73
Assistant to the Chief Financial Officer	S	56.000000	56
BI Professor	M	55.000000	55
Benefits Specialist	M	51.000000	51
Buyer	M	56.333333	52
Buyer	S	55.333333	53
Chief Executive Officer	S	99.000000	99
Chief Financial Officer	M	0.000000	0
Control Specialist	M	75.000000	75
Control Specialist	S	76.000000	76
Database Administrator	M	66.000000	66
Database Administrator	S	67.000000	67
Design Engineer	M	5.000000	4
Document Control Assistant	M	79.000000	79
Document Control Assistant	S	78.000000	78
Document Control Manager	M	77.000000	77
Engineering Manager	M	2.000000	2
European Sales Manager	M	21.000000	21
Facilities Administrative Assistant	M	87.000000	87
Facilities Manager	M	86.000000	86
Finance Manager	S	55.000000	55
Human Resources Administrative Assistant	S	52.500000	52
Human Resources Manager	M	54.000000	54
...	...	...	...
Production Technician - WC40	S	61.857143	47
Production Technician - WC45	M	80.833333	76
Production Technician - WC45	S	79.444444	73
Production Technician - WC50	M	40.076923	1
Production Technician - WC50	S	53.230769	0
Production Technician - WC60	M	24.714286	14
Production Technician - WC60	S	28.583333	16
Purchasing Assistant	M	50.500000	50
Purchasing Manager	S	49.000000	49
Quality Assurance Manager	S	80.000000	80
Quality Assurance Supervisor	M	81.000000	81
Quality Assurance Technician	M	84.500000	84
Quality Assurance Technician	S	82.500000	82
Recruiter	S	49.500000	49
Research and Development Engineer	M	63.000000	63
Research and Development Engineer	S	62.000000	62
Research and Development Manager	M	38.500000	16
Sales Representative	M	31.714286	26
Sales Representative	S	30.285714	22
Scheduling Assistant	M	46.500000	45
Senior Design Engineer	S	3.000000	3
Senior Tool Designer	S	27.500000	7
Shipping and Receiving Clerk	M	94.500000	94
Shipping and Receiving Supervisor	S	93.000000	93
Stocker	M	96.000000	96
Stocker	S	97.500000	97
Tool Designer	M	8.500000	8
Vice President of Engineering	S	1.000000	1
Vice President of Production	S	64.000000	64
Vice President of Sales	S	10.000000	10

			mean	amin	amax
JobTitle	MaritalStatus	StateProvinceName
Accountant	M	Washington	58.000000	58	58
Accountant	S	Washington	59.000000	59	59
Accounts Manager	M	Washington	57.000000	57	57
Accounts Payable Specialist	M	Washington	63.500000	63	64
Accounts Receivable Specialist	M	Washington	60.000000	60	60
	S	Alberta	62.000000	62	62
	S	Washington	61.000000	61	61
Application Specialist	M	Washington	71.500000	71	72
Application Specialist	S	Washington	73.500000	73	74
Assistant to the Chief Financial Officer	S	Washington	56.000000	56	56
BI Professor	M	Georgia	55.000000	55	55
Benefits Specialist	M	Washington	51.000000	51	51
Buyer	M	Washington	56.333333	52	60
Buyer	S	Washington	55.333333	53	59
Chief Executive Officer	S	Washington	99.000000	99	99
Chief Financial Officer	M	Washington	0.000000	0	0
Control Specialist	M	Washington	75.000000	75	75
Control Specialist	S	Washington	76.000000	76	76
Database Administrator	M	Washington	66.000000	66	66
Database Administrator	S	Washington	67.000000	67	67
Design Engineer	M	Washington	5.000000	4	6
Document Control Assistant	M	Washington	79.000000	79	79
Document Control Assistant	S	Washington	78.000000	78	78
Document Control Manager	M	Washington	77.000000	77	77
Engineering Manager	M	Washington	2.000000	2	2
European Sales Manager	M	Washington	21.000000	21	21
Facilities Administrative Assistant	M	Washington	87.000000	87	87
Facilities Manager	M	Washington	86.000000	86	86
Finance Manager	S	Washington	55.000000	55	55
Human Resources Administrative Assistant	S	Washington	52.500000	52	53
...	...	...	...	...	...
Research and Development Engineer	M	Washington	63.000000	63	63
Research and Development Engineer	S	Washington	62.000000	62	62
Research and Development Manager	M	California	16.000000	16	16
Research and Development Manager	M	Washington	61.000000	61	61
Sales Representative	M	Alberta	33.000000	33	33
		California	26.000000	26	26
		England	37.000000	37	37
		Massachusetts	39.000000	39	39
		Ontario	31.000000	31	31
		Tennessee	29.000000	29	29
		Utah	27.000000	27	27
	S	Gironde	34.000000	34	34
		Hamburg	35.000000	35	35
		Michigan	38.000000	38	38
		Minnesota	24.000000	24	24
		Oregon	22.000000	22	22
		Victoria	36.000000	36	36
		Washington	23.000000	23	23
Scheduling Assistant	M	Washington	46.500000	45	48
Senior Design Engineer	S	Washington	3.000000	3	3
Senior Tool Designer	S	Minnesota	48.000000	48	48
Senior Tool Designer	S	Washington	7.000000	7	7
Shipping and Receiving Clerk	M	Washington	94.500000	94	95
Shipping and Receiving Supervisor	S	Washington	93.000000	93	93
Stocker	M	Washington	96.000000	96	96
Stocker	S	Washington	97.500000	97	98
Tool Designer	M	Washington	8.500000	8	9
Vice President of Engineering	S	Washington	1.000000	1	1
Vice President of Production	S	Washington	64.000000	64	64
Vice President of Sales	S	Washington	10.000000	10	10

		VacationHours	EmployeeID
		AverageVacationHours	NumEmployees
StateProvinceName	City
Alberta	Calgary	35.000000	5
California	San Francisco	21.000000	2
England	Cambridge	37.000000	1
Georgia	Atlanta	55.000000	1
Gironde	Bordeaux	34.000000	1
Hamburg	Berlin	35.000000	1
Massachusetts	Cambridge	39.000000	1
Michigan	Detroit	38.000000	1
Minnesota	Duluth	24.000000	1
Minnesota	Minneapolis	48.000000	1
Ontario	Ottawa	31.000000	1
Oregon	Portland	22.000000	1
Tennessee	Memphis	29.000000	1
Utah	Nevada	27.000000	1
Victoria	Melbourne	36.000000	1
Washington	Bellevue	44.305556	36
	Bothell	65.692308	13
	Carnation	59.000000	5
	Duvall	42.200000	10
	Edmonds	47.640000	25
	Everett	57.823529	17
	Gold Bar	70.800000	5
	Index	76.000000	5
	Issaquah	67.666667	15
	Kenmore	53.916667	12
	Kent	5.000000	1
	Monroe	47.916667	12
	Newport Hills	62.833333	6
	Redmond	37.428571	21
	Renton	54.176471	17
	Sammamish	42.823529	17
	Seattle	54.659091	44
	Snohomish	52.800000	10

	EmployeeID	VacationHours
	NumEmployees	AverageVacationHours
CityState
Snohomish, Washington	10	52.800000
Seattle, Washington	44	54.659091
San Francisco, California	2	21.000000
Sammamish, Washington	17	42.823529
Renton, Washington	17	54.176471
Redmond, Washington	21	37.428571
Portland, Oregon	1	22.000000
Ottawa, Ontario	1	31.000000
Newport Hills, Washington	6	62.833333
Nevada, Utah	1	27.000000
Monroe, Washington	12	47.916667
Minneapolis, Minnesota	1	48.000000
Memphis, Tennessee	1	29.000000
Melbourne, Victoria	1	36.000000
Kent, Washington	1	5.000000
Kenmore, Washington	12	53.916667
Issaquah, Washington	15	67.666667
Index, Washington	5	76.000000
Gold Bar, Washington	5	70.800000
Everett, Washington	17	57.823529
Edmonds, Washington	25	47.640000
Duvall, Washington	10	42.200000
Duluth, Minnesota	1	24.000000
Detroit, Michigan	1	38.000000
Carnation, Washington	5	59.000000
Cambridge, Massachusetts	1	39.000000
Cambridge, England	1	37.000000
Calgary, Alberta	5	35.000000
Bothell, Washington	13	65.692308
Bordeaux, Gironde	1	34.000000
Berlin, Hamburg	1	35.000000
Bellevue, Washington	36	44.305556
Atlanta, Georgia	1	55.000000

		EmployeeID	VacationHours
		NumEmployees	AverageVacationHours	DeltaVacationHoues	MinVacationHours	MaxVacationHours
StateProvinceName	City
Washington	Snohomish	10	52.800000	85	10	95
	Seattle	44	54.659091	89	6	95
	Sammamish	17	42.823529	93	0	93
	Renton	17	54.176471	97	0	97
	Redmond	21	37.428571	92	2	94
	Newport Hills	6	62.833333	55	44	99
	Monroe	12	47.916667	71	14	85
	Kent	1	5.000000	0	5	5
	Kenmore	12	53.916667	90	7	97
	Issaquah	15	67.666667	73	10	83
	Index	5	76.000000	32	59	91
	Gold Bar	5	70.800000	39	53	92
	Everett	17	57.823529	71	27	98
	Edmonds	25	47.640000	97	1	98
	Duvall	10	42.200000	92	0	92
	Carnation	5	59.000000	64	24	88
	Bothell	13	65.692308	90	9	99
	Bellevue	36	44.305556	95	1	96
Victoria	Melbourne	1	36.000000	0	36	36
Utah	Nevada	1	27.000000	0	27	27
Tennessee	Memphis	1	29.000000	0	29	29
Oregon	Portland	1	22.000000	0	22	22
Ontario	Ottawa	1	31.000000	0	31	31
Minnesota	Minneapolis	1	48.000000	0	48	48
Minnesota	Duluth	1	24.000000	0	24	24
Michigan	Detroit	1	38.000000	0	38	38
Massachusetts	Cambridge	1	39.000000	0	39	39
Hamburg	Berlin	1	35.000000	0	35	35
Gironde	Bordeaux	1	34.000000	0	34	34
Georgia	Atlanta	1	55.000000	0	55	55
England	Cambridge	1	37.000000	0	37	37
California	San Francisco	2	21.000000	10	16	26
Alberta	Calgary	5	35.000000	53	9	62

		EmployeeID	VacationHours
		NumEmployees	AverageVacationHours	DeltaVacationHours	MinVacationHours	MaxVacationHours
StateProvinceName	City
Oregon	Portland	1	22.000000	0	22	22
England	Cambridge	1	37.000000	0	37	37
Georgia	Atlanta	1	55.000000	0	55	55
Gironde	Bordeaux	1	34.000000	0	34	34
Hamburg	Berlin	1	35.000000	0	35	35
Massachusetts	Cambridge	1	39.000000	0	39	39
Michigan	Detroit	1	38.000000	0	38	38
Minnesota	Duluth	1	24.000000	0	24	24
Minnesota	Minneapolis	1	48.000000	0	48	48
Ontario	Ottawa	1	31.000000	0	31	31
Washington	Kent	1	5.000000	0	5	5
Tennessee	Memphis	1	29.000000	0	29	29
Utah	Nevada	1	27.000000	0	27	27
Victoria	Melbourne	1	36.000000	0	36	36
California	San Francisco	2	21.000000	10	16	26
Washington	Index	5	76.000000	32	59	91
Washington	Gold Bar	5	70.800000	39	53	92
Alberta	Calgary	5	35.000000	53	9	62
Washington	Newport Hills	6	62.833333	55	44	99
	Carnation	5	59.000000	64	24	88
	Monroe	12	47.916667	71	14	85
	Everett	17	57.823529	71	27	98
	Issaquah	15	67.666667	73	10	83
	Snohomish	10	52.800000	85	10	95
	Seattle	44	54.659091	89	6	95
	Kenmore	12	53.916667	90	7	97
	Bothell	13	65.692308	90	9	99
	Duvall	10	42.200000	92	0	92
	Redmond	21	37.428571	92	2	94
	Sammamish	17	42.823529	93	0	93
	Bellevue	36	44.305556	95	1	96
	Renton	17	54.176471	97	0	97
	Edmonds	25	47.640000	97	1	98