Querying hierarchical data with Rabbit

We reproduce the queries from the section on querying hierarchical data using Rabbit syntax.

The database

We start with loading a sample database. Our sample database is derived from the dataset of all employees of the city of Chicago (source).


In [1]:
include("../citydb.jl")

using RBT
setdb(citydb)


Out[1]:
Department:
  name :: UTF8String # unique
  employee (inverse of Employee.department) :: Array{Employee,1} # unique, covering
Employee:
  name :: UTF8String
  surname :: UTF8String
  position :: UTF8String
  salary :: Int64
  department :: Department
  managed_by :: Nullable{Employee}
  manages (inverse of Employee.managed_by) :: Array{Employee,1} # unique

We can execute a query using @query() command:


In [2]:
@query(6*(3+4))


Out[2]:
42

Traversing the hierarchy

Find the names of all departments.


In [3]:
@query(department.name)


Out[3]:
35-element Array{UTF8String,1}:
 "WATER MGMNT"      
 "POLICE"           
 "GENERAL SERVICES" 
 "CITY COUNCIL"     
 "STREETS & SAN"    
 ⋮                  
 "BOARD OF ETHICS"  
 "POLICE BOARD"     
 "BUDGET & MGMT"    
 "ADMIN HEARNG"     
 "LICENSE APPL COMM"

Find the names of all employees.


In [4]:
@query(department.employee.name)


Out[4]:
32181-element Array{UTF8String,1}:
 "ELVIA"     
 "VICENTE"   
 "MUHAMMAD"  
 "GIRLEY"    
 "DILAN"     
 ⋮           
 "NANCY"     
 "DARCI"     
 "THADDEUS"  
 "RACHENETTE"
 "MICHELLE"  

We are not longer restricted by the hierarchical structure of the database, so we can query employees directly.


In [5]:
@query(employee.name)


Out[5]:
32181-element Array{UTF8String,1}:
 "ELVIA"    
 "JEFFERY"  
 "KARINA"   
 "KIMBERLEI"
 "VICENTE"  
 ⋮          
 "MICHAEL"  
 "PETER"    
 "MARK"     
 "CARLO"    
 "DARIUSZ"  

If the traversal ends at an entity class, an array of records is generated.


In [6]:
@query(employee)


Out[6]:
32181-element Array{Tuple{UTF8String,UTF8String,UTF8String,UTF8String,Int64},1}:
 ("ELVIA","A","WATER MGMNT","WATER RATE TAKER",88968)                     
 ("JEFFERY","A","POLICE","POLICE OFFICER",80778)                          
 ("KARINA","A","POLICE","POLICE OFFICER",80778)                           
 ("KIMBERLEI","A","GENERAL SERVICES","CHIEF CONTRACT EXPEDITER",84780)    
 ("VICENTE","A","WATER MGMNT","CIVIL ENGINEER IV",104736)                 
 ⋮                                                                        
 ("MICHAEL","Z","GENERAL SERVICES","FRM OF MACHINISTS - AUTOMOTIVE",97448)
 ("PETER","Z","POLICE","POLICE OFFICER",86520)                            
 ("MARK","Z","POLICE","POLICE OFFICER",83616)                             
 ("CARLO","Z","POLICE","POLICE OFFICER",86520)                            
 ("DARIUSZ","Z","DoIT","CHIEF DATA BASE ANALYST",110352)                  

Summarizing data

Find the number of departments.


In [7]:
@query(count(department))


Out[7]:
35

Find the number of employees for each department.


In [8]:
@query(department.count(employee))


Out[8]:
35-element Array{Int64,1}:
  1848
 13570
   924
   397
  2090
     ⋮
     9
     2
    43
    39
     1

Find the total number of employees.


In [9]:
@query(count(department.employee))


Out[9]:
32181

Again, we can query employee directly.


In [10]:
@query(count(employee))


Out[10]:
32181

Find the top salary among all employees.


In [11]:
@query(max(employee.salary))


Out[11]:
Nullable(260004)

Find the maximum number of employees per department.


In [12]:
@query(max(department.count(employee)))


Out[12]:
Nullable(13570)

Tabular output

For each department, find the number of employees.


In [13]:
@query(department:select(name,count(employee)))


Out[13]:
35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",1848)    
 ("POLICE",13570)        
 ("GENERAL SERVICES",924)
 ("CITY COUNCIL",397)    
 ("STREETS & SAN",2090)  
 ⋮                       
 ("BOARD OF ETHICS",9)   
 ("POLICE BOARD",2)      
 ("BUDGET & MGMT",43)    
 ("ADMIN HEARNG",39)     
 ("LICENSE APPL COMM",1) 

It is easy to add new columns to the output. Let us add the top salary per department.


In [14]:
@query(
    department
    :select(
        name,
        count(employee),
        max(employee.salary)))


Out[14]:
35-element Array{Tuple{UTF8String,Int64,Nullable{Int64}},1}:
 ("WATER MGMNT",1848,Nullable(169512))    
 ("POLICE",13570,Nullable(260004))        
 ("GENERAL SERVICES",924,Nullable(157092))
 ("CITY COUNCIL",397,Nullable(160248))    
 ("STREETS & SAN",2090,Nullable(157092))  
 ⋮                                        
 ("BOARD OF ETHICS",9,Nullable(131688))   
 ("POLICE BOARD",2,Nullable(97728))       
 ("BUDGET & MGMT",43,Nullable(169992))    
 ("ADMIN HEARNG",39,Nullable(156420))     
 ("LICENSE APPL COMM",1,Nullable(69888))  

Filtering data

Find the employees with salary greater than $200k.


In [15]:
@query(
    employee
    :filter(salary>200000))


Out[15]:
3-element Array{Tuple{UTF8String,UTF8String,UTF8String,UTF8String,Int64},1}:
 ("RAHM","E","MAYOR'S OFFICE","MAYOR",216210)            
 ("GARRY","M","POLICE","SUPERINTENDENT OF POLICE",260004)
 ("JOSE","S","FIRE","FIRE COMMISSIONER",202728)          

Find the departments with more than 1000 employees.


In [16]:
@query(
    department
    :filter(count(employee)>1000)
    :select(name, count(employee)))


Out[16]:
7-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",1848)  
 ("POLICE",13570)      
 ("STREETS & SAN",2090)
 ("AVIATION",1344)     
 ("FIRE",4875)         
 ("OEMC",1135)         
 ("TRANSPORTN",1200)   

Find the number of departments with more than 1000 employees.


In [17]:
@query(
    count(
        department
        :filter(count(employee)>1000)))


Out[17]:
7

For each department, find the number of employees with salary higher than $100k.


In [18]:
@query(
    department
    :select(
        name,
        count(employee:filter(salary>100000))))


Out[18]:
35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",179)    
 ("POLICE",1493)        
 ("GENERAL SERVICES",79)
 ("CITY COUNCIL",54)    
 ("STREETS & SAN",39)   
 ⋮                      
 ("BOARD OF ETHICS",2)  
 ("POLICE BOARD",0)     
 ("BUDGET & MGMT",12)   
 ("ADMIN HEARNG",3)     
 ("LICENSE APPL COMM",0)