In [ ]:
(ql:quickload :cl-hamt)

In [ ]:
(use-package :cl-hamt)

cl-hamt

This notebook demonstrates how to use the Common Lisp library cl-hamt. cl-hamt provides data structures for sets and dictionaries based on hash array-mapped tries (HAMTs). This data type provides insertion, deletion, and lookup of entries in a collection of size $n$ in $\log_{32}(n)$ time. The space usage is asymptotic to $n\cdot\log_{32}(n)$ + some constant overhead. For large collections, this overhead is a small fraction of the whole, but if you're working with a large number of small-size collections, a list or an association list might be more appropriate. The dictionary data type in the Clojure programming language's standard library is implemented using HAMTs.

The implementation of HAMTs in this library is fully persistent. A persistent collection is never truly modified; rather, when one wishes to add an entry to a collection, a new collection is returned which contains the new element and the old, unmodified collection is preserved. The new augmented collection, however, shares as much structure as possible with the old collection. The garbage collector cleans up any old versions of data structures for us if we're not using them anymore. All told, these persistent collections don't use much more memory than their imperative counterparts. Since persistent collections are fundamentally immutable and one never makes destructive updates, they can be much easier to reason about and debug.

Hash sets

Some basic usage -- adding, removing entries -- of the set API is shown below. The function empty-set creates a set with no entries in it; to populate it, call the function set-insert.


In [ ]:
(empty-set)

In [ ]:
(defvar musicians
    (set-insert (empty-set)
                "Miles Davis"
                "John Coltrane"
                "Charlie Parker"
                "Dizzy Gillespie"
                "Mary Lou Williams"
                "Ella Fitzgerald"
                "Nina Simone"
                "Jascha Heifetz"
                "Hilary Hahn"
                "Stefan Grappelli"
                "Chubby Wise"
                "Lester Flatt"
                "Earl Scruggs"
                "Bill Monroe"))

There are two query operations we can perform on sets: getting the size of the set, and returning whether or not some object is contained in the set.


In [ ]:
(set-size musicians)

In [ ]:
(set-lookup musicians "Nina Simone")

In [ ]:
(set-lookup musicians "Teo Macero")

Adding an entry to a set creates a new set; it does not modify the old one.


In [ ]:
(defvar even-more-musicians (set-insert musicians "Birgit Nilsson"))

In [ ]:
(set-lookup musicians "Birgit Nilsson")

In [ ]:
(set-lookup even-more-musicians "Birgit Nilsson")

Reducing over a collection is the key means of performing an operation on all of its elements. The code below starts with an empty string, then concatenates the name of every musician in the set with a newline after.


In [ ]:
(set-reduce (lambda (name str)
                    (concatenate 'string str (string #\linefeed) name))
            musicians
            "")

Note that HAMTs store the input by hashing it. Hashing does not preserve any natural ordering (e.g. lexicographic) of the entries. Nor does it preserve the order in which they were inserted.

Hash maps

Dictionaries work just like sets do and the names of all the API functions are the same. The main difference is that, when inserting entries into a dictionary, the arguments come in alternating key-value pairs.


In [ ]:
(defvar musicians->instruments
    (dict-insert (empty-dict)
                 "Miles Davis" "trumpet"
                 "John Coltrane" "tenor sax"
                 "Charlie Parker" "alto sax"
                 "Dizzy Gillespie" "trumpet"
                 "Mary Lou Williams" "piano"
                 "Ella Fitzgerald" "voice"
                 "Nina Simone" "voice"
                 "Jascha Heifetz" "violin"
                 "Hilary Hahn" "violin"
                 "Stefan Grappelli" "violin"
                 "Chubby Wise" "violin"
                 "Lester Flatt" "guitar"
                 "Earl Scruggs" "banjo"
                 "Bill Monroe" "mandolin"))

Rather than returning merely true or false when looking up whether an entry is in a dictionary, the function dict-lookup returns the value that this key is mapped to.


In [ ]:
(dict-lookup musicians->instruments "Dizzy Gillespie")

Looking up a key that is not in the dictionary will return nil. Consequently if you were to store a key that actually is mapped to nil, this would be indistinguishable from the key not being in the dictionary at all.


In [ ]:
(dict-lookup musicians->instruments "Karl Rove")

The same operations (filter and reduce) are defined for dictionaries as for sets, but the signature of the filtering or reducing function you pass to it takes in both the key and the value. The code below iterates over the dictionary and creates a list of the names of all the violinists.


In [ ]:
(dict-reduce (lambda (violinists name instrument)
                (if (equal instrument "violin")
                   (cons name violinists)
                   violinists))
             musicians->instruments
             '())

For dictionaries, there are specialized versions of reduce for operating only on the dictionary keys or values. The function below counts the number of violinists rather than assembling all of them into a list.


In [ ]:
(dict-reduce-values (lambda (count instrument)
                      (+ count (if (equal instrument "violin") 1 0)))
                    musicians->instruments
                    0)