Risk-limiting audit code for Proportional Representation via Highest Averages

Code and demo from Verifiable European Elections: Risk-limiting Audits for D’Hondt and its relatives by Philip B. Stark and Vanessa Teague, March 26, 2015


  • dHondt(partyTotals, seats, divisors)
  • uMax(win, lose)
  • minSampleSize(ballots, u, gamma=0.95, alpha=0.1)

Demonstrated on Denmark's 2014 European Union Parliamentary election which uses an open list proportional representation voting method, with seats allocated via the highest averages method.

Parties can form coalitions, in which case first seats are allocated across the coalitions, and then from the the seats for each coalition, the parties within the coalition are allocated seats.

Works with both Python 2.7 and Python 3

How the ballots look and are marked (for party, for candidate), from Altinget.dk: This is how the European Parliament will change EU elections

In [1]:
from __future__ import division
from __future__ import print_function

import math
import numpy as np

def dHondt(partyTotals, seats, divisors):
    allocate <seats> seats to parties according to <partyTotals> votes,
    using D'Hondt proportional allocation with <weights> divisors
        partyTotals: list of total votes by party
        seats:       total number of seats to allocate
        divisors:    divisors for proportional allocation. For d'Hondt, divisors are 1, 2, 3, ...
        partySeats:  list of number of seats for each party
        seated:      list of tuples--parties with at least one seat, 
                     number of votes that party got, 
                     and divisor for last seated in the party
        notSeated:   list of tuples--parties with at least one lost seat, 
                     number of votes that party got,
                     and divisor for the first non-seated in the party
        pseudoCandidates: matrix of votes for each pseudocandidate
    pseudoCandidates = np.array([partyTotals,]*seats, ).T/divisors.astype(float)
    sortedPC = np.sort(np.ravel(pseudoCandidates))
    lastSeated = sortedPC[-seats]
    theSeated = np.where(pseudoCandidates >= lastSeated)
    partySeats = np.bincount(theSeated[0], minlength=len(partyTotals)) # number of seats for each party
    inx = np.nonzero(partySeats)[0]  # only those with at least one seat
    seated = list(zip(inx, partyTotals[inx], divisors[partySeats[inx]-1]))
                                           # parties with at least one seat, 
                                           # number of votes that party got,
                                           # and divisor for last seated in the party
    theNotSeated = np.where(pseudoCandidates < lastSeated)
    partyNotSeats = np.bincount(theNotSeated[0], minlength=len(partyTotals)) # number of non-seats for each party
    inx = np.nonzero(partyNotSeats)[0]
    notSeated = list(zip(inx, partyTotals[inx], divisors[partySeats[inx]]))
                                           # parties with at least one unseated, 
                                           # number of votes that party got,
                                           # and divisor for the first non-seated in the party
    if (lastSeated == sortedPC[-(seats+1)]):
        raise ValueError("Tied contest for the last seat!")
        return partySeats, seated, notSeated, lastSeated, pseudoCandidates
def uMax(win, lose):  
    finds the upper bound u on the MICRO for the contest
    win and lose are lists of triples: [party, tally(party), divisor]
    the divisor for win is the largest divisor for any seat the party won
    the divisor for lose is the smallest divisor for any seat the party lost
    See Stark and Teague, 2014, equation 7.
        win:  list of triples--party, tally(party), divisor
        lose: list of triples--party, tally(party), divisor
        maximum possible relative overstatement for any ballot
    new_u = 0.0
    u = new_u
    for w in win:
        for ell in lose:
            if w[0] != ell[0]:
                new_u = (ell[2] + w[2]) / (ell[2]*w[1] - w[2]*ell[1])
                u = max(u, new_u)
                # print "%s," % ((round(u, 7), round(new_u, 7), w, ell),)  # u: %.4g, new_u: %.4g, winner: %s, loser: %s
    return u

def minSampleSize(ballots, u, gamma=0.95, alpha=0.1):
    find smallest sample size for ballot-level comparison audit
    using risk-limit alpha, and cushion gamma \in (0,1)

    1/alpha = (gamma/(1-1/(ballots*u))+1-gamma)**n
        ballots: number of ballots cast in the contest
        u:       upper bound on overstatement per ballot
        gamma:   hedge against finding a ballot that attains the upper bound. Larger values give
                 less protection
        alpha:   risk limit
    return math.ceil(math.log(1.0/alpha) / math.log(gamma/(1.0-1.0/(ballots*u)) + 1.0 - gamma))

Re-tally election results

In [2]:
# Final 2014 Danish EU Parliamentary election results from http://www.dst.dk/valg/Valg1475795/valgopg/valgopgHL.htm
# There were two coalitions: (A,B,F) and (C,V)
#  Official results by party
A = 435245
B = 148949
C = 208262
F = 249305
I = 65480
N = 183724
O = 605889
V = 379840
Ballots = 2332217  # includes invalid and blank ballots
nSeats = 13  # seats to allocate
# allocate seats to coalitions
coalitionTotals = np.array([A+B+F, C+V, I, N, O])  # for coalitions
coalitionSeats, coalitionSeated, coalitionNotSeated, coalitionLastSeated, coalitionPCs = dHondt(coalitionTotals, nSeats, np.arange(1, nSeats+1))
print('A+B+F, C+V, I, N, O:', coalitionSeats)
# allocate seats within coalitions
nABFSeats = coalitionSeats[0]
nCVSeats = coalitionSeats[1]
ABFSeats, ABFSeated, ABFNotSeated, ABFLastSeated, ABFPCs = dHondt(np.array([A, B, F]), nABFSeats, np.arange(1, nABFSeats+1))
CVSeats, CVSeated, CVNotSeated, CVLastSeated, CVPCs = dHondt(np.array([C, V]), nCVSeats, np.arange(1, nCVSeats+1))
print('A, B, F:', ABFSeats, ';  C, V:', CVSeats)
ASeats = ABFSeats[0]
BSeats = ABFSeats[1]
CSeats = CVSeats[0]
FSeats = ABFSeats[2]
ISeats = coalitionSeats[2]
NSeats = coalitionSeats[3]
OSeats = coalitionSeats[4]
VSeats = CVSeats[1]
allSeats = [ASeats, BSeats, CSeats, FSeats, ISeats, NSeats, OSeats, VSeats]

print('---------------\nSeats to parties A, B, C, F, I, N, O, V: ', allSeats)
print('Seated coalitions, votes, divisor:', coalitionSeated)
print('Non-Seated coalitions, votes, divisor:', coalitionNotSeated)

A+B+F, C+V, I, N, O: [5 3 0 1 4]
A, B, F: [3 1 1] ;  C, V: [1 2]
Seats to parties A, B, C, F, I, N, O, V:  [3, 1, 1, 1, 0, 1, 4, 2]
Seated coalitions, votes, divisor: [(0, 833499, 5), (1, 588102, 3), (3, 183724, 1), (4, 605889, 4)]
Non-Seated coalitions, votes, divisor: [(0, 833499, 6), (1, 588102, 4), (2, 65480, 1), (3, 183724, 2), (4, 605889, 5)]

Audit: initial sample size

In [3]:
gamma = 0.95  # tuning constant in the Kaplan-Wald method
alpha = 0.001 # risk limit

u = uMax(coalitionSeated, coalitionNotSeated)

print("Minimum ballot-level comparison sample size = %d\n for max total overstatement = %.2f, u = %.4g, gamma = %.2f, alpha = %.3f" %
      (minSampleSize(Ballots, u, gamma, alpha), Ballots*u, u, gamma, alpha))

Minimum ballot-level comparison sample size = 1903
 for max total overstatement = 262.24, u = 0.0001124, gamma = 0.95, alpha = 0.001