Propositional Logic


Improving Boolean Satisfiability Algorithms

Introduction

A propositional formula $\Phi$ in Conjunctive Normal Form (CNF) is a conjunction of clauses $\omega_j$, with $j \in \{1,...,m\}$. Each clause is a disjunction of literals, and each literal is either a positive ($x_i$) or a negative ($\lnot{x_i}$) propositional variable, with $i \in \{1,...,n\}$. Denoting by $[\lnot]$ the possible presence of $\lnot$, we can formally define $\Phi$ as:

$$\bigwedge_{j = 1,...,m}\bigg(\bigvee_{i \in \omega_j} [\lnot] x_i\bigg)$$

The Boolean Satisfiability Problem (SAT) consists in determining whether there exists a truth assignment in $\{0, 1\}$ (or equivalently in $\{True,False\}$) for the variables in $\Phi$ under which the formula evaluates to true.
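
For example, $\Phi = (x_1 \lor \lnot{x_2}) \land (\lnot{x_1} \lor x_2)$ is satisfiable: the assignment $x_1 = x_2 = 1$ satisfies both clauses; by contrast, $x_1 \land \lnot{x_1}$ admits no satisfying assignment.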


In [1]:
from logic import *

DPLL with Branching Heuristics

The Davis-Putnam-Logemann-Loveland (DPLL) algorithm is a complete (it will answer SAT if a solution exists) and sound (it will not answer SAT for an unsatisfiable formula) procedure that combines backtracking search and deduction to decide the satisfiability of propositional logic formulas in CNF. At each search step a variable and a propositional value are selected for branching purposes. With each branching step, two values can be assigned to a variable, either 0 or 1. Branching corresponds to assigning the chosen value to the chosen variable. Afterwards, the logical consequences of each branching step are evaluated. Each time an unsatisfied clause (i.e., a conflict) is identified, backtracking is executed. Backtracking corresponds to undoing branching steps until an unflipped branch is reached. When both values have been assigned to the selected variable at a branching step, backtracking will undo this branching step. If both values have already been considered for the first branching step, and backtracking undoes this first branching step, then the CNF formula can be declared unsatisfiable. This kind of backtracking is called chronological backtracking.

Essentially, DPLL is a backtracking depth-first search through partial truth assignments which uses a splitting rule to replace the original problem with two smaller subproblems, whereas the original Davis-Putnam procedure uses a variable elimination rule which replaces the original problem with one larger subproblem. Over the years, many heuristics have been proposed for choosing the splitting variable (which variable should be assigned a truth value next).

Search algorithms that are based on a predetermined order of search are called static, whereas those that select the next variable at runtime are called dynamic. The first SAT search algorithm, the Davis-Putnam procedure, is a static algorithm. Static search algorithms are usually very slow in practice and for this reason perform worse than dynamic search algorithms. However, dynamic search algorithms are much harder to design, since they require a heuristic for determining the order of search at runtime. The fundamental element of such a heuristic is a branching strategy for selecting the next branching literal. This must not require a lot of time to compute and yet it must provide powerful insight into the problem instance.

Two basic heuristics are applied to this algorithm with the potential of cutting the search space in half. These are the pure literal rule and the unit clause rule.

  • the pure literal rule is applied whenever a variable appears with a single polarity in all the unsatisfied clauses; in this case, assigning the variable the truth value that satisfies all the involved clauses is highly effective in the search;
  • the unit clause rule is applied whenever some variable occurs in the current formula in a clause of length 1; in this case, the literal is assigned the truth value that satisfies the respective clause. The iterative application of the unit rule is commonly referred to as Boolean Constraint Propagation (BCP).
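
A minimal sketch of both rules on a toy formula, using the find_pure_symbol and find_unit_clause helpers from logic.py that dpll (shown below) relies on; the results are indicative, since find_pure_symbol may return any pure symbol:

s = expr('(A | B) & (~B | C) & C')
clauses = conjuncts(to_cnf(s))
find_pure_symbol(prop_symbols(s), clauses)  # e.g. (A, True): A occurs only positively
find_unit_clause(clauses, {})               # (C, True): C occurs in a clause of length 1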

In [2]:
%psource dpll_satisfiable


def dpll_satisfiable(s, branching_heuristic=no_branching_heuristic):
    """Check satisfiability of a propositional sentence.
    This differs from the book code in two ways: (1) it returns a model
    rather than True when it succeeds; this is more useful. (2) The
    function find_pure_symbol is passed a list of unknown clauses, rather
    than a list of all clauses and the model; this is more efficient.
    >>> dpll_satisfiable(A |'<=>'| B) == {A: True, B: True}
    True
    """
    return dpll(conjuncts(to_cnf(s)), prop_symbols(s), {}, branching_heuristic)

In [3]:
%psource dpll


def dpll(clauses, symbols, model, branching_heuristic=no_branching_heuristic):
    """See if the clauses are true in a partial model."""
    unknown_clauses = []  # clauses with an unknown truth value
    for c in clauses:
        val = pl_true(c, model)
        if val is False:
            return False
        if val is None:
            unknown_clauses.append(c)
    if not unknown_clauses:
        return model
    P, value = find_pure_symbol(symbols, unknown_clauses)
    if P:
        return dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic)
    P, value = find_unit_clause(clauses, model)
    if P:
        return dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic)
    P, value = branching_heuristic(symbols, unknown_clauses)
    return (dpll(clauses, remove_all(P, symbols), extend(model, P, value), branching_heuristic) or
            dpll(clauses, remove_all(P, symbols), extend(model, P, not value), branching_heuristic))
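
As a quick sanity check, here is a sketch of how the procedure is used (the exact model returned may vary with symbol ordering):

dpll_satisfiable(expr('(A | B) & (~A | C)'))  # e.g. {B: True, C: True}
dpll_satisfiable(expr('A & ~A'))              # False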

Each of the following branching heuristics is applied only after the pure literal and the unit clause rules have failed to select a splitting variable.

MOMs

MOMs heuristics are simple, efficient and easy to implement. Their goal is to prefer the literal having the Maximum number of Occurrences in the Minimum length clauses. Intuitively, the literals belonging to the minimum length clauses are the most constrained literals in the formula. Branching on them maximizes the effect of BCP and the likelihood of hitting a dead end early in the search tree (for unsatisfiable problems). Conversely, in the case of satisfiable formulas, branching on a highly constrained variable early in the tree also increases the likelihood of a correct assignment of the remaining open literals. The main disadvantage of MOMs heuristics is that their effectiveness highly depends on the problem instance. It is easy to see that the ideal setting for these heuristics is to consider the unsatisfied binary clauses.


In [4]:
%psource min_clauses


def min_clauses(clauses):
    min_len = min(map(lambda c: len(c.args), clauses), default=2)
    return filter(lambda c: len(c.args) == (min_len if min_len > 1 else 2), clauses)

In [5]:
%psource moms


def moms(symbols, clauses):
    """
    MOMS (Maximum Occurrence in clauses of Minimum Size) heuristic
    Returns the literal with the most occurrences in all clauses of minimum size
    """
    scores = Counter(l for c in min_clauses(clauses) for l in prop_symbols(c))
    return max(symbols, key=lambda symbol: scores[symbol]), True
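
Every branching heuristic below follows the same interface, a function of (symbols, clauses) returning a (symbol, value) pair, so any of them can be plugged directly into dpll_satisfiable. A sketch (the exact model returned may vary):

dpll_satisfiable(expr('(A | B) & (~A | C) & (B | ~C)'), branching_heuristic=moms)
# e.g. {B: True, A: False}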

Over the years, many types of MOMs heuristics have been proposed.

MOMSf chooses the variable $x$ maximizing the function:

$$[f(x) + f(\lnot{x})] * 2^k + f(x) * f(\lnot{x})$$

where $f(x)$ is the number of occurrences of $x$ in the smallest unknown clauses and $k$ is a parameter.


In [6]:
%psource momsf


def momsf(symbols, clauses, k=0):
    """
    MOMS alternative heuristic
    If f(x) the number of occurrences of the variable x in clauses with minimum size,
    we choose the variable maximizing [f(x) + f(-x)] * 2^k + f(x) * f(-x)
    Returns x if f(x) >= f(-x) otherwise -x
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c))
    P = max(symbols,
            key=lambda symbol: (scores[symbol] + scores[~symbol]) * pow(2, k) + scores[symbol] * scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False

Freeman’s POSIT [1] version counts both the number of positive $x$ and negative $\lnot{x}$ occurrences of a given variable $x$.


In [7]:
%psource posit


def posit(symbols, clauses):
    """
    Freeman's POSIT version of MOMs
    Counts the positive x and negative x for each variable x in clauses with minimum size
    Returns x if f(x) >= f(-x) otherwise -x
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False

Zabih and McAllester’s [2] version of the heuristic counts the negative occurrences $\lnot{x}$ of each given variable $x$.


In [8]:
%psource zm


def zm(symbols, clauses):
    """
    Zabih and McAllester's version of MOMs
    Counts the negative occurrences only of each variable x in clauses with minimum size
    """
    scores = Counter(l for c in min_clauses(clauses) for l in disjuncts(c) if l.op == '~')
    return max(symbols, key=lambda symbol: scores[~symbol]), True

DLIS & DLCS

Literal count heuristics count the number of unresolved clauses in which a given variable $x$ appears as a positive literal, $C_P$, and as a negative literal, $C_N$. These two numbers can either be considered individually or combined.

The Dynamic Largest Individual Sum (DLIS) heuristic considers the values $C_P$ and $C_N$ separately: it selects the variable with the largest individual value and assigns it true if $C_P \geq C_N$, false otherwise.


In [9]:
%psource dlis


def dlis(symbols, clauses):
    """
    DLIS (Dynamic Largest Individual Sum) heuristic
    Choose the variable and value that satisfies the maximum number of unsatisfied clauses
    Like DLCS but we only consider the literal (thus Cp and Cn are individual)
    """
    scores = Counter(l for c in clauses for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol])
    return P, True if scores[P] >= scores[~P] else False

The Dynamic Largest Combined Sum (DLCS) heuristic considers the values $C_P$ and $C_N$ combined: it selects the variable with the largest sum $C_P + C_N$ and assigns it true if $C_P \geq C_N$, false otherwise.


In [10]:
%psource dlcs


def dlcs(symbols, clauses):
    """
    DLCS (Dynamic Largest Combined Sum) heuristic
    Cp the number of clauses containing literal x
    Cn the number of clauses containing literal -x
    Here we select the variable maximizing Cp + Cn
    Returns x if Cp >= Cn otherwise -x
    """
    scores = Counter(l for c in clauses for l in disjuncts(c))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False

JW & JW2

Two branching heuristics were proposed by Jeroslow and Wang in [3].

The one-sided Jeroslow and Wang heuristic computes:

$$J(l) = \sum_{\omega \in \Phi \,:\, l \in \omega} 2^{-|\omega|}$$

and selects the assignment that satisfies the literal with the largest value $J(l)$.
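
For example, for $\Phi = (x_1 \lor x_2) \land (\lnot{x_1} \lor \lnot{x_2} \lor x_3)$ we get $J(x_1) = J(x_2) = 2^{-2}$ and $J(\lnot{x_1}) = J(\lnot{x_2}) = J(x_3) = 2^{-3}$, so the one-sided heuristic branches on $x_1$ (or $x_2$) set to true: the shorter clause weighs more.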


In [11]:
%psource jw


def jw(symbols, clauses):
    """
    Jeroslow-Wang heuristic
    For each literal compute J(l) = \sum{l in clause c} 2^{-|c|}
    Return the literal maximizing J
    """
    scores = Counter()
    for c in clauses:
        for l in prop_symbols(c):
            scores[l] += pow(2, -len(c.args))
    return max(symbols, key=lambda symbol: scores[symbol]), True

The two-sided Jeroslow and Wang’s heuristic identifies the variable $x$ with the largest sum $J(x) + J(\lnot{x})$, and assigns to $x$ value true, if $J(x) \geq J(\lnot{x})$, and value false otherwise.


In [12]:
%psource jw2


def jw2(symbols, clauses):
    """
    Two Sided Jeroslow-Wang heuristic
    Compute J(l) also counts the negation of l = J(x) + J(-x)
    Returns x if J(x) >= J(-x) otherwise -x
    """
    scores = Counter()
    for c in clauses:
        for l in disjuncts(c):
            scores[l] += pow(2, -len(c.args))
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    return P, True if scores[P] >= scores[~P] else False

CDCL with 1UIP Learning Scheme, 2WL Lazy Data Structure, VSIDS Branching Heuristic & Restarts

The Conflict-Driven Clause Learning (CDCL) solver is an evolution of the DPLL algorithm that involves a number of additional key techniques:

  • non-chronological backtracking or backjumping;
  • learning new clauses from conflicts during search by exploiting its structure;
  • using lazy data structures for storing clauses;
  • branching heuristics with low computational overhead and which receive feedback from search;
  • periodically restarting search.

The first difference between a DPLL solver and a CDCL solver is the introduction of non-chronological backtracking, or backjumping, when a conflict is identified. This requires an iterative implementation of the algorithm, because backtracking by more than one level at a time is only possible if the backtrack stack is managed explicitly.


In [4]:
%psource cdcl_satisfiable


def cdcl_satisfiable(s, vsids_decay=0.95, restart_strategy=no_restart):
    """
    >>> cdcl_satisfiable(A |'<=>'| B) == {A: True, B: True}
    True
    """
    clauses = TwoWLClauseDatabase(conjuncts(to_cnf(s)))
    symbols = prop_symbols(s)
    scores = Counter()
    G = nx.DiGraph()
    model = {}
    dl = 0
    conflicts = 0
    restarts = 1
    sum_lbd = 0
    queue_lbd = []
    while True:
        conflict = unit_propagation(clauses, symbols, model, G, dl)
        if conflict:
            if dl == 0:
                return False
            conflicts += 1
            dl, learn, lbd = conflict_analysis(G, dl)
            queue_lbd.append(lbd)
            sum_lbd += lbd
            backjump(symbols, model, G, dl)
            clauses.add(learn, model)
            scores.update(l for l in disjuncts(learn))
            for symbol in scores:
                scores[symbol] *= vsids_decay
            if restart_strategy(conflicts, restarts, queue_lbd, sum_lbd):
                backjump(symbols, model, G)
                queue_lbd.clear()
                restarts += 1
        else:
            if not symbols:
                return model
            dl += 1
            assign_decision_literal(symbols, model, scores, G, dl)

Clause Learning with 1UIP Scheme

The second important difference between a DPLL solver and a CDCL solver is that the information about a conflict is reused by learning: if a conflicting clause is found, the solver derives a new clause from the conflict and adds it to the clause database.

Whenever a conflict is identified due to unit propagation, a conflict analysis procedure is invoked. As a result, one or more new clauses are learnt, and a backtracking decision level is computed. The conflict analysis procedure analyzes the structure of unit propagation and decides which literals to include in the learnt clause. The decision levels associated with assigned variables define a partial order of the variables. Starting from a given unsatisfied clause (represented in the implication graph with vertex $\kappa$), the conflict analysis procedure visits variables implied at the most recent decision level (i.e., the current largest decision level), identifies the antecedents of visited variables, and keeps from the antecedents the literals assigned at decision levels lower than the most recent one. The clause learning procedure used in CDCL can be defined by a sequence of selective resolution operations that at each step yields a new temporary clause. This process is repeated until the most recent decision variable is visited.

The structure of implied assignments induced by unit propagation is a key aspect of the clause learning procedure. Moreover, this structure was further exploited with Unique Implication Points (UIPs). A UIP is a dominator in the implication graph and represents an alternative decision assignment at the current decision level that results in the same conflict. The main motivation for identifying UIPs is to reduce the size of learnt clauses. Clause learning could potentially stop at any UIP, and it is quite straightforward to conclude that the set of literals of a clause learnt at the first UIP has clear advantages: considering the largest decision level among the literals of the clause learnt at each UIP, the clause learnt at the first UIP is guaranteed to contain the smallest one, which guarantees the highest backtrack jump in the search tree.


In [5]:
%psource conflict_analysis


def conflict_analysis(G, dl):
    conflict_clause = next(G[p]['K']['antecedent'] for p in G.pred['K'])
    P = next(node for node in G.nodes() - 'K' if G.nodes[node]['dl'] == dl and G.in_degree(node) == 0)
    first_uip = nx.immediate_dominators(G, P)['K']
    G.remove_node('K')
    conflict_side = nx.descendants(G, first_uip)
    while True:
        for l in prop_symbols(conflict_clause).intersection(conflict_side):
            antecedent = next(G[p][l]['antecedent'] for p in G.pred[l])
            conflict_clause = pl_binary_resolution(conflict_clause, antecedent)
            # the literal block distance is calculated by taking the decision levels from variables of all
            # literals in the clause, and counting how many different decision levels were in this set
            lbd = [G.nodes[l]['dl'] for l in prop_symbols(conflict_clause)]
            if lbd.count(dl) == 1 and first_uip in prop_symbols(conflict_clause):
                return 0 if len(lbd) == 1 else heapq.nlargest(2, lbd)[-1], conflict_clause, len(set(lbd))

In [6]:
%psource pl_binary_resolution


def pl_binary_resolution(ci, cj):
    for di in disjuncts(ci):
        for dj in disjuncts(cj):
            if di == ~dj or ~di == dj:
                return pl_binary_resolution(associate('|', remove_all(di, disjuncts(ci))),
                                            associate('|', remove_all(dj, disjuncts(cj))))
    return associate('|', unique(disjuncts(ci) + disjuncts(cj)))
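
A quick sketch of the resolution step performed above: resolving on the complementary pair $B$, $\lnot{B}$ leaves the disjunction of the remaining literals.

pl_binary_resolution(expr('A | B'), expr('~B | C'))  # (A | C)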

In [7]:
%psource backjump


def backjump(symbols, model, G, dl=0):
    delete = {node for node in G.nodes() if G.nodes[node]['dl'] > dl}
    G.remove_nodes_from(delete)
    for node in delete:
        del model[node]
    symbols |= delete

2WL Lazy Data Structure

Implementation issues for SAT solvers include the design of suitable data structures for storing clauses. The chosen data structures dictate the way BCP is implemented and have a significant impact on the run time performance of the SAT solver. Recent state-of-the-art SAT solvers are characterized by very efficient data structures, intended to reduce the CPU time required per node in the search tree. Conversely, traditional SAT data structures are accurate, meaning that it is possible to know exactly the value of each literal in a clause. Examples of the most recent SAT data structures, which are not accurate and are therefore called lazy, include the watched literals used in Chaff.

The more recent Chaff SAT solver [4] proposed a new data structure, the 2 Watched Literals (2WL), in which two references are associated with each clause. There is no order relation between the two references, so each one may move in either direction; whenever a watched literal is assigned false, its reference has to move to another literal of the clause that is not assigned false. The lack of order between the two references has the key advantage that no references need to be updated when backtracking takes place, since the references do not move when variables become unassigned. The drawback is that unit and unsatisfied clauses can be identified only after traversing all of the clause's literals.


In [8]:
%psource unit_propagation


def unit_propagation(clauses, symbols, model, G, dl):
    def check(c):
        if not model or clauses.get_first_watched(c) == clauses.get_second_watched(c):
            return True
        w1, _ = inspect_literal(clauses.get_first_watched(c))
        if w1 in model:
            return c in (clauses.get_neg_watched(w1) if model[w1] else clauses.get_pos_watched(w1))
        w2, _ = inspect_literal(clauses.get_second_watched(c))
        if w2 in model:
            return c in (clauses.get_neg_watched(w2) if model[w2] else clauses.get_pos_watched(w2))

    def unit_clause(watching):
        w, p = inspect_literal(watching)
        G.add_node(w, val=p, dl=dl)
        G.add_edges_from(zip(prop_symbols(c) - {w}, itertools.cycle([w])), antecedent=c)
        symbols.remove(w)
        model[w] = p

    def conflict_clause(c):
        G.add_edges_from(zip(prop_symbols(c), itertools.cycle('K')), antecedent=c)

    while True:
        bcp = False
        for c in filter(check, clauses.get_clauses()):
            # we need only visit each clause when one of its two watched literals is assigned to 0 because, until
            # this happens, we can guarantee that there cannot be more than n-2 literals in the clause assigned to 0
            first_watched = pl_true(clauses.get_first_watched(c), model)
            second_watched = pl_true(clauses.get_second_watched(c), model)
            if first_watched is None and clauses.get_first_watched(c) == clauses.get_second_watched(c):
                unit_clause(clauses.get_first_watched(c))
                bcp = True
                break
            elif first_watched is False and second_watched is not True:
                if clauses.update_second_watched(c, model):
                    bcp = True
                else:
                    # if the only literal with a non-zero value is the other watched literal then
                    if second_watched is None:  # if it is free, then the clause is a unit clause
                        unit_clause(clauses.get_second_watched(c))
                        bcp = True
                        break
                    else:  # else (it is False) the clause is a conflict clause
                        conflict_clause(c)
                        return True
            elif second_watched is False and first_watched is not True:
                if clauses.update_first_watched(c, model):
                    bcp = True
                else:
                    # if the only literal with a non-zero value is the other watched literal then
                    if first_watched is None:  # if it is free, then the clause is a unit clause
                        unit_clause(clauses.get_first_watched(c))
                        bcp = True
                        break
                    else:  # else (it is False) the clause is a conflict clause
                        conflict_clause(c)
                        return True
        if not bcp:
            return False

In [9]:
%psource TwoWLClauseDatabase


class TwoWLClauseDatabase:

    def __init__(self, clauses):
        self.__twl = {}
        self.__watch_list = defaultdict(lambda: [set(), set()])
        for c in clauses:
            self.add(c, None)

    def get_clauses(self):
        return self.__twl.keys()

    def set_first_watched(self, clause, new_watching):
        if len(clause.args) > 2:
            self.__twl[clause][0] = new_watching

    def set_second_watched(self, clause, new_watching):
        if len(clause.args) > 2:
            self.__twl[clause][1] = new_watching

    def get_first_watched(self, clause):
        if len(clause.args) == 2:
            return clause.args[0]
        if len(clause.args) > 2:
            return self.__twl[clause][0]
        return clause

    def get_second_watched(self, clause):
        if len(clause.args) == 2:
            return clause.args[-1]
        if len(clause.args) > 2:
            return self.__twl[clause][1]
        return clause

    def get_pos_watched(self, l):
        return self.__watch_list[l][0]

    def get_neg_watched(self, l):
        return self.__watch_list[l][1]

    def add(self, clause, model):
        self.__twl[clause] = self.__assign_watching_literals(clause, model)
        w1, p1 = inspect_literal(self.get_first_watched(clause))
        w2, p2 = inspect_literal(self.get_second_watched(clause))
        self.__watch_list[w1][0].add(clause) if p1 else self.__watch_list[w1][1].add(clause)
        if w1 != w2:
            self.__watch_list[w2][0].add(clause) if p2 else self.__watch_list[w2][1].add(clause)

    def remove(self, clause):
        w1, p1 = inspect_literal(self.get_first_watched(clause))
        w2, p2 = inspect_literal(self.get_second_watched(clause))
        del self.__twl[clause]
        self.__watch_list[w1][0].discard(clause) if p1 else self.__watch_list[w1][1].discard(clause)
        if w1 != w2:
            self.__watch_list[w2][0].discard(clause) if p2 else self.__watch_list[w2][1].discard(clause)

    def update_first_watched(self, clause, model):
        # if a non-zero literal different from the other watched literal is found
        found, new_watching = self.__find_new_watching_literal(clause, self.get_first_watched(clause), model)
        if found:  # then it will replace the watched literal
            w, p = inspect_literal(self.get_second_watched(clause))
            self.__watch_list[w][0].remove(clause) if p else self.__watch_list[w][1].remove(clause)
            self.set_second_watched(clause, new_watching)
            w, p = inspect_literal(new_watching)
            self.__watch_list[w][0].add(clause) if p else self.__watch_list[w][1].add(clause)
            return True

    def update_second_watched(self, clause, model):
        # if a non-zero literal different from the other watched literal is found
        found, new_watching = self.__find_new_watching_literal(clause, self.get_second_watched(clause), model)
        if found:  # then it will replace the watched literal
            w, p = inspect_literal(self.get_first_watched(clause))
            self.__watch_list[w][0].remove(clause) if p else self.__watch_list[w][1].remove(clause)
            self.set_first_watched(clause, new_watching)
            w, p = inspect_literal(new_watching)
            self.__watch_list[w][0].add(clause) if p else self.__watch_list[w][1].add(clause)
            return True

    def __find_new_watching_literal(self, clause, other_watched, model):
        # if a non-zero literal different from the other watched literal is found
        if len(clause.args) > 2:
            for l in disjuncts(clause):
                if l != other_watched and pl_true(l, model) is not False:
                    # then it is returned
                    return True, l
        return False, None

    def __assign_watching_literals(self, clause, model=None):
        if len(clause.args) > 2:
            if model is None or not model:
                return [clause.args[0], clause.args[-1]]
            else:
                return [next(l for l in disjuncts(clause) if pl_true(l, model) is None),
                        next(l for l in disjuncts(clause) if pl_true(l, model) is False)]

VSIDS Branching Heuristic

The early branching heuristics made use of all the information available from the data structures, namely the number of satisfied, unsatisfied and unassigned literals. These counts are updated during the search and also take into account the clauses that are learnt.

More recently, a different kind of variable selection heuristic, referred to as Variable State Independent Decaying Sum (VSIDS), was proposed by the Chaff authors in [4]. One of the reasons for proposing this new heuristic was the introduction of lazy data structures, where knowledge of the dynamic size of a clause is not accurate; hence, the heuristics described above cannot be used. VSIDS selects the literal that appears most frequently over all the clauses, which means that one counter is required for each literal. Initially, all counters are set to zero. During the search, the counters only have to be updated when a new clause is recorded (in cdcl_satisfiable above, every literal of a learnt clause gets its counter bumped, and all counters are then multiplied by the decay factor vsids_decay). The motivation was not so much to develop an accurate heuristic as to design a fast, dynamically adapting one. In fact, one of the key properties of this strategy is its very low overhead, due to being independent of the variable state.


In [10]:
%psource assign_decision_literal


def assign_decision_literal(symbols, model, scores, G, dl):
    P = max(symbols, key=lambda symbol: scores[symbol] + scores[~symbol])
    value = True if scores[P] >= scores[~P] else False
    symbols.remove(P)
    model[P] = value
    G.add_node(P, val=value, dl=dl)

Restarts

Solving NP-complete problems, such as SAT, naturally leads to heavy-tailed run times. To deal with this, SAT solvers frequently restart their search to avoid runs that take disproportionately long. Restarting here means that the solver unsets all variables and starts the search using a different variable assignment order.

While at first glance it might seem that restarts should be rare and become rarer as solving goes on, so that the SAT solver can actually finish solving the problem, the trend has been towards more aggressive (frequent) restarts.

The reason why frequent restarts help solve problems faster is that, while the solver does forget all current variable assignments, it keeps some information: specifically, it keeps the learnt clauses, effectively sampling the search space, and it keeps the last assigned truth value of each variable, assigning the same value the next time the variable is picked to be assigned.

Luby

In this strategy, the number of conflicts between two restarts is based on the Luby sequence. The Luby restart sequence is interesting in that it was proven to be the optimal restart strategy for randomized search algorithms where the runs do not share information. While this is not true for SAT solving, as shown in [5] and [6], Luby restarts have been quite successful anyway.

The exact description of Luby restarts is that the $i$-th restart happens after $u \cdot Luby(i)$ conflicts, where $u$ is a constant and $Luby(i)$ is defined as:

$$Luby(i) = \begin{cases} 2^{k-1} & \text{if } i = 2^k - 1 \\ Luby(i - 2^{k-1} + 1) & \text{if } 2^{k-1} \leq i < 2^k - 1 \end{cases}$$

A less exact but more intuitive description of the Luby sequence is that all numbers in it are powers of two, and after a number is seen for the second time, the next number is twice as big. The following are the first 16 numbers in the sequence:

$$ (1,1,2,1,1,2,4,1,1,2,1,1,2,4,8,1,...) $$

From the above, we can see that this restart strategy tends towards frequent restarts, but some runs are kept running for much longer, and there is no upper limit on the longest possible time between two restarts.
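
As a quick check, a self-contained sketch of the recurrence reproduces these 16 values (the solver's own implementation, shown below, uses the same recursion):

def luby_value(i):
    # Luby(i) = 2^(k-1) if i = 2^k - 1, otherwise recurse on i - 2^(k-1) + 1
    k = 1
    while True:
        if i == (1 << k) - 1:
            return 1 << (k - 1)
        if (1 << (k - 1)) <= i < (1 << k) - 1:
            return luby_value(i - (1 << (k - 1)) + 1)
        k += 1

print([luby_value(i) for i in range(1, 17)])
# [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8, 1]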


In [9]:
%psource luby


def luby(conflicts, restarts, queue_lbd, sum_lbd, unit=512):
    # in the state-of-art tested with unit value 1, 2, 4, 6, 8, 12, 16, 32, 64, 128, 256 and 512
    def _luby(i):
        k = 1
        while True:
            if i == (1 << k) - 1:
                return 1 << (k - 1)
            elif (1 << (k - 1)) <= i < (1 << k) - 1:
                return _luby(i - (1 << (k - 1)) + 1)
            k += 1

    return unit * _luby(restarts) == len(queue_lbd)

Glucose

Glucose restarts were popularized by the Glucose solver; they are an extremely aggressive, dynamic restart strategy. The idea behind them, described in [7], is that instead of waiting for a fixed number of conflicts, we restart when the last couple of learnt clauses are, on average, bad.

A bit more precisely, if there were at least $X$ conflicts (and thus $X$ learnt clauses) since the last restart, and the average Literal Block Distance (LBD) (a criterion to evaluate the quality of learnt clauses, as shown in [8]) of the last $X$ learnt clauses was at least $K$ times higher than the average LBD of all learnt clauses, it is time for another restart. The parameters $X$ and $K$ can be tweaked to achieve different restart frequencies, and they are usually kept quite small.
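
For example, with $X = 100$ and $K = 0.7$ (the defaults in the implementation below), if the average LBD of the last 100 learnt clauses is 5 while the average LBD of all learnt clauses is 3, then $5 \cdot 0.7 = 3.5 > 3$ and a restart is triggered.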


In [10]:
%psource glucose


def glucose(conflicts, restarts, queue_lbd, sum_lbd, x=100, k=0.7):
    # in the state-of-art tested with (x, k) as (50, 0.8) and (100, 0.7)
    # if there were at least x conflicts since the last restart, and then the average LBD of the last
    # x learnt clauses was at least k times higher than the average LBD of all learnt clauses
    return len(queue_lbd) >= x and sum(queue_lbd) / len(queue_lbd) * k > sum_lbd / conflicts

Experimental Results


In [22]:
from csp import *

Australia

CSP


In [23]:
australia_csp = MapColoringCSP(list('RGB'), """SA: WA NT Q NSW V; NT: WA Q; NSW: Q V; T: """)

In [24]:
%time _, checks = AC3b(australia_csp, arc_heuristic=dom_j_up)
f'AC3b with DOM J UP needs {checks} consistency-checks'


CPU times: user 154 µs, sys: 37 µs, total: 191 µs
Wall time: 194 µs
Out[24]:
'AC3b with DOM J UP needs 72 consistency-checks'

In [25]:
%time backtracking_search(australia_csp, select_unassigned_variable=mrv, inference=forward_checking)


CPU times: user 263 µs, sys: 0 ns, total: 263 µs
Wall time: 268 µs
Out[25]:
{'Q': 'R', 'SA': 'G', 'NSW': 'B', 'NT': 'B', 'V': 'R', 'WA': 'R'}

SAT


In [2]:
australia_sat = MapColoringSAT(list('RGB'), """SA: WA NT Q NSW V; NT: WA Q; NSW: Q V; T: """)
DPLL

In [3]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=no_branching_heuristic)


CPU times: user 43.3 ms, sys: 0 ns, total: 43.3 ms
Wall time: 41.5 ms

In [28]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=moms)


CPU times: user 36.4 ms, sys: 0 ns, total: 36.4 ms
Wall time: 35.3 ms

In [29]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=momsf)


CPU times: user 36.1 ms, sys: 3.9 ms, total: 40 ms
Wall time: 39.2 ms

In [30]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=posit)


CPU times: user 45.2 ms, sys: 0 ns, total: 45.2 ms
Wall time: 44.2 ms

In [31]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=zm)


CPU times: user 31.2 ms, sys: 0 ns, total: 31.2 ms
Wall time: 30.5 ms

In [32]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=dlis)


CPU times: user 57 ms, sys: 0 ns, total: 57 ms
Wall time: 55.9 ms

In [33]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=dlcs)


CPU times: user 51.8 ms, sys: 0 ns, total: 51.8 ms
Wall time: 50.7 ms

In [34]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=jw)


CPU times: user 40.6 ms, sys: 0 ns, total: 40.6 ms
Wall time: 39.3 ms

In [35]:
%time model = dpll_satisfiable(australia_sat, branching_heuristic=jw2)


CPU times: user 43.2 ms, sys: 1.81 ms, total: 45.1 ms
Wall time: 43.9 ms
CDCL

In [36]:
%time model = cdcl_satisfiable(australia_sat)


CPU times: user 32.9 ms, sys: 16 µs, total: 33 ms
Wall time: 31.6 ms

In [37]:
{var for var, val in model.items() if val}


Out[37]:
{NSW_B, NT_B, Q_G, SA_R, V_G, WA_G}

France

CSP


In [38]:
france_csp = MapColoringCSP(list('RGBY'),
                            """AL: LO FC; AQ: MP LI PC; AU: LI CE BO RA LR MP; BO: CE IF CA FC RA
                            AU; BR: NB PL; CA: IF PI LO FC BO; CE: PL NB NH IF BO AU LI PC; FC: BO
                            CA LO AL RA; IF: NH PI CA BO CE; LI: PC CE AU MP AQ; LO: CA AL FC; LR:
                            MP AU RA PA; MP: AQ LI AU LR; NB: NH CE PL BR; NH: PI IF CE NB; NO:
                            PI; PA: LR RA; PC: PL CE LI AQ; PI: NH NO CA IF; PL: BR NB CE PC; RA:
                            AU BO FC PA LR""")

In [39]:
%time _, checks = AC3b(france_csp, arc_heuristic=dom_j_up)
f'AC3b with DOM J UP needs {checks} consistency-checks'


CPU times: user 599 µs, sys: 112 µs, total: 711 µs
Wall time: 716 µs
Out[39]:
'AC3b with DOM J UP needs 516 consistency-checks'

In [40]:
%time backtracking_search(france_csp, select_unassigned_variable=mrv, inference=forward_checking)


CPU times: user 560 µs, sys: 0 ns, total: 560 µs
Wall time: 563 µs
Out[40]:
{'NH': 'R',
 'NB': 'G',
 'CE': 'B',
 'PL': 'R',
 'BR': 'B',
 'IF': 'G',
 'PI': 'B',
 'BO': 'R',
 'CA': 'Y',
 'FC': 'G',
 'LO': 'R',
 'PC': 'G',
 'AU': 'G',
 'AL': 'B',
 'RA': 'B',
 'LR': 'R',
 'LI': 'R',
 'AQ': 'B',
 'MP': 'Y',
 'PA': 'G',
 'NO': 'R'}

SAT


In [41]:
france_sat = MapColoringSAT(list('RGBY'),
                            """AL: LO FC; AQ: MP LI PC; AU: LI CE BO RA LR MP; BO: CE IF CA FC RA
                            AU; BR: NB PL; CA: IF PI LO FC BO; CE: PL NB NH IF BO AU LI PC; FC: BO
                            CA LO AL RA; IF: NH PI CA BO CE; LI: PC CE AU MP AQ; LO: CA AL FC; LR:
                            MP AU RA PA; MP: AQ LI AU LR; NB: NH CE PL BR; NH: PI IF CE NB; NO:
                            PI; PA: LR RA; PC: PL CE LI AQ; PI: NH NO CA IF; PL: BR NB CE PC; RA:
                            AU BO FC PA LR""")
DPLL

In [42]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=no_branching_heuristic)


CPU times: user 3.32 s, sys: 0 ns, total: 3.32 s
Wall time: 3.32 s

In [43]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=moms)


CPU times: user 3.17 s, sys: 390 µs, total: 3.17 s
Wall time: 3.17 s

In [44]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=momsf)


CPU times: user 3.49 s, sys: 0 ns, total: 3.49 s
Wall time: 3.49 s

In [45]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=posit)


CPU times: user 3.5 s, sys: 0 ns, total: 3.5 s
Wall time: 3.5 s

In [46]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=zm)


CPU times: user 3 s, sys: 2.6 ms, total: 3.01 s
Wall time: 3.01 s

In [47]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=dlis)


CPU times: user 12.5 s, sys: 11.4 ms, total: 12.5 s
Wall time: 12.5 s

In [48]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=dlcs)


CPU times: user 3.41 s, sys: 0 ns, total: 3.41 s
Wall time: 3.41 s

In [49]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=jw)


CPU times: user 2.92 s, sys: 3.89 ms, total: 2.92 s
Wall time: 2.92 s

In [50]:
%time model = dpll_satisfiable(france_sat, branching_heuristic=jw2)


CPU times: user 3.71 s, sys: 0 ns, total: 3.71 s
Wall time: 3.73 s
CDCL

In [51]:
%time model = cdcl_satisfiable(france_sat)


CPU times: user 159 ms, sys: 3.94 ms, total: 163 ms
Wall time: 162 ms

In [52]:
{var for var, val in model.items() if val}


Out[52]:
{AL_G,
 AQ_G,
 AU_R,
 BO_G,
 BR_Y,
 CA_R,
 CE_B,
 FC_B,
 IF_Y,
 LI_Y,
 LO_Y,
 LR_G,
 MP_B,
 NB_R,
 NH_G,
 NO_Y,
 PA_B,
 PC_R,
 PI_B,
 PL_G,
 RA_Y}

USA

CSP


In [53]:
usa_csp = MapColoringCSP(list('RGBY'),
                         """WA: OR ID; OR: ID NV CA; CA: NV AZ; NV: ID UT AZ; ID: MT WY UT;
                         UT: WY CO AZ; MT: ND SD WY; WY: SD NE CO; CO: NE KA OK NM; NM: OK TX AZ;
                         ND: MN SD; SD: MN IA NE; NE: IA MO KA; KA: MO OK; OK: MO AR TX;
                         TX: AR LA; MN: WI IA; IA: WI IL MO; MO: IL KY TN AR; AR: MS TN LA;
                         LA: MS; WI: MI IL; IL: IN KY; IN: OH KY; MS: TN AL; AL: TN GA FL;
                         MI: OH IN; OH: PA WV KY; KY: WV VA TN; TN: VA NC GA; GA: NC SC FL;
                         PA: NY NJ DE MD WV; WV: MD VA; VA: MD DC NC; NC: SC; NY: VT MA CT NJ;
                         NJ: DE; DE: MD; MD: DC; VT: NH MA; MA: NH RI CT; CT: RI; ME: NH;
                         HI: ; AK: """)

In [54]:
%time _, checks = AC3b(usa_csp, arc_heuristic=dom_j_up)
f'AC3b with DOM J UP needs {checks} consistency-checks'


CPU times: user 1.58 ms, sys: 17 µs, total: 1.6 ms
Wall time: 1.6 ms
Out[54]:
'AC3b with DOM J UP needs 1284 consistency-checks'

In [55]:
%time backtracking_search(usa_csp, select_unassigned_variable=mrv, inference=forward_checking)


CPU times: user 2.15 ms, sys: 0 ns, total: 2.15 ms
Wall time: 2.15 ms
Out[55]:
{'NM': 'R',
 'TX': 'G',
 'OK': 'B',
 'AR': 'R',
 'MO': 'G',
 'KA': 'R',
 'LA': 'B',
 'NE': 'B',
 'TN': 'B',
 'MS': 'G',
 'IA': 'R',
 'SD': 'G',
 'IL': 'B',
 'CO': 'G',
 'MN': 'B',
 'KY': 'R',
 'AL': 'R',
 'GA': 'G',
 'FL': 'B',
 'VA': 'G',
 'WI': 'G',
 'IN': 'G',
 'NC': 'R',
 'WV': 'B',
 'OH': 'Y',
 'PA': 'R',
 'MD': 'Y',
 'SC': 'B',
 'MI': 'R',
 'DC': 'R',
 'DE': 'G',
 'WY': 'R',
 'ND': 'R',
 'NJ': 'B',
 'NY': 'G',
 'UT': 'B',
 'AZ': 'G',
 'ID': 'G',
 'MT': 'B',
 'NV': 'R',
 'CA': 'B',
 'OR': 'Y',
 'WA': 'R',
 'VT': 'R',
 'MA': 'B',
 'NH': 'G',
 'CT': 'R',
 'RI': 'G',
 'ME': 'R'}

SAT


In [56]:
usa_sat = MapColoringSAT(list('RGBY'),
                         """WA: OR ID; OR: ID NV CA; CA: NV AZ; NV: ID UT AZ; ID: MT WY UT;
                         UT: WY CO AZ; MT: ND SD WY; WY: SD NE CO; CO: NE KA OK NM; NM: OK TX AZ;
                         ND: MN SD; SD: MN IA NE; NE: IA MO KA; KA: MO OK; OK: MO AR TX;
                         TX: AR LA; MN: WI IA; IA: WI IL MO; MO: IL KY TN AR; AR: MS TN LA;
                         LA: MS; WI: MI IL; IL: IN KY; IN: OH KY; MS: TN AL; AL: TN GA FL;
                         MI: OH IN; OH: PA WV KY; KY: WV VA TN; TN: VA NC GA; GA: NC SC FL;
                         PA: NY NJ DE MD WV; WV: MD VA; VA: MD DC NC; NC: SC; NY: VT MA CT NJ;
                         NJ: DE; DE: MD; MD: DC; VT: NH MA; MA: NH RI CT; CT: RI; ME: NH;
                         HI: ; AK: """)
DPLL

In [57]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=no_branching_heuristic)


CPU times: user 46.2 s, sys: 0 ns, total: 46.2 s
Wall time: 46.2 s

In [58]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=moms)


CPU times: user 54.6 s, sys: 0 ns, total: 54.6 s
Wall time: 54.6 s

In [59]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=momsf)


CPU times: user 44 s, sys: 0 ns, total: 44 s
Wall time: 44 s

In [60]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=posit)


CPU times: user 43.8 s, sys: 0 ns, total: 43.8 s
Wall time: 43.8 s

In [61]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=zm)


CPU times: user 52.6 s, sys: 0 ns, total: 52.6 s
Wall time: 52.6 s

In [62]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=dlis)


CPU times: user 57 s, sys: 0 ns, total: 57 s
Wall time: 57 s

In [63]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=dlcs)


CPU times: user 43.8 s, sys: 0 ns, total: 43.8 s
Wall time: 43.8 s

In [64]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=jw)


CPU times: user 53.3 s, sys: 3.82 ms, total: 53.3 s
Wall time: 53.3 s

In [65]:
%time model = dpll_satisfiable(usa_sat, branching_heuristic=jw2)


CPU times: user 44 s, sys: 3.99 ms, total: 44 s
Wall time: 44 s
CDCL

In [66]:
%time model = cdcl_satisfiable(usa_sat)


CPU times: user 559 ms, sys: 0 ns, total: 559 ms
Wall time: 558 ms

In [67]:
{var for var, val in model.items() if val}


Out[67]:
{AL_B,
 AR_B,
 AZ_R,
 CA_B,
 CO_R,
 CT_Y,
 DC_G,
 DE_Y,
 FL_Y,
 GA_R,
 IA_B,
 ID_Y,
 IL_G,
 IN_R,
 KA_G,
 KY_B,
 LA_G,
 MA_G,
 MD_R,
 ME_G,
 MI_G,
 MN_Y,
 MO_R,
 MS_Y,
 MT_B,
 NC_B,
 ND_G,
 NE_Y,
 NH_Y,
 NJ_G,
 NM_G,
 NV_G,
 NY_R,
 OH_Y,
 OK_Y,
 OR_R,
 PA_B,
 RI_B,
 SC_Y,
 SD_R,
 TN_G,
 TX_R,
 UT_B,
 VA_Y,
 VT_B,
 WA_B,
 WI_R,
 WV_G,
 WY_G}

Zebra Puzzle

CSP


In [76]:
zebra_csp = Zebra()

In [77]:
zebra_csp.display(zebra_csp.infer_assignment())


{'Milk': 3, 'Norwegian': 1}

In [78]:
%time _, checks = AC3b(zebra_csp, arc_heuristic=dom_j_up)
f'AC3b with DOM J UP needs {checks} consistency-checks'


CPU times: user 2.04 ms, sys: 4 µs, total: 2.05 ms
Wall time: 2.05 ms
Out[78]:
'AC3b with DOM J UP needs 737 consistency-checks'

In [71]:
zebra_csp.display(zebra_csp.infer_assignment())


{'Blue': 2, 'Milk': 3, 'Norwegian': 1}

In [72]:
%time backtracking_search(zebra_csp, select_unassigned_variable=mrv, inference=forward_checking)


CPU times: user 2.13 ms, sys: 0 ns, total: 2.13 ms
Wall time: 2.14 ms
Out[72]:
{'Milk': 3,
 'Blue': 2,
 'Norwegian': 1,
 'Coffee': 5,
 'Green': 5,
 'Ivory': 4,
 'Red': 3,
 'Yellow': 1,
 'Kools': 1,
 'Englishman': 3,
 'Horse': 2,
 'Tea': 2,
 'Ukranian': 2,
 'Spaniard': 4,
 'Dog': 4,
 'Japanese': 5,
 'Parliaments': 5,
 'LuckyStrike': 4,
 'OJ': 4,
 'Water': 1,
 'Chesterfields': 2,
 'Winston': 3,
 'Snails': 3,
 'Fox': 1,
 'Zebra': 5}

SAT


In [5]:
zebra_sat = associate('&', map(to_cnf, map(expr, filter(lambda line: line[0] not in ('c', 'p'), open('aima-data/zebra.cnf').read().splitlines()))))
DPLL

In [14]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=no_branching_heuristic)


CPU times: user 13min 6s, sys: 2.44 ms, total: 13min 6s
Wall time: 13min 6s

In [16]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=moms)


CPU times: user 15min 4s, sys: 22.4 ms, total: 15min 4s
Wall time: 15min 4s

In [17]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=momsf)


CPU times: user 22min 28s, sys: 40 ms, total: 22min 28s
Wall time: 22min 28s

In [18]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=posit)


CPU times: user 22min 25s, sys: 36 ms, total: 22min 25s
Wall time: 22min 25s

In [19]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=zm)


CPU times: user 14min 52s, sys: 32 ms, total: 14min 52s
Wall time: 14min 52s

In [19]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=dlis)


CPU times: user 2min 31s, sys: 9.87 ms, total: 2min 31s
Wall time: 2min 32s

In [21]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=dlcs)


CPU times: user 4min 27s, sys: 12 ms, total: 4min 27s
Wall time: 4min 27s

In [20]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=jw)


CPU times: user 6min 55s, sys: 39.2 ms, total: 6min 55s
Wall time: 6min 56s

In [75]:
%time model = dpll_satisfiable(zebra_sat, branching_heuristic=jw2)


CPU times: user 8min 57s, sys: 7.94 ms, total: 8min 57s
Wall time: 8min 57s
CDCL

In [7]:
%time model = cdcl_satisfiable(zebra_sat)


CPU times: user 1.64 s, sys: 0 ns, total: 1.64 s
Wall time: 1.64 s

In [8]:
{var for var, val in model.items() if val and var.op.startswith(('Englishman', 'Japanese', 'Norwegian', 'Spaniard', 'Ukrainian'))}


Out[8]:
{Englishman_house2,
 Englishman_milk,
 Englishman_oldGold,
 Englishman_redHouse,
 Englishman_snails,
 Japanese_coffee,
 Japanese_greenHouse,
 Japanese_house4,
 Japanese_parliament,
 Japanese_zebra,
 Norwegian_fox,
 Norwegian_house0,
 Norwegian_kool,
 Norwegian_water,
 Norwegian_yellowHouse,
 Spaniard_dog,
 Spaniard_house3,
 Spaniard_ivoryHouse,
 Spaniard_luckyStrike,
 Spaniard_orangeJuice,
 Ukrainian_blueHouse,
 Ukrainian_chesterfield,
 Ukrainian_horse,
 Ukrainian_house1,
 Ukrainian_tea}

References

[[1]](#ref-1) Freeman, Jon William. 1995. Improvements to propositional satisfiability search algorithms.

[[2]](#ref-2) Zabih, Ramin and McAllester, David A. 1988. A Rearrangement Search Strategy for Determining Propositional Satisfiability.

[[3]](#ref-3) Jeroslow, Robert G and Wang, Jinchang. 1990. Solving propositional satisfiability problems.

[[4]](#ref-4) Moskewicz, Matthew W and Madigan, Conor F and Zhao, Ying and Zhang, Lintao and Malik, Sharad. 2001. Chaff: Engineering an efficient SAT solver.

[[5]](#ref-5) Haim, Shai and Heule, Marijn. 2014. Towards ultra rapid restarts.

[[6]](#ref-6) Huang, Jinbo and others. 2007. The Effect of Restarts on the Efficiency of Clause Learning.

[[7]](#ref-7) Audemard, Gilles and Simon, Laurent. 2012. Refining restarts strategies for SAT and UNSAT.

[[8]](#ref-8) Audemard, Gilles and Simon, Laurent. 2009. Predicting learnt clauses quality in modern SAT solvers.