The Czech-Slovak Olympiad problem: Exploring steps.

We visit the problem once more. This time we are looking for whether things work following heuristics implemented by hand. We will also be more explicit in naming stuff. We will also have a few more steps, which are flips of the earlier steps. In some cases we only consider the flips. This time we will not try to generate statements of lemmas, as this was (barely) successful last time.

The problem

Let ⋆ be a binary operation on a nonempty set $M$. That is, every pair $(a,b) \in M$ is assigned an element $a$ ⋆$ b$ in $M$. Suppose that ⋆ has the additional property that $(a $ ⋆ $b) $ ⋆$ b= a$ and $a$ ⋆ $(a$ ⋆$ b)= b$ for all $a,b \in M$. Show that $a$ ⋆ $b = b$ ⋆ $a$ for all $a,b \in M$.

We derive the following results.

  1. $ m * n = n * m $ (the final claim; by symmetry from 8)
  2. $ (m * n) * n = m $ (instantiation)
  3. $ (m * n)*((m * n) * n) = n$ (instantiation)
  4. $ n = (m * n)*((m * n)* n)$ (symmetry from 3)
  5. $ (m * n)*((m * n)* n) = (m * n)* m $ (left multiplication by $m * n$ of 2
  6. $ ((m * n)* m))* m = m * n$ (instantiation)
  7. $ n = (m * n)* m$ (transitivity from 4 and 5)
  8. $ n * m = ((m * n)* m)* m $ (right multiplication from 7)
  9. $ n * m = m * n $ (transitivity from 8 and 6)

In this notebook we do not consider selection of the correct lemmas, but just the generation.


In [1]:
import $cp.bin.`provingground-core-jvm-a09844e8fb.fat.jar`
import provingground._ , interface._, HoTT._, learning._ 
repl.pprinter() = {
  val p = repl.pprinter()
  p.copy(
    additionalHandlers = p.additionalHandlers.orElse {
      translation.FansiShow.fansiHandler
    }
  )
}


Out[1]:
import $cp.$                                              

import provingground._ , interface._, HoTT._, learning._ 

In [2]:
val M = "M" :: Type

val eqM = "eqM" :: M ->: M ->: Type

val a = "a" :: M
val b = "b" :: M
val c = "c" :: M

val m = "m" :: M

val n = "n" :: M

val mul = "mul" :: M ->: M ->: M


Out[2]:
M: Typ[Term] = M
eqM: Func[Term, Func[Term, Typ[Term]]] = eqM
a: Term = a
b: Term = b
c: Term = c
m: Term = m
n: Term = n
mul: Func[Term, Func[Term, Term]] = mul

We first make all the statements.


In [3]:
val results = Vector(
    eqM(mul(m)(n))(mul(n)(m)),
    eqM(mul(mul(m)(n))(n))(m),
    eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n),
    eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n))),
    eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m)),
    eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)),
    eqM(n)(mul(mul(m)(n))(m)),
    eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)),
    eqM(mul(n)(m))(mul(m)(n))
)


Out[3]:
results: Vector[Typ[Term]] = Vector(
  eqM(mul(m)(n))(mul(n)(m)),
  eqM(mul(mul(m)(n))(n))(m),
  eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n),
  eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n))),
  eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m)),
  eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)),
  eqM(n)(mul(mul(m)(n))(m)),
  eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)),
  eqM(mul(n)(m))(mul(m)(n))
)

We shall now introduce all the axioms.


In [4]:
val refl = "refl" :: a ~>: (eqM(a)(a))

val sym = "sym" :: a ~>: (b ~>: (eqM(a)(b) ->: eqM(b)(a)))

val trans =
    "trans" :: a ~>:
      (b ~>: (c ~>: ((eqM(a)(b)) ->: (eqM(b)(c)) ->: (eqM(a)(c)))))

val leftMul = "left-multiply" :: a ~>: (b ~>: (c ~>: (eqM(b)(c) ->: eqM(mul(a)(b))(mul(a)(c)))))
val rightMul = "right-multiply" :: a ~>: (b ~>: (c ~>: (eqM(b)(c) ->: eqM(mul(b)(a))(mul(c)(a)))))

val ass1 = "ass1" :: a ~>: (b ~>: eqM(mul(mul(a)(b))(b))(a))
val ass2 = "ass2" :: a ~>: (b ~>: eqM(mul(a)(mul(a)(b)))(b))


Out[4]:
refl: FuncLike[Term, Term] = refl
sym: FuncLike[Term, FuncLike[Term, Func[Term, Term]]] = sym
trans: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Func[Term, Term]]]]] = trans
leftMul: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Term]]]] = left-multiply
rightMul: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Term]]]] = right-multiply
ass1: FuncLike[Term, FuncLike[Term, Term]] = ass1
ass2: FuncLike[Term, FuncLike[Term, Term]] = ass2

Formal proving

Before considering proof discovery, we look for proof representation.


In [5]:
val pf1 = ass1(m)(n)
pf1.typ


Out[5]:
pf1: Term = ass1(m)(n)
res4_1: Typ[U] = eqM(mul(mul(m)(n))(n))(m)

In [6]:
pf1.typ == results(1)


Out[6]:
res5: Boolean = true

In [7]:
val mn = mul(m)(n)
val pf2 = ass2(mn)(n)


Out[7]:
mn: Term = mul(m)(n)
pf2: Term = ass2(mul(m)(n))(n)

In [8]:
pf2.typ


Out[8]:
res7: Typ[U] = eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)

In [9]:
pf2.typ == results(2)


Out[9]:
res8: Boolean = true

In [10]:
Unify.appln(sym, pf2)


Out[10]:
res9: Option[Term] = Some(
  sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n))
)

In [11]:
val pf3 = Unify.appln(sym, pf2).get
pf3.typ


Out[11]:
pf3: Term = sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n))
res10_1: Typ[U] = eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))

In [12]:
pf3.typ == results(3)


Out[12]:
res11: Boolean = true

In [13]:
Unify.appln(leftMul(mn), pf1)


Out[13]:
res12: Option[Term] = Some(
  left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))
)

In [14]:
val pf4 = Unify.appln(leftMul(mn), pf1).get
pf4.typ
pf4.typ == results(4)


Out[14]:
pf4: Term = left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))
res13_1: Typ[U] = eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))
res13_2: Boolean = true

In [15]:
val pf5 = ass1(mn)(m)
pf5.typ
pf5.typ == results(5)


Out[15]:
pf5: Term = ass1(mul(m)(n))(m)
res14_1: Typ[U] = eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))
res14_2: Boolean = true

In [16]:
Unify.appln(trans, pf3)


Out[16]:
res15: Option[Term] = Some(
  ($inr : M) ↦ trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))($inr)(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))
)

In [17]:
val step6 = Unify.appln(trans, pf3).get
Unify.appln(step6, pf4)


Out[17]:
step6: Term = ($lzd : M) ↦ trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))($lzd)(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))
res16_1: Option[Term] = Some(
  trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))
)

In [18]:
val pf6 = Unify.appln(step6, pf4).get
pf6.typ
pf6.typ == results(6)


Out[18]:
pf6: Term = trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))
res17_1: Typ[U] = eqM(n)(mul(mul(m)(n))(m))
res17_2: Boolean = true

In [19]:
val pf7 = Unify.appln(rightMul(m),pf6).get
pf7.typ
pf7.typ == results(7)


Out[19]:
pf7: Term = right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))))
res18_1: Typ[U] = eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))
res18_2: Boolean = true

In [20]:
val step8 = Unify.appln(trans, pf7).get


Out[20]:
step8: Term = ($zde : M) ↦ trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))($zde)(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))

In [21]:
val pf8 = Unify.appln(step8, pf5).get


Out[21]:
pf8: Term = trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))(ass1(mul(m)(n))(m))

In [22]:
pf8.typ
pf8.typ == results(8)


Out[22]:
res21_0: Typ[U] = eqM(mul(n)(m))(mul(m)(n))
res21_1: Boolean = true

In [23]:
val pf = Unify.appln(sym, pf8).get
pf.typ
pf.typ == results(0)


Out[23]:
pf: Term = sym(mul(n)(m))(mul(m)(n))(trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))(ass1(mul(m)(n))(m)))
res22_1: Typ[U] = eqM(mul(m)(n))(mul(n)(m))
res22_2: Boolean = true

Conclusions

  • We have successfully formalized the proof.
  • We will now try to generate the steps, ruthlessly narrowing.

In [24]:
val lp1 = LocalProver(TermState(FiniteDistribution.unif(ass1, ass2, m, n, mn), FiniteDistribution.empty)).noIsles.sharpen(10)


Out[24]:
lp1: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(ass1, 0.2),
        Weighted(mul(m)(n), 0.2),
        Weighted(ass2, 0.2),
        Weighted(m, 0.2),
        Weighted(n, 0.2)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
    0.0,
    OrElse(
      OrElse(OrElse(OrElse(<function1>, <function1>), <function1>), <function1>),
...

In [25]:
val terms1T = lp1.nextState.map(_.terms).memoize
val r1 = terms1T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize


Out[25]:
terms1T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r1: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)

In [26]:
import monix.execution.Scheduler.Implicits.global
terms1T.runToFuture



In [28]:
val f1 = r1.runToFuture


Interim conclusion

We see that all the instantiation lemmas have been obtained.


In [35]:
val lp2 = LocalProver(TermState(FiniteDistribution.unif(pf1, pf2, sym, leftMul, m, n, mn), FiniteDistribution.empty)).noIsles.sharpen(10)


Out[35]:
lp2: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(m, 0.14285714285714285),
        Weighted(sym, 0.14285714285714285),
        Weighted(mul(m)(n), 0.14285714285714285),
        Weighted(left-multiply, 0.14285714285714285),
        Weighted(ass2(mul(m)(n))(n), 0.14285714285714285),
        Weighted(n, 0.14285714285714285),
        Weighted(ass1(m)(n), 0.14285714285714285)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
    0.0,
    OrElse(
...

In [36]:
val terms2T = lp2.nextState.map(_.terms).memoize
val r2 = terms2T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize


Out[36]:
terms2T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r2: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)

In [37]:
terms2T.runToFuture



In [38]:
// val r2 = terms2T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
val f2 = r2.runToFuture


The consequences 3 and 4 were found with moderately focussed generation. We next try using transitivity. This will be very focussed but with a lower cutoff.


In [39]:
val lp3 = LocalProver(TermState(FiniteDistribution.unif(pf3, pf4, trans), FiniteDistribution.empty)).noIsles


Out[39]:
lp3: LocalProver = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(
          sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)),
          0.3333333333333333
        ),
        Weighted(
          left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)),
          0.3333333333333333
        ),
        Weighted(trans, 0.3333333333333333)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
...

In [40]:
val terms3T = lp3.nextState.map(_.terms).memoize
val r3 = terms3T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize


Out[40]:
terms3T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r3: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)

In [41]:
terms3T.runToFuture



In [42]:
val f3 = r3.runToFuture


This worked, with a very narrow focus but quickly. We now aim for Lemma 7, which is again right multiplication.


In [43]:
val lp4 = LocalProver(TermState(FiniteDistribution.unif(pf6, rightMul, m, n, mn), FiniteDistribution.empty)).noIsles
val terms4T = lp4.nextState.map(_.terms).memoize
val r4 = terms4T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize


Out[43]:
lp4: LocalProver = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(right-multiply, 0.2),
        Weighted(mul(m)(n), 0.2),
        Weighted(m, 0.2),
        Weighted(
          trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))),
          0.2
        ),
        Weighted(n, 0.2)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
...
terms4T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r4: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)

In [44]:
terms4T.runToFuture



In [45]:
val f4 = r4.runToFuture



In [46]:
f4.value


Out[46]:
res45: Option[scala.util.Try[Vector[(Typ[Term], Int)]]] = Some(
  Success(
    Vector(
      (eqM(n)(mul(mul(m)(n))(m)), 6),
      (eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)), 7)
    )
  )
)

The Lemma 7 was obtained with a fairly focus, but with the default cutoff. We now look for Lemma 8 and the Theorem. It will probably take two steps.


In [47]:
val lp5 = LocalProver(TermState(FiniteDistribution.unif(pf5, pf7, sym, trans), FiniteDistribution.empty)).noIsles.sharpen(10)
val terms5T = lp5.nextState.map(_.terms).memoize
val r5 = terms5T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize


Out[47]:
lp5: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(ass1(mul(m)(n))(m), 0.25),
        Weighted(
          right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))),
          0.25
        ),
        Weighted(sym, 0.25),
        Weighted(trans, 0.25)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
...
terms5T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r5: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)

In [48]:
terms5T.runToFuture



In [49]:
val f5 = r5.runToFuture



In [50]:
f5.value


Out[50]:
res49: Option[scala.util.Try[Vector[(Typ[Term], Int)]]] = Some(
  Success(
    Vector(
      (eqM(mul(m)(n))(mul(n)(m)), 0),
      (eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)), 5),
      (eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)), 7),
      (eqM(mul(n)(m))(mul(m)(n)), 8)
    )
  )
)

We got the theorem with symmetry and transitivity working together.

Conclusions

  • We could generate all the lemmas and the final theorem.
  • While we did focus, it was not too gross.