In [1]:
# Copyright (c) Thalesians Ltd, 2018-2019. All rights reserved
# Copyright (c) Paul Alexander Bilokon, 2018-2019. All rights reserved
# Author: Paul Alexander Bilokon <paul@thalesians.com>
# Version: 1.0 (2018.07.22)
# Email: education@thalesians.com
# Platform: Tested on Windows 10 with Python 3.6

In [2]:
%matplotlib inline

Set theory and proofs

Mathematics, as it is taught at school, is usually taught by example. Pupils are asked to learn the various mathematical "laws" by rote. For example, we all know from school that $$3 + 5 = 8$$ and $$\frac{d}{dx} x^2 = 2x.$$ We rarely pause and ask, why is this the case, and what are we really doing?

In fact, a lot is going on in these equations. For example, we would have to spend a lot of time explaining what $dx$ "really" is in $$\frac{d}{dx} x^2 = 2x.$$ We would either have to introduce the notion of an infinitesimal or resort to Silvanus Thompson's explanation from Calculus Made Easy (1910):

$dx$ means a little bit of $x$.

This explanation may be sufficient if we want to do mathematics rather than understand it. Thompson uses the Ancient Simian Proverb as an epigraph to his work:

What one fool can do, another can.

What are we doing, what is mathematics anyway? "Mathematics" is not a "-logy", unlike many of the sciences. The word itself comes from the Ancient Greek μαθηματικός (mathēmatikós, "fond of learning"). Thus mathematics is... a form of learning? This definition is as abstract as mathematics itself...

It was probably G. H. Hardy in A Mathematician's Apology (1940) who defined mathematics as the "study of patterns":

A mathematician, like a painter or a poet, is a maker of patterns. If his patterns are more permanent than theirs, it is because they are made with ideas.

Lynn Arthur Steen seconds him in The Science of Patterns (Science, 1988):

Mathematics is often defined as the science of space and number, as the discipline rooted in geometry and arithmetic. Although the diversity of modern mathematics has always exceeded this definition, it was not until the recent resonance of computers and mathematics that a more apt definition became fully evident. Mathematics is the science of patterns. The mathematician seeks patterns in number, in space, in science, in computers, and in imagination.

According to this definition, mathematics is a study of patterns. And patterns are more general than numbers. Therefore, a number cannot be the most basic "unit" of study in mathematics. What is this most basic "unit"? What abstraction could we introduce to study patterns in all their various forms, including numbers?

In 1874, Georg Cantor introduced just such an abstraction in On a Property of the Collection of All Real Algebraic Numbers: the set. It has proved to be a very fruitful abstraction, allowing us, among other things, to formalize the notion of infinity.

Sets

A set is (arguably) the most fundamental object in mathematics. It is a collection of distinct objects. For example, we could talk about the set of numbers from one up to ten, inclusive. We could give this set a name, say, $S$, and write it like so: $$S = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}.$$ We refer to objects in a particular set as its elements. We have just defined a particular set by listing its elements. Thus 3 is an element of $S$, and we write $$3 \in S.$$ On the other hand, 11 is not an element of $S$, and we write $$11 \notin S.$$ Neither, in fact, is Joe: $$\text{Joe} \notin S.$$

Finite and infinite sets

This particular set, $$S = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\},$$ is finite. Consider the set of all natural numbers: $$\mathbb{N} = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, ...\}.$$ There is no way to list all of its elements, of course, but at least, if we had an infinite amount of time, we could enumerate (list) them: "one", "two", "three", and so on. We call such sets countably infinite or denumerable.

Equivalence of sets

Two sets, $A$ and $B$ are said to be equivalent or equinumerous (we write $A \sim B$) if a one-to-one correspondence can be set up between all their elements. For example, the sets $A = \{1, 2, 3\}$ and $B = \{a, b, c\}$ are equivalent: $$ 1 \mapsto a, \\ 2 \mapsto b, \\ 3 \mapsto c.$$ This is not the only such one-to-one correspondence; for example, we could use this one: $$ 1 \mapsto b, \\ 2 \mapsto a, \\ 3 \mapsto c. $$

On the other hand, $A = \{1, 2\}$ and $B = \{a, b, c\}$ are not equivalent: we need a one-to-one correspondence between all elements of $A$ and all the elements of $B$, but $A$ has fewer elements than $B$.

Finite sets are equivalent if and only if (or, to use Paul Halmos's abbreviation, iff) they have the same number of elements. In fact, equivalence is a generalization of this notion (that the sets have the same number of elements) to sets with infinitely many elements.

Thus every countably infinite (denumerable) set is equivalent to the set of natural numbers, $\mathbb{N}$: enumerating a set or listing its elements is the same as finding a one-to-one correspondence between the elements of this set and natural numbers.

Are positive rational numbers (i.e. fractions $\frac{p}{q}$, $p, q \in \mathbb{N}$, $q \neq 0$) denumerable? What do you think?

First, note that positive rationals can be arranged in a table:

$p$: $1$ $2$ $3$ $4$ $5$ $\ldots$
$q$
$1$ $\frac{1}{1}$ $\frac{2}{1}$ $\frac{3}{1}$ $\frac{4}{1}$ $\frac{5}{1}$ $\ldots$
$2$ $\frac{1}{2}$ $\frac{2}{2}$ $\frac{3}{2}$ $\frac{4}{2}$ $\frac{5}{2}$ $\ldots$
$3$ $\frac{1}{3}$ $\frac{2}{3}$ $\frac{3}{3}$ $\frac{4}{3}$ $\frac{5}{3}$ $\ldots$
$4$ $\frac{1}{4}$ $\frac{2}{4}$ $\frac{3}{4}$ $\frac{4}{4}$ $\frac{5}{4}$ $\ldots$
$5$ $\frac{1}{5}$ $\frac{2}{5}$ $\frac{3}{5}$ $\frac{4}{5}$ $\frac{5}{5}$ $\ldots$
$\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\ddots$

Not all entries in this table are distinct, for example, $\frac{1}{1} = \frac{2}{2} = \frac{3}{3} = \frac{4}{4} = \frac{5}{5}$, $\frac{2}{4} = \frac{1}{2}$, etc. Let us erase all rational numbers that have non-trivial common factors in the numerator and denominator, while keeping the first occurrence. All rational numbers in our table are now unique; if we continue the table indefinitely, it will include all positive rational numbers:

$p$: $1$ $2$ $3$ $4$ $5$ $\ldots$
$q$
$1$ $\frac{1}{1}$ $\frac{2}{1}$ $\frac{3}{1}$ $\frac{4}{1}$ $\frac{5}{1}$ $\ldots$
$2$ $\frac{1}{2}$ $\frac{3}{2}$ $\frac{5}{2}$ $\ldots$
$3$ $\frac{1}{3}$ $\frac{2}{3}$ $\frac{4}{3}$ $\frac{5}{3}$ $\ldots$
$4$ $\frac{1}{4}$ $\frac{3}{4}$ $\frac{5}{4}$ $\ldots$
$5$ $\frac{1}{5}$ $\frac{2}{5}$ $\frac{3}{5}$ $\frac{4}{5}$ $\ldots$
$\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\ddots$

Finally, we associate these numbers with the natural numbers $1, 2, 3, \ldots$. We start in the top-left corner of the table and then follow the arrows:

$p$: $1$ $2$ $3$ $4$ $5$ $\ldots$
$q$
$1$ $\frac{1}{1}$ $\frac{2}{1}$ $\rightarrow$ $\frac{3}{1}$ $\frac{4}{1}$ $\rightarrow$ $\frac{5}{1}$ $\ldots$
$\downarrow$ $\nearrow$ $\swarrow$ $\nearrow$ $\swarrow$
$2$ $\frac{1}{2}$ $\swarrow$ $\frac{3}{2}$ $\swarrow$ $\frac{5}{2}$ $\ldots$
$\swarrow$ $\nearrow$ $\swarrow$ $\nearrow$
$3$ $\frac{1}{3}$ $\frac{2}{3}$ $\swarrow$ $\frac{4}{3}$ $\frac{5}{3}$ $\ldots$
$\downarrow$ $\nearrow$ $\swarrow$ $\nearrow$ $\swarrow$
$4$ $\frac{1}{4}$ $\swarrow$ $\frac{3}{4}$ $\swarrow$ $\frac{5}{4}$ $\ldots$
$\swarrow$ $\nearrow$ $\swarrow$ $\nearrow$
$5$ $\frac{1}{5}$ $\frac{2}{5}$ $\frac{3}{5}$ $\frac{4}{5}$ $\swarrow$ $\ldots$
$\downarrow$ $\nearrow$ $\swarrow$ $\nearrow$ $\swarrow$
$\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\vdots$ $\ddots$

It is easy to see that all negative rational numbers are also denumerable: we simply put the minus sign before the fractions in the tables above.

What about all real numbers, including positive and negative rationals, such as $\frac{1}{2}$ and $-\frac{1}{2}$, zero, and irrationals, such as $\sqrt{2}$, $\pi = 3.1415926535\ldots$ and $e = 2.7182818284\ldots$? Clearly, this set (denoted $\mathbb{R}$) is also infinite. Can we enumerate it?

We know that all real numbers can be written as decimal fractions, e.g. $\pi = 3.1415926535...$. Suppose we can enumerate all real numbers between 0 and 1, inclusive: $$ a_1 = 0.a_{11}a_{12}a_{13}a_{14}a_{15}\ldots, \\ a_2 = 0.a_{21}a_{22}a_{23}a_{24}a_{25}\ldots, \\ a_3 = 0.a_{31}a_{32}a_{33}a_{34}a_{35}\ldots, \\ \vdots \\ a_k = 0.a_{k1}a_{k2}a_{k3}a_{k4}a_{k5}\ldots, \\ \vdots \\ $$ Here $a_{11}, a_{12}, a_{13}, a_{14}, a_{15}, \ldots, a_{21}, a_{22}, a_{23}, \ldots$ are all decimal digits, $0, 1, 2, 3, \ldots 9$.

If indeed it is possible to enumerate all real numbers between 0 and 1, then all of them appear on our list. But consider the number $b$ which differs from $a_1$ in the first digit (so that digit is anything but $a_{11}$), from $a_2$ in the second digit (so that digit is anything but $a_{22}$), from $a_3$ in the third digit (so that digit is anything but $a_{33}$), and so on. We have highlighted these digits below: $$ a_1 = 0.\mathbf{a_{11}}a_{12}a_{13}a_{14}a_{15}\ldots, \\ a_2 = 0.a_{21}\mathbf{a_{22}}a_{23}a_{24}a_{25}\ldots, \\ a_3 = 0.a_{31}a_{32}\mathbf{a_{33}}a_{34}a_{35}\ldots, \\ \vdots \\ a_k = 0.a_{k1}a_{k2}a_{k3}a_{k4}a_{k5}\ldots \mathbf{a_{kk}}\ldots, \\ \vdots \\ $$

By construction, $b$ differs from all numbers on our list, therefore $b$ is not on our list. So our attempt to enumerate all numbers between 0 and 1 (let alone all real numbers!) has failed.

We have just shown that real numbers are not enumerable using the so-called Cantor's diagonal slash argument. It is a "diagonal slash" for obvious reasons, while it's "Cantor's" because the aforementioned Georg Cantor discovered it.

Incidentally, Cantor's set theory, which has become the foundation of modern mathematics, was initially rejected by many prominent mathematicians. Leopold Kronecker, for instance, said:

I don't know what predominates in Cantor's theory — philosophy or theology, but I am sure that there is no mathematics there.

Thus the set $\mathbb{R}$, just like the set $\mathbb{N}$, is infinite, but it is even "more" infinite that $\mathbb{N}$: it is uncountable or uncountably infinite: we can't even enumerate it! In set theory this idea of different kinds of infinity generates further to the notion of cardinality.

Subsets and supersets

Notice that all elements of our example set $$S = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$$ are also elements of $\mathbb{N}$. We say that $S$ is a subset of $\mathbb{N}$ and write $$S \subseteq \mathbb{N}.$$ Of course, $$\mathbb{N} \nsubseteq S.$$

We could, equivalently, say that $\mathbb{N}$ is a superset of $S$ and write $$\mathbb{N} \supseteq S.$$

Similarly, $$\mathbb{N} \subseteq \mathbb{R},$$ which is the same thing as $$\mathbb{R} \supseteq \mathbb{N},$$ and $$\mathbb{R} \nsubseteq \mathbb{N},$$ which is the same thing as $$\mathbb{N} \nsupseteq \mathbb{R}.$$

Since $S$ is a subset of $\mathbb{N}$, we could use the following "syntactic sugar" to define $S$: $$S = \{x \, | \, x \in \mathbb{N}, x \leq 10\}.$$ We read "$|$" as "such that". Thus, instead of listing all elements of a set, we could define it by mentioning a particular property of its elements, such as $x \leq 10$.

We can write $$|S| = 10$$ to indicate that $S$ has exactly 10 elements.

Equality of sets

Any two sets $A$ and $B$ are equal, $A = B$, iff (if and only if) $A \subseteq B$ and $B \subseteq A$.

We shall add that we consider only distinct objects when talking about elements of sets. Repeats are not allowed. Thus $\{2, 2\}$ is really the same set as $\{2\}$ and it is deemed to contain exactly one element.

Nor do we care about the order of elements in a set: $\{1, 2, 3\}$ and $\{3, 2, 1\}$, for instance, are deemed to be equal.

A set containing a single element, such as $\{5\}$, is called a singleton set. We distinguish sets from their elements. Thus $5 \in A$, whereas $\{5\} \notin A$. This is because $\{5\}$ is not the number five, it is a set containing the number five.

In the programming language Python there is a data structure called set, which works according to principles that mimic those of a mathematical set.


In [3]:
A = set([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

In [4]:
A


Out[4]:
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

In [5]:
set([2, 3, 4]).issubset(A)


Out[5]:
True

In [6]:
A.issubset(set([2, 3, 4]))


Out[6]:
False

In [7]:
set([2, 3, 4]) == A


Out[7]:
False

In [8]:
set([1, 2, 3]) == set([3, 2, 1])


Out[8]:
True

In [9]:
set([2, 2]) == set([2])


Out[9]:
True

In [10]:
set([2, 2])


Out[10]:
{2}

In [11]:
len(set([2, 2]))


Out[11]:
1

In [12]:
B = set([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 1, 1])

In [13]:
A.issubset(B)


Out[13]:
True

In [14]:
B.issubset(A)


Out[14]:
True

In [15]:
A == B


Out[15]:
True

In [16]:
len(B)


Out[16]:
10

In [17]:
set(['foo', 'bar', 'baz', 1, 2, 3, 4, 5, 1./7.])


Out[17]:
{0.14285714285714285, 'baz', 2, 1, 3, 4, 5, 'bar', 'foo'}

Proof by contradiction

When we were asked whether the set of positive rational numbers was denumerable, we simply enumerated its elements. We proved the statement "the set of rational numbers is denumerable" by actually finding — constructing — the requisite enumeration. Such a proof is known as a constructive proof: the existence of a mathematical object (in this case, the one-to-one correspondence between the natural numbers and the positive rational numbers) is demonstrated by creating or providing a method for creating the object.

Cantor's diagonal slash argument is an example of a different kind of proof — proof by contradiction or, in Latin, reductio ad absurdum.

In A Mathematician's Apology, G. H. Hardy described proof by contradiction as "one of a mathematician's finest weapons", saying

It is a far finer gambit than any chess gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers the game.

Here is another example. This one is due to Euclid (c. 300 BC).

Recall that a prime is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers. The first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23. Indeed, we can quickly come up with a list of primes using Python list comprehensions (in practice there are far more efficient algorithms for finding primes):


In [18]:
[x for x in range(2, 100) if all(x % y != 0 for y in range(2, x))]


Out[18]:
[2,
 3,
 5,
 7,
 11,
 13,
 17,
 19,
 23,
 29,
 31,
 37,
 41,
 43,
 47,
 53,
 59,
 61,
 67,
 71,
 73,
 79,
 83,
 89,
 97]

How do you prove that there are infinitely many primes, i.e. the set of all primes is infinite (obviously, countably infinite, since it is a subset of natural numbers, a countably infinite set)?

Since we are talking about proofs by contradiction, it may be sensible to assume that that's how we shall proceed. Assume for a contradiction that $P$, the set of all primes, is finite. Say, there exist exactly $n$ primes, $n \in \mathbb{N}$: $$P = \{p_1, p_2, p_3, \ldots, p_n\}.$$

Multiplying the elements of $P$ together, we obtain another number. Let us add 1 to that number to obtain $$a = p_1 p_2 p_3 \ldots p_n + 1.$$ Clearly this number is greater than any of the elements of $P$, so it is not in $P$. Since, by our assumption, $P$ contains all primes, $a$ is not a prime. Then there exists some prime in $P$, say, $p_k$, $1 \leq k \leq n$, that divides a. Then, $$a = p_k \cdot b, \quad b \in \mathbb{N}.$$ But then $$ \begin{align} 1 &= a - p_1 p_2 p_3 \ldots p_n \\ &= p_k \cdot b - p_1 \ldots p_{k-1} p_k p_{k+1} \ldots p_n \\ &= p_k (b - p_1 \ldots p_{k-1} p_{k+1} \ldots p_n). \end{align} $$ In other words, $p_k$ divides 1. Since no natural number other than 1 divides 1, we have a contradiction. Our assumption that $P$ is finite must have been wrong. There are therefore infinitely many primes. Q.E.D. (which signifies the end of the proof and is an initialism of the Latin phrase quod erat demonstrandum, "what was to be demonstrated").

Instead of Q.E.D., people sometimes put the Halmos symbol, □, at the end of the proof.

Here is yet another example: prove that $\sqrt{2}$ is not a rational number, i.e. that $\sqrt{2}$ is irrational.

Assume for a contradiction that $\sqrt{2}$ is rational. Then we can write it as $\sqrt{2} = \frac{p}{q}$, where $p$ and $q$ are natural numbers, $q \neq 0$. Further, we can assume that $\frac{p}{q}$ is a fraction in lowest terms, i.e. there is no prime that divides both $p$ and $q$. (Thus $\frac{3}{9}$ is not in lowest terms, since $3$ divides both $3$ and $9$, whereas $\frac{1}{3}$ is.)

Since $\sqrt{2} = \frac{p}{q}$, on squaring both sides, we obtain $2 = \frac{p^2}{q^2}$, hence $p^2 = 2q^2$. Since the right-hand side is even, the left-hand side must be even. Therefore $p$ is even, say $p = 2a$ for some $a \in \mathbb{N}$. But then $p^2 = 4a^2$, so $4a^2 = 2q^2$, whence $q^2 = 2a^2$, thus $q$ is even.

This contradicts our assumption that $\frac{p}{q}$ is a fraction in lowest terms. We have reached the contradiction. Therefore $\sqrt{2}$ cannot be written as a fraction in the form $\frac{p}{q}$. In other words, $\sqrt{2}$ is irrational. □

That $\sqrt{2}$ is irrational was discovered by Pythagoras and his followers, i.e. the Pythagoreans, in the 6th century BC. Prior to this discovery, people believed that all numbers were rational, i.e. could be expressed as simple fractions.

The discovery of irrational numbers is said to have been shocking to the Pythagoreans (as it violated their mystical worldview). They kept this discovery secret. Hippasus of Metapontum divulged this secret and is supposed to have drowned at sea, apparently as a punishment from the gods for divulging it.

Russell's paradox and Zermelo-Fraenkel set theory

Our treatment of set theory is very informal. In practice it was built out of axioms. Refer to Naïve set theory by Paul Halmos for a more detailed overview.

Now, suppose that we have a set containing all sets that are not elements of themselves. Symbolically, let $$R = \{x \, | \, x \notin x\}.$$ Is $R$ an element of itself?

If it were, $R \in R$, then we could substitute it for $x$ in "$x \, | \, x \notin x$", and so $R \notin R$.

If it weren't, then $R \notin R$, and, by definition, since $R$ is not an element of itself, it is in $R$, $R \in R$.

Thus we have a paradox: $R \in R \Leftrightarrow R \notin R$.

This particular paradox was discovered by Bertrand Russell in 1901 and bears his name, so it is called Russell's paradox.

This paradox confounded the so-called naïve set theorists, who did not know how to deal with it. Eventually, in 1908, Ernst Zermelo proposed an axiomatization of set theory that avoided the paradoxes of naïve set theory, which was eventually elaborated by Abraham Fraenkel, Thoralf Skolem, and Zermelo himself. The result became known as the Zermelo-Fraenkel set theory or ZFC. It is ZFC that remains the canonical axiomatic set theory to this day.

We won't go into the details of how Russell's paradox is resolved in ZFC, but we shall prefer to talk about collections or families of sets, rather than sets of sets.

Union and intersection

The union of two sets $A$ and $B$ is the set of elements which are in $A$, in $B$, or in both $A$ and $B$: $$A \cup B = \{x \, | \, x \in A \text{ or } x \in B\}.$$

We can check that, in Python,


In [19]:
A = set([3, 5, 7, 9])
B = set([1, 2, 3])
A.union(B)


Out[19]:
{1, 2, 3, 5, 7, 9}

The intersection of two sets $A$ and $B$ is the set of elements which are in $A$ and in $B$: $$A \cap B = \{x \, | \, x \in A \text{ and } x \in B\}.$$

In Python,


In [20]:
A = set([3, 5, 7, 9])
B = set([1, 2, 3])
A.intersection(B)


Out[20]:
{3}

Venn diagrams are helpful in visualising sets, including set unions and intersections (which are, of course, themselves sets).

The Python package matplotlib_venn is helpful in constructing them. It can be installed using easy_install matplotlib-venn.


In [21]:
from matplotlib_venn import venn2
v = venn2(subsets = (2, 2, 1));
v.get_label_by_id('01').set_text('')
v.get_patch_by_id('01').set_linewidth(2)
v.get_patch_by_id('01').set_edgecolor('black')
v.get_patch_by_id('01').set_facecolor('white')

v.get_label_by_id('10').set_text('')
v.get_patch_by_id('10').set_linewidth(2)
v.get_patch_by_id('10').set_edgecolor('black')
v.get_patch_by_id('10').set_facecolor('white')

v.get_label_by_id('11').set_text('$A \cap B$')
v.get_patch_by_id('11').set_linewidth(2)
v.get_patch_by_id('11').set_edgecolor('black')
v.get_patch_by_id('11').set_facecolor('#ff0000')



In [22]:
from matplotlib_venn import venn2
v = venn2(subsets = (2, 2, 1));
v.get_label_by_id('01').set_text('')
v.get_patch_by_id('01').set_linewidth(2)
v.get_patch_by_id('01').set_edgecolor('black')
v.get_patch_by_id('01').set_facecolor('#ff0000')

v.get_label_by_id('10').set_text('')
v.get_patch_by_id('10').set_linewidth(2)
v.get_patch_by_id('10').set_edgecolor('black')
v.get_patch_by_id('10').set_facecolor('#ff0000')

v.get_label_by_id('11').set_text('$A \cup B$')
v.get_patch_by_id('11').set_linewidth(2)
v.get_patch_by_id('11').set_edgecolor('black')
v.get_patch_by_id('11').set_facecolor('#ff0000')


John Venn (1834 - 1923) was an English logician and philosopher. He introduced the diagrams that would later bear his name in an 1880 paper entitled On the Diagrammatic and Mechanical Representation of Propositions and Reasonings.

There is a stained glass window at Gonville and Caius College, Cambridge, where Venn studied and worked, commemorating Venn and the Venn diagram.

The union and intersection may be extended to more than two sets. For example, let $G$ be the set of Greek uppercase letter glyphs, $$G = \{A, B, \Gamma, \Delta, E, Z, H, \Theta, I, K, \Lambda, M, N, \Xi, O, \Pi, P, \Sigma, T, Y, \Phi, X, \Psi, \Omega\},$$ $E$ be the set of English uppercase letter glyphs, $$E = \{A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z\},$$ and $R$ be the set of Russian uppercase letter glyphs, $$R = \{\text{А, Б, В, Г, Д, Е, Ё, Ж, З, И, Й, К, Л, М, Н, О, П, Р, С, Т, У, Ф, Х, Ц, Ч, Ш, Щ, Ъ, Ы, Ь, Э, Ю, Я}\}$$. Then their intersection is given by $$G \cap E \cap R = \{A, B, E, H, K, M, O, P, T, X, Y\}.$$

We can also take unions and intersections of infinitely many sets. Let $N_2$ denote the set of positive multiples of 2, $$N_2 = \{2, 4, 6, 8, 10, 12, \ldots\},$$ $N_3$ denote the set of positive multiples of 3, $$N_3 = \{3, 6, 9, 12, 15, 18, \ldots\},$$ and so on. Then we can write $$\mathbb{N} = \{1\} \cup \bigcup_{i=2}^{\infty} N_i.$$ Since here we have taken a union of countably many sets (we can enumerate $N_2, N_3, N_4, \ldots$), we refer to $\bigcup_{i=2}^{\infty} N_i$ as a countable union.

It is also possible to define uncountable, or arbitrary, unions and intersections.

Unions and intersections are connected by the following relations: $$ (A \cup B) \cap C = (A \cap C) \cup (B \cap C), \\ (A \cap B) \cup C = (A \cup C) \cap (B \cup C). $$

Set difference, De Morgan's laws

Another important operation on sets is the set difference, $$A \setminus B = \{x \, | \, x \in A \text{ and } x \notin B\}.$$

If we assume that all the sets that we are considering are subsets of some large set $\Omega$, we may write $A^{\complement}$, $A'$, or $\overline{A}$ instead of $\Omega \setminus A$ and refer to $\Omega \setminus A$ as the complement of $A$ (in $\Omega$).

The following relations, known as De Morgan's laws, so named after the 19-th century British mathematician Augustus De Morgan, play an important part in set theory: $$\overline{A \cup B} = \overline{A} \cap \overline{B},$$ and $$\overline{A \cap B} = \overline{A} \cup \overline{B}.$$

They can be generalized to arbitrary unions and intersections of infinitely many sets.

Here is an exercise for you: prove De Morgan's laws.

Assume that an element belongs to the set on the left-hand side and show that it also belongs to the set on the right-hand side (so the set on the left-hand side is a subset of the set on the right-hand side). Then assume that an element belongs to the set on the right-hand side and show that it also belongs to the set on the left-hand side (so the set on the right-hand side is a subset of the set on the left-hand side). If two sets are subsets of each other, then they are equal.

Cartesian products

The Cartesian product of two sets $A$ and $B$, written $A \times B$, is the set of all ordered pairs $(a, b)$ where $a \in A$ and $b \in B$.

For example, the Cartesian product of the sets $A = \{\text{foo}, \text{bar}, \text{baz}\}$ and $B = \{3, 12\}$ is the set $$A \times B = \{(\text{foo}, 3), (\text{bar}, 3), (\text{baz}, 3), (\text{foo}, 12), (\text{bar}, 12), (\text{baz}, 12)\}.$$

As another example, consider the 2-dimensional plane, the set of pairs $(x, y)$ with $x \in \mathbb{R}$, $y \in \mathbb{R}$.

More generally, Cartesian products can be defined for $n \in \mathbb{N}$ sets. For example, the 3-dimensional space, the set of triples $(x, y, z)$, $x \in \mathbb{R}$, $y \in \mathbb{R}$, $z \in \mathbb{R}$, is a 3-fold Cartesian product $\mathbb{R}^3 = \mathbb{R} \times \mathbb{R} \times \mathbb{R}$.

Functions

A binary relation R between a set $X$ (the set of departure) and a set $Y$ (the set of destination or codomain) is specified by its graph, $G$, which is a set of ordered pairs $(x, y)$, a subset of the cartesian product $X \times Y$. The binary relation is also known as a mapping or correspondence.

The statement $(x, y) \in G$ is read as "$x$ is $R$-related to $y$" and is denoted $xRy$ or $R(x, y)$. The order is important: $xRy$ does not necessarily imply $yRx$ for a particular binary relation $R$.

If $R$ is a binary relation between $X$ and $Y$, then the set $\{y \in Y \,|\, xRy \text{ for some } x \in X\}$ is called the image or range, of $R$. The set $\{x \in X \,|\, xRy \text{ for some } y \in Y\}$ is called the domain of $R$.

A function $f$ from a set $X$ to a set $Y$ (we sometimes write $f: X \rightarrow Y$) is a special case of a binary relation: it is defined by a set $G \subseteq X \times Y$ of ordered pairs $(x, y)$ such that, for each element $x \in X$ there corresponds one, and only one, pair $(x, y) \in G$, and we write $f(x) = y$ or $f: x \mapsto y$.

Suppose that $A = \{1, 2\}$, $B = \{a, b, c\}$.

Is the binary relation $f: A \rightarrow B$ defined by $G_f = \{(1, b), (2, a)\}$ a function?

*x**f(x)*
$1$$b$
$2$$a$

Indeed, it is a function: to each element of $x$, $f$ maps exactly one element of $B$.

Suppose that $A = \{1, 2\}$, $B = \{a, b, c\}$, as before.

Is the binary relation $g: A \rightarrow B$ defined by $G_g = \{(1, b), (1, a), (2, a)\}$ a function?

*x**g(x)*
$1$$b$
$1$$a$
$2$$a$

No, it isn't: two distinct elements of $B$, $a, b \in B$, are mapped to $1 \in A$, so $g$ is not a function by definition.

Suppose that $A = \{1, 2\}$, $B = \{a, b, c\}$, as before.

Is the binary relation $h: A \rightarrow B$ defined by $G_h = \{(2, a)\}$ a function?

*x**h(x)*
$2$$a$

No, it isn't: $h$ doesn't map any element of $B$ to $1 \in A$.

Suppose that $A = \{1, 2\}$, $B = \{a, b, c\}$, as before.

Is the binary relation $\alpha: A \rightarrow B$ defined by $G_{\alpha} = \{(1, a), (2, a)\}$ a function?

*x**$\alpha$(x)*
$1$$a$
$2$$a$

Indeed it is: to each element of $A$ there corresponds exactly one element of $B$.

Now, consider the binary relation $s: \mathbb{R} \rightarrow \mathbb{R}$ defined, for all $x \in \mathbb{R}$, by $s(x) = x^2$ or, in another notation, $s: x \mapsto x^2$:


In [23]:
import numpy as np
import matplotlib.pyplot as plt
xs = np.linspace(-10., 10., 50)
plt.plot(xs, [x*x for x in xs]);
plt.xlabel('x')
plt.ylabel('s(x)');


Is this a function?

Indeed, $s$ is a function: to each $x \in \mathbb{R}$ there corresponds one, and only one (thus exactly one) $s(x) \in \mathbb{R}$.

What about the binary relation $r: \mathbb{R} \rightarrow \mathbb{R}$ defined, for all $x \in \mathbb{R}$, by $r(x) = \sqrt{x}$?

First of all, this definition doesn't quite make sense. $\sqrt{\cdot}$ is defined only for nonnegative real numbers.

Suppose we redefine $r: \mathbb{R} \rightarrow \mathbb{R}$: for all $x \geq 0$, $r(x) = \sqrt{x}$.

Is this now a function?

It's not a function $r: \mathbb{R} \rightarrow \mathbb{R}$, since it does not define a value $f(x)$ for all $x \in \mathbb{R}$, only for those $x$ that are nonnegative.

How could we fix the definition of $r$ so that it is a function?

One way to do it is to define it as $r: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}$, where $\mathbb{R}_{\geq 0}$ is the set of nonnegative real numbers.


In [24]:
xs = np.linspace(-10., 10., 50)
xs = [x for x in xs if x >= 0]
plt.plot(xs, [np.sqrt(x) for x in xs]);
plt.xlabel('x')
plt.ylabel('s(x)');


Image and inverse image

Let $f: X \rightarrow Y$ be a function and $A \subseteq X$. Then the image $f[A]$ of $A$ under $f$ is the set $$f[A] = \{y \in Y \,|\, y = f(x) \text{ for some } x \in A\}.$$

This is consistent with the definition of the image of a function given above: the image of the function $f$ is the image $f[X]$ of the entire set $X$.

Let $f: X \rightarrow Y$ be a function and $B \subseteq Y$. Then the preimage or inverse image of $B$ under $f$ is the set $$f^{-1}[B] = \{x \in X \,|\, f(x) \in B\}.$$

For example, for $f: \mathbb{R} \rightarrow \mathbb{R}$, $f: x \mapsto x^2$, the inverse image of the singleton set $\{4\}$ is the set $\{-2, 2\}$. The inverse image of the set $\{4, 9\}$ is the set $\{-3, -2, 2, 3\}$.

The inverse image of the union of two sets is equal to the union of their inverse images: $$f^{-1}[A \cup B] = f^{-1}[A] \cup f^{-1}[B].$$

Exercise: How would you prove this?

The inverse image of the intersection of two sets is equal to the intersection of their inverse images: $$f^{-1}[A \cap B] = f^{-1}[A] \cap f^{-1}[B].$$

The image of the union of two sets is equal to the union of their images: $$f[A \cup B] = f[A] \cup f[B].$$

Is the image of the intersection of two sets equal to the intersection of their images: $$f[A \cap B] \overset{?}{=} f[A] \cap f[B].$$

The image of the intersection of two sets is, in general, not equal to the intersection of their images: $$f[A \cap B] \neq f[A] \cap f[B].$$

To see this, consider $f: \mathbb{R} \times \mathbb{R} \rightarrow \mathbb{R}$, defined by $f: (x, y) \mapsto x$, a projection on the $x$-plane.

Define $A = \{(x, 0) \,|\, 0 \leq x \leq 1\}$ and $A = \{(x, 1) \,|\, 0 \leq x \leq 1\}$.

The two sets do not intersect, or, in other words, their intersection is the so-called empty set: $A \cap B = \{\} = \emptyset$.

However, the images of the two sets coincide: $f(A) = f(B) = \{x \,|\, 0 \leq x \leq 1\}$.

We have just disproven $$f[A \cap B] = f[A] \cap f[B],$$ equivalently, we have just proven $$f[A \cap B] \neq f[A] \cap f[B].$$ by producing a particular counterexample — this is yet another proof method, very common in mathematics. □

One-to-one, onto, bijections, and inverse functions

A function $f: X \rightarrow Y$ is injective or one-to-one if each possible element $y \in Y$ of its codomain $y$ is mapped to by at most one argument $x \in X$. We call injective functions injections.

It is surjective or onto if each possible element $y \in Y$ is mapped to by at least one argument $x \in X$. We call surjective functions surjections.

It is bijective if it is both injective and surjective (one-to-one and onto). We call such functions bijections or one-to-one correspondences.

A function $f: X \rightarrow Y$ is invertible if there exists a function $g: Y \rightarrow X$ such that, for all $y \in Y$, $g(f(x)) = x$. We call $g$ an inverse of $f$ and sometimes denote it by $f^{-1}$.

One can check that a function is invertible iff it is a bijection.

Suppose that $A = \{1, 2\}$, $B = \{a, b, c\}$.

Is the function $f: A \rightarrow B$ defined by $G_f = \{(1, b), (2, a)\}$ one-to-one, onto, a bijection, invertible?

*x**f(x)*
$1$$b$
$2$$a$
  • It is one-to-one, since for each element it its image, $f[A]$, there corresponds a single element of $A$: to $a$, there corresponds 2, and to $b$, there corresponds 1.

  • It is not onto, since no element of $A$ is mapped to $c \in B$.

  • Since the function is not one-to-one and onto, but only one-to-one, it is not a bijection.

  • Therefore it is not invertible. Indeed we could not define the inverse function $f^{-1}: B \rightarrow A$ since we wouldn't be able to map $c \in B$ to anything — it wouldn't be a function.

As before, $A = \{1, 2\}$, $B = \{a, b, c\}$.

Is the function $\alpha: A \rightarrow B$ defined by $G_{\alpha} = \{(1, a), (2, a)\}$ one-to-one, onto, a bijection, invertible?

*x**$\alpha$(x)*
$1$$a$
$2$$a$
  • It is not one-to-one, since for $a \in f[A]$, there correspond two elements of $A$: 1 and 2: $f(1) = f(2) = a$.
  • It is not onto, since no element of $A$ is mapped to $b \in B$, nor is there an element of $A$ mapped to $c \in B$.
  • Since the function is not one-to-one and onto, in fact, it is neither, it is not a bijection.
  • Therefore it is not invertible.

In fact, is there any bijection (and therefore any invertible function) between $A = \{1, 2\}$ and $B = \{a, b, c\}$?

Since $|A| \neq |B|$ there isn't!

But between $C = \{1, 2\}$ and $D = \{a, b\}$ there are two bijections. One is $\beta: C \rightarrow D$ defined by $G_{\beta} = \{(1, a), (2, b)\}$:

*x**$\beta$(x)*
$1$$a$
$2$$b$

Its inverse is $\beta^{-1}: D \rightarrow C$ defined by $G_{\beta^{-1}} = \{(a, 1), (b, 2)\}$:

*y**$\beta^{-1}$(y)*
$a$$1$
$b$$2$

The other bijection between $C = \{1, 2\}$ and $D = \{a, b\}$ is $\gamma: C \rightarrow D$ defined by $G_{\gamma} = \{(1, b), (2, a)\}$:

*x**$\gamma$(x)*
$1$$b$
$2$$a$

Its inverse is $\gamma^{-1}: D \rightarrow C$ defined by $G_{\gamma^{-1}} = \{(a, 2), (b, 1)\}$:

*y**$\gamma^{-1}$(y)*
$a$$2$
$b$$1$

Since $\gamma$ is a bijection, so is $\gamma^{-1}$, and it is therefore invertible; the inverse of $\gamma^{-1}$ is $\gamma$.

How natural numbers can be constructed from sets

We have mentioned that sets are the "most general" mathematical objects. But then we informally introduced other objects, such as ordered pairs. At first sight, ordered pairs are different from sets. Why do we then say that sets are the "most general" mathematical objects?

In fact, ordered pairs can be expressed as sets. There are several ways to do this, one of them the so-called Kuratowski's definition proposed in 1921 by Kazimierz Kuratowski: $$(a, b) = \{\{a\}, \{a, b\}\}.$$

This definition can be used even when the two elements of the pair are identical: $$(a, a) = \{\{a\}, \{a, a\}\} = \{\{a\}, \{a\}\} = \{\{a\}\}.$$

Triples can be defined as nested pairs: $$(a, b, c) = (a, (b, c)),$$ and so on.

But what about numbers? Surely there is no way to use sets to define numbers?

In fact, we can define natural numbers and zero as follows: $$ 0 = \{\} = \emptyset, \\ 1 = \{0\} = \{\emptyset\}, \\ 2 = \{0, 1\} = \{\emptyset, \{\emptyset\}\}, \\ 3 = \{0, 1, 2\} = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}, \\ \vdots $$

This definition is part of the Zermelo-Fraenkel (ZF) set theory. We can then use the Dedekind-Peano axioms to define arithmetic for natural numbers in terms of set theory.

Having constructed $\mathbb{N}$, we can then construct the rationals $\mathbb{Q}$, the reals $\mathbb{R}$, building on the foundation of set theory.

Proceeding onwards, we obtain... the rest of mathematics!