2. Metric Spaces

2.1. Basic Definitions

2.1.1. A metric space is a collection of points with a measure of distance between each pair of two points. Formally, a metric space is a set $X$ equipped with a distance function $d:X \times X \to [0,\infty)$ such that

  • $d$ adheres to the principle of the identity of the indiscernables, i.e., $d(x, y) = 0$ if and only if $x = y$,
  • $d$ is symmetric, i.e., $d(x,y) = d(y,x)$ for all $x,y \in X$, and
  • $d$ satisfies the triangle inequality, i.e., $d(x,z) \leq d(x,y) + d(y,z)$ for all $x,y,z \in X$.

Metric spaces provide a convenient abstraction that encompasses many important mathematical objects, such as the Euclidean space, the Hilbert space, (normed) function spaces, weighted graphs, and so on.

2.1.2. Metric spaces are also particularly well-behaved examples of topological spaces, which abstract the notion of nearness. The building blocks of all topological arguments are open sets: a subset $U$ of a metric space $(X, d)$ is said to be open in $X$ if, for each $x \in X$, there exists $\epsilon > 0$ such that the open ball of radius $\epsilon$ centered at $x$

$$B(x; \epsilon) = \{y \in X : d(x,y) < \epsilon\}$$

is a subset of $U$. A subset of $(X, d)$ is said to be closed in $X$ if its set complement in $X$ is open.

We observe that $X$ and $\varnothing$ are both open and closed in $X$. If no other subset of $X$ is both open and closed in $X$, we say that $(X, d)$ is a connected metric space.

2.1.3. Given a subset $Z$ of a metric space $(X, d_X)$, we can define the induced metric $d_Z$ in $Z$ to be the restriction $d_Z = d_X \vert_{Z \times Z}$. The resulting metric space $(Z, d_Z)$ is called a metric subspace of $(X, d_X)$.

2.1.4. In discussing open and closed sets, it is crucial that we specify the underlying metric space. Consider, for example, the real line $\mathbb{R}$ with the usual Euclidean metric

$$d(x, x') = \sqrt{\vert x - x' \vert^2}.$$

A singleton set in $\mathbb{R}$ is never open in $\mathbb{R}$, as no open interval can fit into a singleton set. Nevertheless, $\{0\}$ with the metric induced from the Euclidean metric has the singleton set $\{0\}$ as an open subset.

2.2. Limits and Continuity

2.2.1. The $\epsilon$-$\delta$ definition of limit and continuity generalizes to metric spaces. Indeed, we say that $L \in Y$ is the limit of a function $f:(X,d_X) \to (Y,d_Y)$ at $x_0 \in X$ if, for each $\epsilon > 0$, there exists a $\delta > 0$ such that

$$d_Y\left(f(x_0),f(x)\right) < \epsilon \hspace{1em}\mbox{whenever}\hspace{1em}d_X(x_0, x) < \delta.$$

$f$ is said to be continuous at $x_0 \in X$ if

$$f(x_0) = \lim_{x \to x_0} f(x).$$

2.2.2. It is also possible to generalize the notion of the limit of a sequence. We say that a sequence $(x_n)_{n=0}^\infty$ of points in $(X,d)$ converges to $L \in X$ in case, for each $\epsilon > 0$, there exists an integer $N_\epsilon$ such that

$$d(x_n, L) < \epsilon$$

whenever $n \geq N_\epsilon$. The definition guarantees that each sequence has no more than one limit. We remark that a subset $E$ of a metric space $(X,d_X)$ is closed if and only if every convergent sequence in $X$ whose terms are elements of $E$ has its limit in $E$ as well.

2.2.3. It is a standard fact in metric space theory that the continuous image of a connected set is connected. Formally, if $f:(X,d_X) \to (Y,d_Y)$ is continuous, and if $(X,d_X)$ is connected, then $\operatorname{im} f$ with the induced metric from $Y$ is also a connected metric space. If $X = Y = \mathbb{R}$ with the standard Euclidean distance

$$d(x,x') = \sqrt{\vert x - x' \vert^2},$$

then the only connected subsets are the intervals. Therefore, continuous functions map intervals to intervals. In particular, the function takes all intermediate values, a result commonly known as the intermediate value theorem.

2.3. Complete Metric Spaces

2.3.1. Often, we are interested in metric spaces without holes, in that sequences that should have a limit have a limit. Formally, a Cauchy sequence in $(X, d)$ is a sequence $(x_n)_{n=0}^\infty$ of points in $X$ such that, for each $\epsilon > 0$, there exists an integer $N_\epsilon$ such that

$$d(x_n, x_m) < \epsilon$$

whenever $n,m \geq N_\epsilon$. A metric space $(X, d)$ is said to be complete if every Cauchy sequence converges to a limit.

2.3.2. Completeness is a technical condition that allows many limit arguments to go through smoothly. For example, the intermediate value theorem holds on $\mathbb{R}$, but not on $\mathbb{Q}$.

Most metric spaces we shall work with in this course will be complete. For now, we merely remark that the $n$-dimensional Euclidean space $\mathbb{R}^n$ with the standard Euclidean metric

$$d(\vec{v}, \vec{w}) = \sqrt{\| \vec{v} - \vec{w} \|^2} = \sqrt{\sum_{i=1}^n \vert v^i - w^i \vert^2}$$

is complete.

2.4. Compact Metric Spaces

2.4.1. We are also interested in metric spaces that retain key properties of finite spaces. As Terence Tao points out, every function $f:X \to \mathbb{R}$ on a finite set $X$ satisfies the following properties:

  • $f$ is bounded, i.e., there exists a real number $M$ such that $\vert f(x) \vert \leq M$ for all $x \in X$, and
  • $f$ attains its maximum, i.e., there exists a point $x_0 \in X$ such that $f(x_0) \geq f(x)$ for all $x \in X$.

Moreover, every sequence $(x_n)_{n=0}^\infty$ in $X$ has an eventually constant subsequence, i.e., there exists a subsequence $(x_{n_k})_{k=0}^\infty$ and an integer $N$ such that $x_{n_K} = x_{n_k}$ for all $k \geq K$.

These properties make the analysis of functions on a finite set quite simple.

2.4.2. The metric-space generalization of finiteness is compactness, which, for a metric space $(X,d)$ is defined to be the condition that every open cover admits a finite subcover. In other words, if $\{U_\alpha\}_\alpha$ is a collection of open subsets of $X$ such that

$$\bigcup_\alpha U_\alpha = X,$$

then there is a finite subcollection $\{U_0,\ldots,U_{N-1}\}$ of $\{U_\alpha\}_\alpha$ such that

$$\bigcup_{n=0}^{N-1} U_n = X.$$

All finite metric spaces are compact, of course.

2.4.3. The Heine–Borel theorem states that a subset of the Euclidean $n$-space $\mathbb{R}^n$ with the usual Euclidean metric

$$d(\vec{v},\vec{w}) = \sqrt{\|\vec{v}-\vec{w}\|^2} = \sqrt{\sum_{i=1}^n \vert v^i - w^i \vert^2}$$

is compact if and only if it is closed and bounded. In general, a metric space is compact if and only if it is complete and totally bounded. It follows, in particular, that all compact metric spaces are complete (§2.3).

We let $(K,d_k)$ be a metric space.

Firstly, every continuous (§2.2.1) function $f:(K,d_K) \to \mathbb{R}$ is bounded. The crucial fact here is that the image of a compact metric space under a continuous mapping is compact. By the Heine–Borel theorem, every compact subset of the Euclidean space is bounded, and so the image set is bounded as well.

Secondly, every continuous function $f:(K,d_k) \to \mathbb{R}$ attains its maximum. Once again, the image of a compact metric space under a continuous mapping is compact, and so $\operatorname{im} f$ is a compact subset of $\mathbb{R}$. By the Heine&ndash&Borel theorem (§2.4.2), $\operatorname{im} f$ is closed and bounded. Since $\operatorname{im} f$ is bounded, the supremum $M$ of $\operatorname{im} f$ exists. Finally, $\operatorname{im} f$ is closed, and so $M$ is contained in $\operatorname{im} f$ (§2.2.2). Therefore, $M$ is the maximum of $f$, and $f$ attains its maximum. This is the metric-space generalization of the extreme value theorem.

Lastly, every sequence $(x_n)_{n=0}^\infty$ in $(K,d_k)$ has a convergent subsequence, a result known as the Bolzano–Weierstrass theorem. The proof makes crucial use of the so-called infinite pigeonhole principle: at least one partition in an infinite set must contain infinitely many points.

We remark that compactness is a useful technical condition in the study of a machine learning tool known as kernel methods, via Mercer's condition.