The Axiom of Choice

A hopefully less daunting, and somewhat technical walkthrough.

The axiom of choice is a fundamental mathematical assumption that has far reaching, and counterintuitive consequences. It is also a source of much philosophical debate and kerfuffle over the foundations of mathematics.

Despite this, I have never really come across what I would consider a clear and satisfactory explanation. This is even after doing a PhD in applied mathematics! On the other hand, there are many videos and articles, aimed at a less technical audience. But I feel that they tend to skip over too much.

Like the subtitle implies, I am going to try and provide an explanation that is as complete as possible, with as little mathematics as possible. For readers who are interested in the full gory technical details, the Stanford Encyclopedia of Philosophy has a great write up.

Set Theory and Russell's Paradox

The axiom of choice is an axiom in set theory. This means we first need to talk about set theory, and define what a set is, before we can proceed.

Axiomatic set theory is often regarded as the standard foundation of mathematics. This is a list of rules that mathematicians use when working with "sets". All mathematical objects are sets, and these objects can be gathered together to form new sets. For example, each whole number can be viewed as a set, and we can gather all the even ones together, to form the set of even numbers.

Mathematicians were working with sets before they were axiomatized by Zermelo and Fraenkel. However, it was discovered that if we did not have a list of rules that carefully state what is allowed to be a set, we run into problems like the famous Russell's Paradox, which goes like this.

Point 3 is very similar to the sentence "this statement is false", which can neither be true or false, as both cases lead to a contradiction.

Choice Functions

Now that we have the concept of a set, we can go on to talk about choice functions. These are what the word "choice" in the name of the axiom refers to. To define what a choice function is, we first start with a collection \( X \) of non-empty sets. \[ X = \{ X_i \, \, | \, \, \forall i \in I, X_i \neq \varnothing \}. \]

Let \( \bigcup X \) be the union of all sets in collection \( X \). Then, a choice function on \( X \) is a mapping \( f : X \rightarrow \bigcup X \) such that, \[ \forall X_i \in X, \, f(X_i) \in X_i. \]

Intuitively, given a set \( X_i \), the choice function \( f \) chooses an element \( f(X_i) \) from the set \( X_i \).

The Axiom of Choice

Now we can formally state the axiom of choice.

\[ \text{Any collection of non-empty sets } X \text{ has a choice function } f \text{ that is defined on } X. \]

I find that the axiom is much easier to explain with an equivalent formulation of the axiom, via the cartesian products of sets.

Cartesian Product of Sets

Suppose we have two sets \( X_1 \) and \( X_2 \). We can take their Cartesian product,

\[ X_1 \times X_2 = \{ (x_1,x_2) \mid x_1 \in X_1, \, x_2 \in X_2 \}. \]

Elements of this product can be represented by 2-tuples. Similarly, for an arbitrary number of sets \( \{ X_i \, \, | \, \, i \in I \} \), their Cartesian product is, \[ \prod_{i \in I} X_i = \{ f : I \rightarrow \bigcup_{i \in I} X_i \, \mid \, \forall i, \, f(i) \in X_i \}. \] Elements of this more general definition are functions, and not just tuples.

The Axiom of Choice Again

With that, we can state the equivalent version of the axiom of choice using cartesian products.

\[ \text{The cartesian production of any collection of non-empty sets } X \text{ is nonempty. } \]

Now, let \( X = \{ X_i \, \mid \, i \in I \} \) for some index set \( I \). Then, the cartesian product of the sets in \( X \) is \[ \prod_{i \in I} X_i = \{ f : I \rightarrow \bigcup_{i \in I} X_i \, \mid \, \forall i \, , \, f(i) \in X_i \}. \] If this cartesian product is nonempty, then we can find at least one function \( f : I \rightarrow \bigcup_{i \in I} X_i \), such that for all input values \( i \), the output \( f(i) \) is an element in \( X_i \).

This is essentially a way to "choose" an element from each \( X_i \). In fact, this version of the axiom can be shown to be equivalent to the previous one.

The Crux of the Problem

Axiomatic set theory allows us to have nonempty sets that contain elements. Given one of these nonempty set, \( X \), we can always "choose" an element from it with these steps:

However, suppose that instead of a single nonempty set, we have a collection containing a countably infinite number of sets, \( X = \{ X_1, X_2, ... \} \).

We can attempt to apply the same reasoning to choose an element from \( X_1 \). Then repeat to choose an element from \( X_2 \) and so on. We can keep repeating until we reach some set \( X_n \). This allows us to explicitly construct an element of \( X_1 \times X_2 \times ... \times X_n \), and hence demonstrate that it is nonempty.

But here lies the crux of the problem: while we can do this for any finite \( n \), there is no way to do this for all of the infinite number of sets in \( X \). This is the crucial reason why the axiom of choice has to be an axiom.

On the other hand, assuming that the axiom of choice is true leads to counterintuitive consequences, such as the infamous Banach-Tarski paradox, which allows us to split an object into two copies that are identical to the original object.

Application in Linear Algebra

This section is extremely complex, and is not required to understand the axiom of choice. Please feel free to skip it. Before we end this article, I would like to highlight an interesting application of the axiom of choice in linear algebra: it can be used to prove that every vector space has a basis.

The key part of the proof applies Zorn's Lemma, which is equivalent to the Axiom of Choice. A very brief summary of the proof goes like this.

\( L \) is linearly independent and spans \( V \). So, \( L \) is a basis for \( V \).