The Axiom of Choice

A hopefully less daunting, but still somewhat technical walkthrough.

Sections

Too Long; Didn’t Read
Axiomatic Set Theory
Choice Functions
Another Version of the Axiom of Choice
Cartesian Product of Sets
Why Must It Be An Axiom?

The axiom of choice is a fundamental mathematical assumption that has far-reaching and counterintuitive consequences. It is also a source of much philosophical debate and kerfuffle over the foundations of mathematics.

Despite this, I have never encountered a clear and satisfactory explanation. This is even after doing a PhD in mathematics! There are many videos and articles aimed at less technical audiences, but I feel that they tend to skip over too much.

As the subtitle implies, I am going to try to provide an explanation that is as complete as possible, with as little mathematics as possible. For readers who are interested in the full gory technical details, the Stanford Encyclopedia of Philosophy has a great write-up.

Too Long; Didn’t Read

If you already have a good understanding of the topic or are just impatient, this section is a “Too Long; Didn’t Read” (TL;DR) version. The axiom of choice is:

\[ \text{The Cartesian product of any collection } X \text{ of non-empty sets is non-empty. } \]

“Why would we ever expect the Cartesian product of non-empty sets to be empty?” is probably the reason why some people think the axiom is obvious. On the other hand, assuming that the axiom of choice is true leads to counterintuitive consequences, such as the infamous Banach-Tarski paradox, which allows us to split an object into two identical copies of the original object.

You might have noticed that the word “choice” is conspicuously missing in my definition. I believe that stating the axiom in terms of Cartesian products is the clearest way to illustrate why the axiom should be true. “Choice” will make an appearance soon.

Axiomatic Set Theory

The axiom of choice is an axiom in set theory. So, we should probably talk a little about set theory.

For this article, when we say set theory or axiomatic set theory, we are referring to Zermelo–Fraenkel set theory, which is often regarded as the standard foundation of mathematics. This is a list of rules that mathematicians use to define and work with objects known as “sets”. Technically, all mathematical objects are sets, and these objects can be gathered together to form new sets. For example, each whole number can be viewed as a set. We can gather all the even ones together to form the set of even whole numbers.

Mathematicians were working with sets before they were axiomatized. However, it was discovered that if we did not have a list of rules that carefully state what is allowed to be a set, we run into problems like the famous Russell’s Paradox. This paradox is not necessary for understanding the axiom of choice, but I wrote about it anyway and left it in the overflow section.

Choice Functions

Now that we have the concept of a set, we can go on to talk about choice functions. These choice functions are what “choice” in the axiom of choice is referring to. To define what a choice function is, we start with a collection \( X \) of non-empty sets indexed by \( I \),

\[ X = \{ X_i \, \, | \, \, \forall i \in I, X_i \neq \varnothing \}. \]

Let \( \bigcup X \) be the union of all sets in collection \( X \). Then, a choice function on \( X \) is a mapping \( f : X \rightarrow \bigcup X \) such that

\[ \forall X_i \in X, \, f(X_i) \in X_i. \]

Intuitively, the choice function \( f \) “chooses” an element \( f(X_i) \) from the set \( X_i \).

Another Version of the Axiom of Choice

Now we can state a version of the axiom of choice that does contain the word “choice”:

\[ \text{Any collection } X \text{ of non-empty sets has a choice function } f \text{ that is defined on } X. \]

In my humble opinion, people usually fumble when trying to explain this version and get confused by the choice function. It also does not make the axiom look as obvious. Still, choice functions will be a great tool later on, for explaining why the axiom of choice must be an axiom.

Cartesian Product of Sets

The Cartesian product version of the axiom of choice has already been stated above, but here it is again:

\[ \text{The Cartesian product of any collection } X \text{ of non-empty sets is non-empty. } \]

But now we actually define the Cartesian product. For two sets \( X_1 \) and \( X_2 \), it is

\[ X_1 \times X_2 = \{ (x_1,x_2) \mid x_1 \in X_1, \, x_2 \in X_2 \}. \]

Elements of product \( X_1 \times X_2 \) can be represented by \(2\)-tuples. Similarly, for an arbitrary number of sets \( \{ X_i \, \, | \, \, i \in I \} \), their Cartesian product is

\[ \prod_{i \in I} X_i = \{ f : I \rightarrow \bigcup_{i \in I} X_i \, \mid \, \forall i, \, f(i) \in X_i \}. \]

Elements of this more general definition are functions and not just tuples.

If this Cartesian product is nonempty, then we can find at least one function \( f : I \rightarrow \bigcup_{i \in I} X_i \) such that for all input values \( i \), the output \( f(i) \) is an element in \( X_i \).

This is essentially a way to “choose” an element from each \( X_i \), and why the two ways I have stated the axiom are equivalent. An explicit mathematical proof of their equivalence can be found in this Stack Exchange answer.

Why Must It Be an Axiom?

One look at the Cartesian product version of the axiom of choice, and it is clear why it should be true. Why would the Cartesian product of nonempty sets ever be empty?

However, it is known that the axiom of choice is, in fact, independent of the other set theory axioms. So, we cannot derive the axiom of choice from them. This forces us to have to adopt it as an axiom if we want to use it. Also, the axiom of choice is not the only statement that is independent of the other axioms.

Specific Cases & Points of Failure

One important thing to note is that the axiom of choice is a general statement about any collection of non-empty sets. We cannot use the other axioms to prove that this statement holds generally.

However, there are specific cases where we will be able to use the other axioms to get a choice function and prove that the Cartesian product is nonempty, thus bypassing the axiom of choice in that particular case. It might be illuminating to go through a few specific cases to see where the failure points are.

A Single Set

Let’s start by trying to choose an element from a single nonempty set \( X_i \).

Axiomatic set theory allows us to have nonempty sets that contain elements.
The definition of a nonempty set means that \( \exists x \) such that \( x \in X_i \).
The rule of existential instantiation allows us to refer to an element \( y \) satisfying \( y \in X_i \), without actually knowing what \( y \) is or how we got it.
So, we can explicitly define a choice function on \( X_i \) as \( f(X_i) = y \).

Finite Collections of Sets

Now, suppose we have a finite collection of sets \( X = (X_1, X_2, ..., X_n) \).

We can apply the same reasoning for a single set to choose an element \( x_1 \) from \( X_1 \).
Then, repeat it again to choose an element \( x_2 \) from \( X_2 \), and so on. We can keep repeating until we reach set \( X_n \).
This allows us to explicitly construct an element \( (x_1, x_2, ..., x_n) \) of the Cartesian product \( X_1 \times X_2 \times ... \times X_n \) and hence demonstrate that it is nonempty.

Infinite Collections of Sets

Now, suppose we have a countably infinite collection of arbitrary sets \( X = (X_1, X_2, ...) \). We can again try to repeat the previous method for any finite \( n \). But we will run into an important point of failure: there is no way to do this, using the axioms, for all of the countably infinite number of sets in \( X \). There is no way to “reach the end”.

This situation is reminiscent of supertasks. See the overflow section for more details about that. As you can imagine, the situation is even worse if we have an uncountably infinite collection of arbitrary sets. However, even with these dire conditions, there are specific cases where we can prove that the Cartesian product is nonempty!

If we can define a binary relation with certain properties, we can use the Axiom Scheme of Replacement to prove that the choice function exists without using the axiom of choice and subsequently show that the Cartesian product is nonempty. I walk through a case where this can be done in the relevant part of the overflow section.