The Axiom of Choice
A hopefully less daunting, and somewhat technical walkthrough.
The axiom of choice is a fundamental mathematical assumption that has far reaching, and counterintuitive consequences. It is also a source of much philosophical debate and kerfuffle over the foundations of mathematics.
Despite this, I have never really come across what I would consider a clear and satisfactory explanation. This is even after doing a PhD in applied mathematics! On the other hand, there are many videos and articles, aimed at a less technical audience. But I feel that they tend to skip over too much.
Like the subtitle implies, I am going to try and provide an explanation that is as complete as possible, with as little mathematics as possible. For readers who are interested in the full gory technical details, the Stanford Encyclopedia of Philosophy has a great write up.
Set Theory and Russell's Paradox
The axiom of choice is an axiom in set theory. This means we first need to talk about set theory, and define what a set is, before we can proceed.
Axiomatic set theory is often regarded as the standard foundation of mathematics. This is a list of rules that mathematicians use when working with "sets". All mathematical objects are sets, and these objects can be gathered together to form new sets. For example, each whole number can be viewed as a set, and we can gather all the even ones together, to form the set of even numbers.
Mathematicians were working with sets before they were axiomatized by Zermelo and Fraenkel. However, it was discovered that if we did not have a list of rules that carefully state what is allowed to be a set, we run into problems like the famous Russell's Paradox, which goes like this.
- Let \( X \) be defined as the set of all sets that do not contain itself.
- If \( X \) contains itself, then it violates our definition, which would require it to not contain itself.
- So, we are stuck with the unpleasant situation where \( X \) cannot contain itself, and must also contain itself.
Choice Functions
Now that we have the concept of a set, we can go on to talk about choice functions. These are what the word "choice" in the name of the axiom refers to. To define what a choice function is, we first start with a collection \( X \) of non-empty sets. \[ X = \{ X_i \, \, | \, \, \forall i \in I, X_i \neq \varnothing \}. \]
The Axiom of Choice
Now we can formally state the axiom of choice.
Cartesian Product of Sets
Suppose we have two sets \( X_1 \) and \( X_2 \). We can take their Cartesian product,
The Axiom of Choice Again
With that, we can state the equivalent version of the axiom of choice using cartesian products.
This is essentially a way to "choose" an element from each \( X_i \). In fact, this version of the axiom can be shown to be equivalent to the previous one.
The Crux of the Problem
Axiomatic set theory allows us to have nonempty sets that contain elements. Given one of these nonempty set, \( X \), we can always "choose" an element from it with these steps:
- The definition of a nonempty set means that \( \exists x \) such that \( x \in X \).
- The rule of existential instantiation allows us to refer to an element \( y \) satisfying \( y \in X \), without actually knowing what \( y \) is or how we got it.
- A choice function defined on \( X \) is then given by \( f(X) = y \).
We can attempt to apply the same reasoning to choose an element from \( X_1 \). Then repeat to choose an element from \( X_2 \) and so on. We can keep repeating until we reach some set \( X_n \). This allows us to explicitly construct an element of \( X_1 \times X_2 \times ... \times X_n \), and hence demonstrate that it is nonempty.
But here lies the crux of the problem: while we can do this for any finite \( n \), there is no way to do this for all of the infinite number of sets in \( X \). This is the crucial reason why the axiom of choice has to be an axiom.
On the other hand, assuming that the axiom of choice is true leads to counterintuitive consequences, such as the infamous Banach-Tarski paradox, which allows us to split an object into two copies that are identical to the original object.
Application in Linear Algebra
This section is extremely complex, and is not required to understand the axiom of choice. Please feel free to skip it. Before we end this article, I would like to highlight an interesting application of the axiom of choice in linear algebra: it can be used to prove that every vector space has a basis.
The key part of the proof applies Zorn's Lemma, which is equivalent to the Axiom of Choice. A very brief summary of the proof goes like this.
- Let \( X \) be the set of all linearly independent subsets of a vector space \( V \).
- \( X \) contains the empty set, and so is nonempty. It is partially ordered by inclusion \( \subseteq \).
- Every chain (totally ordered subsets) in \( X \) has an upper bound (union of all elements in the chain).
- Every finite subset of this union has to be linearly independent. Hence, the upper bound is in \( X \).
- Apply Zorn's Lemma to get a maximal element \( L \in X \).
- \( L \) must span \( X \). If not, there will be a \( y \in X \) so that \( L \subset L \cup \{ y \} \).