Axiom of Choice and Zorn’s Lemma

The Axiom of Choice (AC) asserts that for all sets {\mathcal{A}} of nonempty sets, there is a choice function {f:\mathcal{A} \rightarrow \bigcup\mathcal{A}} such that for all {B \in \mathcal{A}}, {f(B) \in B}.

Zorn’s lemma says that for every partially ordered set {(A, \leq)} such that every chain in {A} has an upper bound in {A}, there is a maximal element of {A}.

The primary aim is to prove that the Axiom of Choice implies Zorn’s Lemma. For this we use the notion of wellordered sets, which are just linearly ordered sets {A} such that every nonempty subset of {A} has a least element.

To gain some intuition about the proof, let us instead try to define a wellordering of a given set, assuming the Axiom of Choice. For a given set {A}, consider a choice function for the set of all nonempty subsets of {A}. Then one could try to define a wellordering of {A} as follows: pick {x_{0}=f(A)} as the first element, {x_{1} = f(A\setminus \{x_{0}\})} as the second, {x_{2} = f(A\setminus \{x_{0},x_{1}\})} as the third, and so on. But all the {x_{i}}‘s might not exhaust all of {A}, so one might have to extend this process of building a wellorder by choosing the element following all the {x_{i}}‘s to be {x_{\omega} = f(A \setminus \{x_{0}, x_{1}, \ldots\})}. Then we continue with {x_{\omega+1}, x_{\omega+2}}, and so on. But even this might not exhaust all of {A} and we have to “jump to another limit point” and so on, to form the desired wellorder by a transfinite procedure. But we do not wish to build the machinery (ordinals) necessary to describe such a procedure. And indeed we do not need to, since the problem we are after is slightly different—that of proving Zorn’s Lemma assuming the Axiom of Choice!

Let us keep the above procedure in mind in trying to prove Zorn’s lemma. We are given a partially ordered set {(A,\leq)} each of whose chains has an upper bound. And we suppose, towards a contradiction, that there is no maximal element, and use this to show that there is indeed a chain of {A} with no upper bound in {A}. The chain will be built essentially following the intuition in the above paragraph.

To begin with, notice that every chain {C} in {A} has some upper bound ({x}, say). Now {x} is not a maximal element of {A}, and hence there is some {y \in A} such that {y > x}. It follows that {y > z} for every {z \in C}. The Axiom of Choice allows us to pick one such {y}{u(C)}, say — for each chain {C\subseteq A}.

Now we can try to build our desired chain as follows: let {x_{0} = u(\emptyset)} (this is the same as {f(A)}!), {x_{1} = u(\{x_{0}\})}, {x_{2} = u(\{x_{1}, x_{2}\})}, and so on. The {x_{i}}‘s form a chain {C}, but there is an element {u(C)} that is strictly greater than all the {x_{i}}‘s. So we add {x_{\omega} = u(C)} to the end of {C} to get a larger chain. We can repeat this procedure “forever” (even taking into account “jumps to the next limit point”) to get a chain without an upper bound. This is the idea, but let’s do it a little more formally.

Notice that any chain {C \subseteq A} is an initial segment of this chain that we are hoping to build if and only if {C} is well-ordered by {<} and it has the following property:

\displaystyle  \forall x \in C: x = u(\{y \in C \mid y < x\}). \ \ \ \ \ (1)

Consider the collection {\mathcal{C}} of all chains {C} that are well-ordered by {<} and that satisfy condition (1). (Intuitively, {\mathcal{C}} is just the collection of initial segments of “our big chain”, but we will have to get at it in a slightly roundabout manner.)

Proposition 1 For distinct {C}, {D} in {\mathcal{C}}, either {C} is an initial segment of {D} or {D} is an initial segment of {C}.

Proof: Let us first prove that either {C \subseteq D} or {D \subseteq C}. Suppose neither holds. Then consider the least {x} in {C \setminus D}, and the least {y \in D \setminus C}. Thus {C' = \{z \in C \mid z < x\} \subseteq D} and {D' = \{z \in D \mid z < y\} \subseteq C}. Notice that {x = u(C')} and {y = u(D')}. Now, since {D} is a total order, either {y} is greater than every element of {C'}, or it is less than or equal to some element of {C'} and thus less than {x}.

Suppose the former. We claim that {C'} is an initial segment of {D}, for if not, there exist {d < d'} in {D} such that {d' \in C'} and {d \not\in C'}. But this is a contradiction, since {d < d' < y} and {y} was supposed to be the least element of {D} not in {C}, so {d \in C'} after all! But now we have {\{z \in D \mid z < x\} = C'}, and {D} has an element {y} greater than everything in {C'}, and thus the least such, {x = u(C')}, belongs to {D}, a contradiction!

Suppose {y < x}. Then by arguing just as above, we can show that {y} belongs to {C}, a contradiction!

Thus either {C \subseteq D} or {D \subseteq C}. Suppose, without loss of generality, that {C \subseteq D}. Since {C} and {D} are distinct, there exists a least {y \in D} that is not in {C}. We claim that {C} is an initial segment of {D}. Suppose not. Then there is a least {x \in D} such that {x < y} and {x \not\in C}. But this means that for all {z < x}, {z \in C} if and only if {z \in D}. But then {x = u(\{z \in D \mid z < x\}) = u(\{z \in C \mid z < x\}) \in C}, a contradiction. Thus {C} is an initial segment of {D}. \Box

Proposition 2 If {C} is an element of {\mathcal{C}}, {C \cup \{u(C)\}} is also an element of {\mathcal{C}}.

Proof: Of course {u(C)} will be the largest element in {C \cup \{u(C)\}}, and it is easy to see that {<} is a well-ordering of this set, and that (1) is satisfied by {C \cup \{u(C)\}}. \Box

Proposition 3 {\bigcup\mathcal{C}} is an element of {\mathcal{C}}.

Proof: Let {D = \bigcup\mathcal{C} = \{x \mid \exists C \in \mathcal{C}} such that {x \in C\}}.

We first show that {<} well-orders {D}. Suppose there is an infinite descending chain {x_{0} > x_{1} > \cdots} in {D}. Let {C_{i} \in \mathcal{C}} such that {x_{i} \in C_{i}}, for each {i}. Now it is easy to prove that for all {i}, {x_{i} \in C_{0}}. It is trivially true for {i = 0}. Suppose {x_{j} \in C_{0}} and consider {x_{j+1} \in C_{j+1}}. Now if {C_{0}} is an initial segment of {C_{j+1}}, it follows that {x_{j+1} \in C_{0}} (from {x_{j} \in C_{0}}, {x_{j} > x_{j+1}} and {x_{j+1} \in C_{j+1}}). If {C_{j+1}} is an initial segment of {C_{0}} then {C_{j+1} \subseteq C_{0}} and hence {x_{j+1} \in C_{0}}. But {C_{0} \in \mathcal{C}} and hence {<} is a well-ordering of it, so there cannot be an infinite descending sequence in {C_{0}}, and the same holds for {D}.

We now show that every {C \in \mathcal{C}} is an initial segment of {D}. Clearly {C} is a subset of {D}. We need to show that for every {x,y \in D} such that {x \in C} and {y < x}, {y \in C}. Let {C'} be some element of {\mathcal{C}} such that {y \in C'}. Now either {C'} is an initial segment of {C}, in which case {y \in C} (since {C' \subseteq C}), or {C} is an initial segment of {C'}, in which case also {y \in C} (since {x \in C \cap C'} and {y \in C'} such that {y < x}).

Now suppose that {x \in D}. Let {C \in \mathcal{C}} such that {x \in C}. As we have seen many times now, {\{z \in C \mid z < x\} = \{z \in D \mid z < x\}} and since {C} satisfies (1), it follows that {x = u(\{z \in D \mid z < x\})}.

Thus {\bigcup\mathcal{C} \in \mathcal{C}}. \Box

Now we are almost done. We claim that {D} above has to lack an upper bound in {A}, contrary to the assumption that every chain has an upper bound. Because if {D} did have an upper bound, then {u(D)} would be an element greater than everything in {D}. But then {D \cup \{u(D)\}} would be a chain in {\mathcal{C}}. But this leads to an immediate contradiction, because this means that {u(D) \in \bigcup\mathcal{C} = D}, but {u(D)} was supposed to be greater than every element of {D}!

The upshot of all this is that for every partially ordered set {A}, if every chain {C \subseteq A} has an upper bound in {A}, then {A} has a maximal element. Thus the Axiom of Choice implies Zorn’s Lemma.

Update: Many thanks to Shreevatsa, as usual, for pointing out a bug in Proposition 1, which has now hopefully been fixed!



  1. Thanks, read this finally!

    Some questions, probably stupid:
    * pick {x_{0}=f(A)} as the first element, {x_{1} = f(A\setminus \{x_{0}\})} as the second…
    Isn’t it a different ‘f’ that you use the second time? The original ‘f’ is only defined on the original set of sets.

    * As there can be several functions u, is the property (1) well-defined? (I guess you first use AC to get a specific upper bound for all possible chains, then use AC again to fix a specific function on all sets G_x = { y : y > x} — is this right?)

    * Suppose, without loss of generality, that {x < y}.
    Why should x and y be comparable?

    1. Thanks Shreevatsa for reading it! You are proving to be too faithful a reader of the blog – I’m honored!

      Not silly questions at all! Imprecision on my part – to aid readability, as they claim! First, about u: the assumptions tell you that for every chain C, the set \{y \in A \mid (\forall x \in C) y > x\} is nonempty. Now use AC once to pick a member of each of these sets, or in other words, for each chain C an element u(C) that is greater than everything in C.

      About picking different elements x_0, x_1, \ldots, no, it is not different f‘s! Fix some choice function f, and then do all this. The following question makes sense, though. What is a wellordering of A but a function from the class of all ordinals to the A? So we are building a function, g say, by transfinite recursion — g(\alpha) = f(A\setminus\{g(\xi) \mid \xi < \alpha\}). How do we know that such definitions by recursion are justified? They are, in fact! But setting up the machinery for that is not necessary to prove this theorem.

      Coming to your last question, I agree I have some major fixes to do! This is yet another of those numerous occasions when I was fooled by a picture I had in mind! Expect an update to the post soon!

      Thanks for the patient read and for pointing out the slip-ups!

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s