Mathematics

More music and other things!

How much can I torture my dear readers? Well, seeing as there are four and a half of them, I will not be increasing the overall pain in the world by much! Today’s post is intended to serve only one purpose – to record for myself some things that I want to remember! And so, dear reader, if you’re not down with it, we’ve got two words for you! – actually three – bear with me!

First up, probably the best concert I attended this season – my sixth Vijay Siva! Marvelous it was! With R K Sriramkumar on the violin, Tiruchi Sankaran Sir on the mridangam, and B S Purushothaman on the kanjira. He seems to be at the peak of his musical form! The concert was total sowkhyam throughout, everything kept simple and pure, with no gimmickry at all! And what choice of ragas! Hard to see so many weighty ragas packed into one concert!

  1. vallabhanāyakasya – begaḍa.
  2. saṅgītaśāstrajñānamu – mukhāri. Nice niraval, swaram.
  3. ambikāyām abhayāmbikāyām – kedāram. Short alapana, and niraval in the kriti.
  4. mīnalocana brova – dhanyāśi. I thought this was the best piece of the concert. Beautiful alapana, but the hightlight was the niraval, swaram in kāmapālinī.
  5. jānakīpate – kharaharapriya. Imagine singing kharaharapriya for a filler between two main pieces!
  6. śrisubrahmaṇyāya namaste – kāmbhoji. Majestic! The main piece, with an ālāpanā that is still fresh in memory, elaorate niraval and swarams.
  7. tani āvartanam – tisra ekam. I don’t know enough to say anything knowledgeable about Tiruchi Sankaran Sir’s playing except that it was scintillating, with B S Purushothaman ably matching him.
  8. rāgam tānam pallavi – ṣaṇmukhapriya – tisra tripuṭa (2). Not much time could be spent on the ālāpanā, but the pallavi had all the standard elements, singing in four tempos, etc.
  9. ini enna pechu – sahānā
  10. viruttam – yamunākalyāṇi, maṇiraṅgu, jañjūṭi
  11. tiruppugazh – jañjūṭi
  12. harivāsarada – sindhubhairavi
  13. maṅgalam

The other thing I want to record is something I fought hard to figure out more than once, and promptly forgot every time! After the latest episode (three or four days back) I decided to record it somewhere. (And a nice place I have chosen! Let me see if I find this the next time I am stuck on the same problem!) It has to do with a Lemma in Kenneth Kunen’s Set Theory that is dismissed with a one line proof! The claim has to do with relativizations. (The relativization of a formula \varphi with respect to another formula M(x) is got by replacing all quantifiers \forall{x}\alpha in \varphi by \forall{x}(M(x) \supset \alpha) and by replacing all quantifiers \exists{x}\alpha in \varphi by \exists{x}(M(x) \wedge \alpha). )

The claim is that if M(x) is a formula with one free variable, and \varphi_1, \ldots, \varphi_n, \psi are sentences such that \varphi_1, \ldots, \varphi_n \vdash \psi (this notation stands for: there is a derivation (using the axioms and rules for first-order logic) of \psi from the \varphi_i as assumptions), then \exists{x}M(x), \varphi^M_1, \ldots, \varphi^M_n \vdash \psi^M (here \varphi^M_1 etc. means the relativization of the formulas with respect to M(x)).

I was trying to transform a derivation of the former to a derivation of the latter. It didn’t work! Then I realised that one has to work with derivations that consist only of sentences. Whenever the assumptions and conclusion are sentences, one can manage things so that only sentences are used in the derivation, no matter what sound and complete axiom system for first order logic one uses. Then it’s a routine matter to check that the relativization of each axiom holds and that the relativized versions of the rules are admissible. But … even after many attempts I couldn’t pin down exactly where the extra assumption \exists{x}M(x) was being used in the second derivation. Finally I did find it, and that is what I want to record here!

The troublesome axiom is this: \forall{x}\alpha(x) \supset \alpha[x:=y]. But since our original proof consists only of sentences, the axiom that actually gets used is a generalization, \forall{y}(\forall{x}\alpha(x) \supset \alpha[x:=y]). Now the relativized version of this is \forall{y}(M(y) \supset ((\forall{x}(M(x) \supset \alpha(x)) \supset \alpha[x:=y])). One can check easily that this is a validity, and hence provable (or construct a derivation directly!).

So where is the extra non-emptiness assumption \exists{x}M(x) used? Precisely in proving the relativization of the above axiom, once we realize our error! The point is that x might not occur free in \alpha, and so \alpha[x:=y] might just be \alpha. In this case, our original derivation might have just used the axiom \forall{x}\alpha \supset \alpha. The relativized version is \forall{x}(M(x)  \supset \alpha) \supset \alpha. And this is not necessarily provable, unless we also assume “nonemptiness” of M! The antecedent says that for every x, either \neg{}M(x) holds or \alpha holds. Now if it is the case that for every x, \neg{}M(x) holds, then it is not necessary for \alpha to hold. Precisely this is ruled out by the assumption \exists{x}M(x).

Now I can go back to doing some actual work, after having disposed of this irritating gap in my understanding!

Update In my focus on the Kunen lemma, I forgot to mention another highlight of the concert. Padma and her father were there for this, too! And it was great company!

Axiom of Choice and Zorn’s Lemma

The Axiom of Choice (AC) asserts that for all sets {\mathcal{A}} of nonempty sets, there is a choice function {f:\mathcal{A} \rightarrow \bigcup\mathcal{A}} such that for all {B \in \mathcal{A}}, {f(B) \in B}.

Zorn’s lemma says that for every partially ordered set {(A, \leq)} such that every chain in {A} has an upper bound in {A}, there is a maximal element of {A}.

The primary aim is to prove that the Axiom of Choice implies Zorn’s Lemma. For this we use the notion of wellordered sets, which are just linearly ordered sets {A} such that every nonempty subset of {A} has a least element.

To gain some intuition about the proof, let us instead try to define a wellordering of a given set, assuming the Axiom of Choice. For a given set {A}, consider a choice function for the set of all nonempty subsets of {A}. Then one could try to define a wellordering of {A} as follows: pick {x_{0}=f(A)} as the first element, {x_{1} = f(A\setminus \{x_{0}\})} as the second, {x_{2} = f(A\setminus \{x_{0},x_{1}\})} as the third, and so on. But all the {x_{i}}‘s might not exhaust all of {A}, so one might have to extend this process of building a wellorder by choosing the element following all the {x_{i}}‘s to be {x_{\omega} = f(A \setminus \{x_{0}, x_{1}, \ldots\})}. Then we continue with {x_{\omega+1}, x_{\omega+2}}, and so on. But even this might not exhaust all of {A} and we have to “jump to another limit point” and so on, to form the desired wellorder by a transfinite procedure. But we do not wish to build the machinery (ordinals) necessary to describe such a procedure. And indeed we do not need to, since the problem we are after is slightly different—that of proving Zorn’s Lemma assuming the Axiom of Choice!

Let us keep the above procedure in mind in trying to prove Zorn’s lemma. We are given a partially ordered set {(A,\leq)} each of whose chains has an upper bound. And we suppose, towards a contradiction, that there is no maximal element, and use this to show that there is indeed a chain of {A} with no upper bound in {A}. The chain will be built essentially following the intuition in the above paragraph.

To begin with, notice that every chain {C} in {A} has some upper bound ({x}, say). Now {x} is not a maximal element of {A}, and hence there is some {y \in A} such that {y > x}. It follows that {y > z} for every {z \in C}. The Axiom of Choice allows us to pick one such {y}{u(C)}, say — for each chain {C\subseteq A}.

Now we can try to build our desired chain as follows: let {x_{0} = u(\emptyset)} (this is the same as {f(A)}!), {x_{1} = u(\{x_{0}\})}, {x_{2} = u(\{x_{1}, x_{2}\})}, and so on. The {x_{i}}‘s form a chain {C}, but there is an element {u(C)} that is strictly greater than all the {x_{i}}‘s. So we add {x_{\omega} = u(C)} to the end of {C} to get a larger chain. We can repeat this procedure “forever” (even taking into account “jumps to the next limit point”) to get a chain without an upper bound. This is the idea, but let’s do it a little more formally.

Notice that any chain {C \subseteq A} is an initial segment of this chain that we are hoping to build if and only if {C} is well-ordered by {<} and it has the following property:

\displaystyle  \forall x \in C: x = u(\{y \in C \mid y < x\}). \ \ \ \ \ (1)

Consider the collection {\mathcal{C}} of all chains {C} that are well-ordered by {<} and that satisfy condition (1). (Intuitively, {\mathcal{C}} is just the collection of initial segments of “our big chain”, but we will have to get at it in a slightly roundabout manner.)

Proposition 1 For distinct {C}, {D} in {\mathcal{C}}, either {C} is an initial segment of {D} or {D} is an initial segment of {C}.

Proof: Let us first prove that either {C \subseteq D} or {D \subseteq C}. Suppose neither holds. Then consider the least {x} in {C \setminus D}, and the least {y \in D \setminus C}. Thus {C' = \{z \in C \mid z < x\} \subseteq D} and {D' = \{z \in D \mid z < y\} \subseteq C}. Notice that {x = u(C')} and {y = u(D')}. Now, since {D} is a total order, either {y} is greater than every element of {C'}, or it is less than or equal to some element of {C'} and thus less than {x}.

Suppose the former. We claim that {C'} is an initial segment of {D}, for if not, there exist {d < d'} in {D} such that {d' \in C'} and {d \not\in C'}. But this is a contradiction, since {d < d' < y} and {y} was supposed to be the least element of {D} not in {C}, so {d \in C'} after all! But now we have {\{z \in D \mid z < x\} = C'}, and {D} has an element {y} greater than everything in {C'}, and thus the least such, {x = u(C')}, belongs to {D}, a contradiction!

Suppose {y < x}. Then by arguing just as above, we can show that {y} belongs to {C}, a contradiction!

Thus either {C \subseteq D} or {D \subseteq C}. Suppose, without loss of generality, that {C \subseteq D}. Since {C} and {D} are distinct, there exists a least {y \in D} that is not in {C}. We claim that {C} is an initial segment of {D}. Suppose not. Then there is a least {x \in D} such that {x < y} and {x \not\in C}. But this means that for all {z < x}, {z \in C} if and only if {z \in D}. But then {x = u(\{z \in D \mid z < x\}) = u(\{z \in C \mid z < x\}) \in C}, a contradiction. Thus {C} is an initial segment of {D}. \Box

Proposition 2 If {C} is an element of {\mathcal{C}}, {C \cup \{u(C)\}} is also an element of {\mathcal{C}}.

Proof: Of course {u(C)} will be the largest element in {C \cup \{u(C)\}}, and it is easy to see that {<} is a well-ordering of this set, and that (1) is satisfied by {C \cup \{u(C)\}}. \Box

Proposition 3 {\bigcup\mathcal{C}} is an element of {\mathcal{C}}.

Proof: Let {D = \bigcup\mathcal{C} = \{x \mid \exists C \in \mathcal{C}} such that {x \in C\}}.

We first show that {<} well-orders {D}. Suppose there is an infinite descending chain {x_{0} > x_{1} > \cdots} in {D}. Let {C_{i} \in \mathcal{C}} such that {x_{i} \in C_{i}}, for each {i}. Now it is easy to prove that for all {i}, {x_{i} \in C_{0}}. It is trivially true for {i = 0}. Suppose {x_{j} \in C_{0}} and consider {x_{j+1} \in C_{j+1}}. Now if {C_{0}} is an initial segment of {C_{j+1}}, it follows that {x_{j+1} \in C_{0}} (from {x_{j} \in C_{0}}, {x_{j} > x_{j+1}} and {x_{j+1} \in C_{j+1}}). If {C_{j+1}} is an initial segment of {C_{0}} then {C_{j+1} \subseteq C_{0}} and hence {x_{j+1} \in C_{0}}. But {C_{0} \in \mathcal{C}} and hence {<} is a well-ordering of it, so there cannot be an infinite descending sequence in {C_{0}}, and the same holds for {D}.

We now show that every {C \in \mathcal{C}} is an initial segment of {D}. Clearly {C} is a subset of {D}. We need to show that for every {x,y \in D} such that {x \in C} and {y < x}, {y \in C}. Let {C'} be some element of {\mathcal{C}} such that {y \in C'}. Now either {C'} is an initial segment of {C}, in which case {y \in C} (since {C' \subseteq C}), or {C} is an initial segment of {C'}, in which case also {y \in C} (since {x \in C \cap C'} and {y \in C'} such that {y < x}).

Now suppose that {x \in D}. Let {C \in \mathcal{C}} such that {x \in C}. As we have seen many times now, {\{z \in C \mid z < x\} = \{z \in D \mid z < x\}} and since {C} satisfies (1), it follows that {x = u(\{z \in D \mid z < x\})}.

Thus {\bigcup\mathcal{C} \in \mathcal{C}}. \Box

Now we are almost done. We claim that {D} above has to lack an upper bound in {A}, contrary to the assumption that every chain has an upper bound. Because if {D} did have an upper bound, then {u(D)} would be an element greater than everything in {D}. But then {D \cup \{u(D)\}} would be a chain in {\mathcal{C}}. But this leads to an immediate contradiction, because this means that {u(D) \in \bigcup\mathcal{C} = D}, but {u(D)} was supposed to be greater than every element of {D}!

The upshot of all this is that for every partially ordered set {A}, if every chain {C \subseteq A} has an upper bound in {A}, then {A} has a maximal element. Thus the Axiom of Choice implies Zorn’s Lemma.


Update: Many thanks to Shreevatsa, as usual, for pointing out a bug in Proposition 1, which has now hopefully been fixed!

A math problem …

Yesterday, I had a chance to sit through a problem solving session where Abhishek Dang and Ashwin Deopurkar (final year BScM students at CMI) solved some 2009 IMO problems for the benefit of some school students. According to them, the problems were “actually very easy”. In fact, I was surprised to find that I was able to understand the solutions quite well, but that’s more a testament to their impressive skill in exposition than anything else.

Anyway, here is a nice problem. I am writing this (without permission from either of them!) so that I can refer back to it when I want to, and since otherwise I will forget it completely! It will stay here as long as there is a take-down notice from Abhishek! :-) At which point, Shreevatsa will teach me how to make this post private, I am sure!


Determine all functions f from \mathbb{N}^+ to \mathbb{N}^+ such that for all a and b, the numbers a, f(b) and f(b+f(a)-1) form the three sides of a non-degenerate triangle.

Here is the solution as described by Abhishek.

Take a = 1 and let f(1) = r. Then we know that for all b: 1, f(b), and f(b+r-1) form the sides of a non-degenerate triangle. In particular, this means that f(b) + 1 > f(b+r-1), i.e. f(b) \geq f(b+r-1) and f(b+r-1)+1 > f(b), i.e. f(b+r-1) \geq f(b). Thus \forall b: f(b) = f(b+r-1). But notice that this says that either r = 1 or f is a periodic function. But if f is a periodic function, it is bounded. But then we can choose a large enough (say more than twice the largest value that f can take) so that there is no b for which a, f(b), f(b+f(a) - 1) form a triangle. Thus, it has to be the case that r = 1.

Now we have determined that f(1) = 1. Let’s play around a little more to gain more information about f. We could try to set b = 1 and see what we can learn. Well, in that case f(b) = 1 and thus we conclude that for all a: 1, a, f(f(a)) form the sides of a nondegenerate triangle. Arguing as before, we see that for all a: a = f(f(a)). But this easily implies that f is a bijection from the positive integers to itself.

What more? Try a = 2. Let f(2) = r. We know that r > 1 since f(1) = 1 and f is a bijection. Let r-1 = d. Now we see that for all b: 2, f(b), f(b+d) form the sides of a nondegenerate triangle. But this means that for all b: f(b) - 1 \leq f(b+d) \leq f(b)+1. In particular, since f is a bijection (and d \neq 0), we know that for all b, either f(b+d) = f(b)-1 or f(b+d) = f(b)+1.

Now consider the value of f on 1, 1+d, 1+2d, \ldots. It is an easy proof by induction (using the fact that f is injective) that f(1+(i-1)d) = i. But this would mean that f is a bijection between \{1+(i-1)d \mid i \in \mathbb{N}^+\} and \mathbb{N}^+, and this can happen only if d = 1. And it follows that f is the identity!

Nice, wasn’t it?

An odd party!

Nice puzzle that Neeldhara posed to me, and walked me through the solution of! Let me record it, for your enjoyment!

You are at a party where any two people have an odd number of mutual friends at the party. You see an even number of people other than you. But still …

pratyakṣam parikalitamapyartham anumānena bubhutsante tarkarasikāḥ

The lovers of logic wish to establish through inference the same objects revealed by direct perception!

So you wish to deduce that there are an odd number of attendees—the number of odd attendees, on the other hand, while definitely at least one, cannot be ruled out from being even!

The rest of the post shows one way you might deduce this. One piece of bureaucracy first: for an arbitrary party and an arbitrary attendee, by friend we mean a friend who is also an attendee.

The basic tool is the handshake lemma, which says that in any party, the number of attendees with an odd number of friends (among the attendees) is even. An immediate consequence is that if in a party, every attendee has an odd number of friends, then the number of attendees is even.

Now Devadatta is one of the attendees, and his friends want to gift him something (today being his birthday). But they haven’t decided on any yet. So they all repair to a corner to discuss this (out of his eyesight and earshot). If you consider this sub-party, how many friends does each person have? Why, an odd number, of course! (Since each friend (and non-friend!) of Devadatta shares an odd number of friends with him.) But now in this sub-party, every one has an odd number of friends, and so its strength is even (by the handshake lemma).

But the same argument holds good for Viṣṇumitra and Yajñadatta too, and hence everyone has an even number of friends. Now let us say that Devadatta’s friends have decided that there can be no better birthday gift for him than masala poli at Sri Krishna Sweets, and so drag him off (mid-party) to SKS. So an odd number of people have left the place (Devadatta and his even number of friends).

What about the rest? They each had an even number of friends in the party, and an odd number (the mutual friends they share with Devadatta) have now left. So we have a (reduced) party where everybody has an odd number of friends, and thus the strength is even (handshake lemma again!). So there, you see, the number of people originally in the party is odd!

Worse than a stupid joke …

Worse than a stupid joke is a stupid joke explained in detail. But anyway, what else did you expect from me? Here goes …

A set A is well-founded if it is either empty or \exists b \in A: A \cap b = \emptyset. A (probably more intuitive) characterization is that there does not exist an infinite sequence A_0, A_1, \ldots, not necessarily distinct, such that A_0 = A, and \forall i: A_{i+1} \in A_i. So a simple example of a non-wellfounded set is a set A which is equal to \{A\} (whatever such a set might be!). In fact, non-wellfounded objects are quite bizarre, and we do not have a need for them anyway, so one of the axioms of Zermelo-Frankael set theory is that every set is wellfounded. This is the axiom of regularity.

Anyway, another example of a non-wellfounded set is A = \{b,c\} such that A \in b and A \in c. Now you see what I meant in my last post? Shri Tiruchi Sankaran was in his elements!!