WARNING: the following contains a whole lot of pedantry about proving theorems at a level of detail such that you could likely convince a computer proof assistant of their correctness. I teach a course where students learn to write such proofs without computer assistance, because doing so can profoundly affect how one perceives the possibilities for formal definition and proof. 1 Sometimes the default reasoning principles provided by a proof assistant can restrict one’s perspective of what reasoning principles are acceptable, how and why they might be judged acceptable, and how to devise and validate new ones.
More relevant to the present post is that even the literature and pedagogy of writing paper proofs is rife with defaults. I spend a lot of time teaching about proof by induction, and I find that my students, most of whom have learned something about proving stuff in the past, often arrive constrained by a very regimented approach to induction, involving mandatory “base cases” and “induction cases” and these nebulous “induction hypotheses” which seem to descend from the sky when invoked by arcane ceremony. I have some strong feelings about this, mostly due to my own struggles with gaining comfort with induction as a student, so I try to help students evade those struggles.
Taken at face value, this post is about some other proof principle, proof by infinite descent, but at its heart it’s really about induction. Ideally by the end of this post, I will have challenged, but also shed light on, the nature of inductive proof cases and induction hypotheses. As a bonus, you might see how induction can be used to explain another proof principle that intuitively seems valid, but for which it may not be obvious how to justify it formally.
Two Proofs That is Not Rational
In an excellent blog
post about the difference between proof of negation and proof by
contradiction, Andrej Bauer gives as an example a proof that
Theorem:
is not rational.
Proof. Supposewere equal to a fraction with and relatively prime. Then we would get , hence is even and so is . Write and plug it back in to get , from which we conclude that is even as well. This is a contradiction since and were assumed to be relatively prime. QED.
This is a proof of the negation of some proposition, i.e., that assuming the truth of the proposition implies a contradiction. Let’s consider some of its details.
First, the proof assumes that
Next, the proof uses the fact that every rational number can be
represented by a fraction
From these assumptions and facts, the proof deduces that both
Now to drive the topic of this post, here’s a more annoying proof of the same fact:
Proof. Suppose
were equal to a fraction . Then can’t be because . Then we would get , hence is even and so is . Write and plug it back in to get , from which we conclude that is even as well. Write and plug it back in to get . This yields a contradiction because we could divide our integers by forever, which is impossible. QED.
On the upside, this proof gets to omit the assumption that
While Andrej’s proof is a direct proof of negation, this annoying one is an example of proof by infinite descent, which according to Wikipedia is a particular kind of proof by contradiction, and relies on something called the the well-ordering principle. By the end of this post, I hope to convince you that: a) it is not a kind of proof by contradiction: rather it’s a technique for proving negation, so this claim deserves some finger-wagging from Andrej. b) it does not rely on the well-ordering principle, and good thing that because if it did, then the annoying proof would be broken, but it’s not.
We should be able to ascertain from a formal statement of the principle that it is simply a proof of negation. Andrej’s post elaborates on the source of confusion. And what is the problem with the claim that proof by infinite descent relies on well-ordering, whatever that is? To explain, let’s first pinpoint a more fundamental and general property, called well-foundedness, and reasoning principle, well-founded induction, that the principle of infinite descent can be based upon and even generalized, and use that to explain the confusion. Along the way, we’ll produce a different explanation of proof by infinite descent in terms of induction: an explanation that suits my induction-happy brain better. Also, we’ll get the opportunity to investigate induction itself: in particular, I want to demonstrate that the common “base case/induction case” recipe for induction is a useful but limiting view of how induction can be applied.
Well-foundedness and Well-ordering
As you might guess from the above two proofs, they depend on some
notion of “getting smaller”, and in the case of natural numbers the
default notion is the less-than relation
Since “well-ordering” and “well-founded” both start with “well-”, you
might suspect that there’s a relationship between them, and you’d be
right. Well-orderedness is described a variety of ways in the
literature, but for our purposes the following is the most relevant
formulation: a well-founded relation
To tie up a loose terminological end, consider the “-ordering” and
“-founded” suffixes. Well-orderings must be transitive: if
Proof by Infinite Descent, Generalized
Now that we have some grasp of well-foundedness, and why well-orderedness is a red herring, Let’s use it to state the Principle of Proof by Infinite Descent precisely and generally.
Theorem: Principle of Infinite Descent
Letbe a set and a well-founded relation on . Furthermore let be some property of elements of . Then if for every in , implies for some , it follows that does not hold for any in . Stated in formal logical notation:
The premise of the principle formalizes the idea that “no matter where you are (which x), you can still make a ’descending’ step.”, and the conclusion is that “well, if so then your property can’t apply to any x: it leads to contradiction.” Arguably a better name for the principle would be Proof by No Infinite Descent (thanks to Andrej for this observation!).
Our proof that
The intuition for the principle is pretty straightforward: if the
premises hold, then in principle we can build an infinite descending
But let’s buttress this intuition with rigour. We can prove this principle as a consequence of:
The Principle of Well-founded Induction: Let
be a set and a well-founded relation on . Then let be some property of elements of . Then if for every in , holding for every suffices to prove , it follows that holds for every in . Stated in formal logical notation:
This principle is quite abstract, since it’s stated in terms of a
general principle. To clarify, let’s instantiate it with particular
well-founded relations that lead to principles with which we are
familiar. We already observed that
The general structure of a proof by course-of-values induction is to
assume some natural number
Though students have often seen strong mathematical induction, they
are more familiar with plain ole’ mathematical induction, where the
“induction hypothesis” is to assume that a property holds for a natural
number one less than the current one. We can get this by using the
well-founded relation
Advertisement: I first learned about this perspective on induction from Glynn Winskel’s excellent textbook, The Formal Semantics of Programming Languages. I highly recommend it.
What even is a “base case”?
A common structure used in proofs that proceed by either mathematical
induction or course-of-values induction, or at least in teaching such
proofs, is to assume
This template for applying the proof by course-of-values induction is so common that often the split into base case vs. inductive case, where the base case does not appeal to induction, gets baked right into the explanation of the proof principle, as if it were part of the approach. I find this categorical specialization to be unfortunate for two reasons.
The first reason is that sometimes a proof needs a case split, but this particular case split is the wrong one: in that it does not help push the proof through. My favourite demonstration of this phenomenon is a proof that any two positive natural numbers have a greatest common divisor. This proof could be extended to integers, and thus apply to Andrej’s proof as discussed above, but doing so would deviate more from my main point than I would like. The proof involves a helper lemma followed by a main theorem.
Lemma. For all positive natural numbers
and for all positive natural numbers , if then there exists a positive natural number such that for some positive natural and for some positive natural , and furthermore if is a positive natural number that satisfies these properties, then .
This lemma can be proven by course-of-values induction. Then the
statement of the main theorem has the same structure, but does not
assume
I’m only going to give a sketch of the lemma’s proof, since it
suffices to make the relevant point. It proceeds by course-of-values
induction on
The interesting part of the above proof is that two cases are
considered, and only one of them appeals to the induction hypothesis.
But the case that does not is not an
For exactness sake, it’s worth pointing out that one can perfectly
well devise a principle of course-of-values induction that applies to
just the positive integers, such that
So the lesson to be drawn from my first gripe is that when performing induction, if we reconceive of “base cases” to mean those cases that do not appeal to an induction hypothesis, then such cases need not coincide with the cases that involve the minimal elements of the well-founded relation (or the unique minimum element of a well-ordered relation). For many proofs the default heuristic helps, but that default is not a universal law of induction.
No base case? no problem!
To reiterate, my first problem with how course-of-values induction is taught is that the default base-case/inductive-case split may not be the most fruitful one. The second problem I have is that you may not even need such a split. Usually it is assumed that there must be some “base case”, that does not get to exploit the induction hypothesis, for how could it ever apply? But in some proofs you can always exploit the induction hypothesis. This is exactly the case in the following proof of the Principle of Infinite Descent that uses the Principle of Well-founded Induction.
Theorem: Principle of Infinite Descent
Proof. Suppose
This proof of proof by infinite descent used course of values
induction, but did not need to consider distinct cases related to
Part way back around, without proof by contradiction
The following is a bonus: thanks to Andrej who made the observation that inspired it.
We’ve proven that if
Ok on to the proof. First, let’s get precise about descending chains.
Define a descending chain of a binary relation
Now, we can prove that if
Proof. Suppose
Let’s prove the sufficient condition: suppose
Coda
According to John Stillwell’s Elements of Algebra (Section 2.10), Blaise Pascal was “the first to use induction in the ’base case, induction step’ format” in a paper from 1654. But our modern understanding of induction has it’s origins in Richard Dedekind’s seminal 1888 book “Was sind und was sollen die Zahlen?” (sometimes translated “What are the numbers and what are they for?”). Many defaults—be they notation, techniques, and explanations—in mathematics have historical origins that precede a more sophisticated understanding of phenomena, and sometimes an impedance mismatch between the historical and the modern can limit our understanding (treatment of variables and functions by mathematical texts is one of my standard whipping horses, but that’s a topic for another day). Those perspectives persist because they were useful, and are still useful today, but it also helps to consider how they might best be incorporated with those refined conceptions we’ve gained since their inception.
As for the use of proof by infinite descent versus a proof like Andrej’s that argues directly from minimality, one could reasonably argue that explicitly calling out the existence of a minimal case is more intuitively graspable for some arguments. It’s not clear to me whether arguing from the perspective of “there must be one or more minimal cases” versus “you can’t go down this way forever” is more generally clear, especially once you have come to understand proof by infinite descent. Furthermore, in the case of general well-foundedness, there may not be a single smallest case, but rather a (possibly) infinite number of minimal cases, in which case calling all of them out abstractly may be less compelling than calling out a singular minimum as was the case (pun intended) for the natural numbers. Which argument is most enlightening probably depends on personal taste.
Acknowledgments
Many thanks to Andrej Bauer for feedback on this post! Thanks to
folks on Hacker News who noted that the annoying proof did not address
No comments:
Post a Comment