Monday 1 July 2024

A Proof of Proof by Infinite Descent

WARNING: the following contains a whole lot of pedantry about proving theorems at a level of detail such that you could likely convince a computer proof assistant of their correctness. I teach a course where students learn to write such proofs without computer assistance, because doing so can profoundly affect how one perceives the possibilities for formal definition and proof. 1 Sometimes the default reasoning principles provided by a proof assistant can restrict one’s perspective of what reasoning principles are acceptable, how and why they might be judged acceptable, and how to devise and validate new ones.

More relevant to the present post is that even the literature and pedagogy of writing paper proofs is rife with defaults. I spend a lot of time teaching about proof by induction, and I find that my students, most of whom have learned something about proving stuff in the past, often arrive constrained by a very regimented approach to induction, involving mandatory “base cases” and “induction cases” and these nebulous “induction hypotheses” which seem to descend from the sky when invoked by arcane ceremony. I have some strong feelings about this, mostly due to my own struggles with gaining comfort with induction as a student, so I try to help students evade those struggles.

Taken at face value, this post is about some other proof principle, proof by infinite descent, but at its heart it’s really about induction. Ideally by the end of this post, I will have challenged, but also shed light on, the nature of inductive proof cases and induction hypotheses. As a bonus, you might see how induction can be used to explain another proof principle that intuitively seems valid, but for which it may not be obvious how to justify it formally.

Two Proofs That \(\sqrt{2}\) is Not Rational

In an excellent blog post about the difference between proof of negation and proof by contradiction, Andrej Bauer gives as an example a proof that \(\sqrt{2}\) is not a rational number. The article is worth a read, but I would note in particular that in a technical sense, his proof is not a proof that it is an irrational number, just that it is not a rational number. I imagine that one could also prove that \(\sqrt{-1}\) is not rational, but that had better not count as a proof that it is irrational! However, if we also prove that \(\sqrt{2}\) is a real number, which could be done using Dedekind cuts for instance, then we can deduce that \(\sqrt{2}\) is indeed irrational. And to no one’s surprise, we can prove that \(\sqrt{-1}\) is not a real number.

Theorem: \(\sqrt{2}\) is not rational.
Proof. Suppose \(\sqrt{2}\) were equal to a fraction \(\frac{a}{b}\) with \(a\) and \(b\) relatively prime. Then we would get \(a^2 = 2b^2\), hence \(a^2\) is even and so is \(a\). Write \(a = 2c\) and plug it back in to get \(2c^2 = b^2\), from which we conclude that \(b\) is even as well. This is a contradiction since \(a\) and \(b\) were assumed to be relatively prime. QED.

This is a proof of the negation of some proposition, i.e., that assuming the truth of the proposition implies a contradiction. Let’s consider some of its details.

First, the proof assumes that \(\sqrt{2}\) is equal to a fraction \(\frac{a}{b}\), which implicitly assumes that \(a\) and \(b\) are integers and that \(b\) is not 0. We could limit the two numbers to be naturals, by which I mean non-negative integers, applying the convention that \(\sqrt{2}\) denotes specifically the positive root, but since there exists a negative number that when squared equals 2, and since generalizing the claim plays into my overall story, let’s make no commitment to whether \(a\) and \(b\) are positive or negative.

Next, the proof uses the fact that every rational number can be represented by a fraction \(\frac{a}{b}\) such that \(a\) and \(b\) are relatively prime, in other words that no positive integer other than 1 divides them both evenly, which is a precise way of saying that \(\frac{a}{b}\) is in reduced form. This is a fact that needs to be proved separately and prior to this proof. We could instead integrate this reasoning directly into the proof: omit the relative primality assumption, and then let \(a' = a/\gcd(a,b)\) and \(b' = b/\gcd(a,b)\) where \(\gcd(a,b)\) is the greatest common divisor of both. It then follows from some proof effort that \(a'\) and \(b'\) are relatively prime and \(\frac{a'}{b'} = \frac{a}{b}\), thus completing the general proof. However, we’re still using an auxiliary fact that can be proven by induction: that \(a\) and \(b\) have a unique greatest common divisor. More on that below.

From these assumptions and facts, the proof deduces that both \(a\) and \(b\) are even, which means that \(2\) divides both of them, which contradicts the assumption that among positive integers only 1 can do this. So assuming that \(\sqrt{2}\) is rational leads to contradiction.

Now, to drive toward the topic of this post, here’s a more annoying proof of the same fact:

Proof. Suppose \(\sqrt{2}\) were equal to a fraction \(\frac{a}{b}\). Then \(a\) can’t be \(0\) because \(\left(\frac{0}{b}\right)^2 = 0 \neq 2\). Then we would get \(a^2 = 2b^2\), hence \(a^2\) is even and so is \(a\). Write \(a = 2c\) and plug it back in to get \(2c^2 = b^2\), from which we conclude that \(b\) is even as well. Write \(b = 2d\) and plug it back in to get \(c^2 = 2d^2\). This yields a contradiction because we could divide our integers by \(2\) forever, which is impossible. QED.
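To watch the descent mechanically, here is a small Python sketch (the function and its name are mine, not part of the original argument). One proof step halves a purported witness pair; since no witness pair actually exists, a brute-force search comes up empty:

```python
def descend(a: int, b: int) -> tuple[int, int]:
    """One step of the annoying proof: if a^2 = 2b^2 with a and b nonzero,
    then a and b are both even, and halving both yields a strictly smaller
    witness pair. Nothing stops us from repeating this forever."""
    assert a != 0 and b != 0 and a * a == 2 * b * b
    return a // 2, b // 2

# descend never actually gets to run: there is no witness pair to shrink.
print(any(a * a == 2 * b * b
          for a in range(1, 200) for b in range(1, 200)))  # False
```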

On the upside, this proof gets to omit the assumption that \(a\) and \(b\) are relatively prime, which in turn means we don’t need to consult Wikipedia to remind ourselves what the heck “relatively prime” means anyway. But on the other hand...what just happened?!? Intuitively this makes sense, but is this legit?

While Andrej’s proof is a direct proof of negation, this annoying one is an example of proof by infinite descent, which according to Wikipedia is a particular kind of proof by contradiction, and relies on something called the well-ordering principle. By the end of this post, I hope to convince you that: a) it is not a kind of proof by contradiction: rather it’s a technique for proving negation, so this claim deserves some finger-wagging from Andrej. b) it does not rely on the well-ordering principle, and good thing too, because if it did, then the annoying proof would be broken, but it’s not.

We should be able to ascertain from a formal statement of the principle that it is simply a proof of negation. Andrej’s post elaborates on the source of confusion. And what is the problem with the claim that proof by infinite descent relies on well-ordering, whatever that is? To explain, let’s first pinpoint a more fundamental and general property, called well-foundedness, and a reasoning principle, well-founded induction, upon which the principle of infinite descent can be based and even generalized, and use these to explain the confusion. Along the way, we’ll produce a different explanation of proof by infinite descent in terms of induction: an explanation that suits my induction-happy brain better. Also, we’ll get the opportunity to investigate induction itself: in particular, I want to demonstrate that the common “base case/induction case” recipe for induction is a useful but limiting view of how induction can be applied.

Well-foundedness and Well-ordering

As you might guess from the above two proofs, they depend on some notion of “getting smaller”, and in the case of natural numbers the default notion is the less-than relation \(<\), but less-than-ness is just an example of a more general phenomenon. So first, let’s isolate the property of \(<\) on natural numbers that really powers induction. A binary relation \(\sqsubset\) is called well-founded if all leftward chains of the form \(\dots \sqsubset x_3 \sqsubset x_2 \sqsubset x_1\) must be finite in length, so there cannot exist any infinite descending chains. The \(<\) relation on natural numbers has this property, because starting from any \(n\) and producing smaller numbers and stopping whenever you want, you will always have to stop, either necessarily at \(0\) or earlier by choice. On the other hand, \(<\) extended to integers does not have this property because you can go forever down the negative numbers, e.g., \(\dots -16 < -8 < -4 < -2\). Luckily, \(<\) isn’t the only binary relation on integers! Here’s a different one that is well-founded: if \(z_1,z_2 \in \mathbb{Z}\), say \(z_1 \mathbin{\mathbin{\lvert<\rvert}}z_2\) if and only if \(\lvert z_1\rvert < \lvert z_2\rvert\), where \(\lvert z\rvert\) denotes the absolute value of \(z\) and \(<\) is restricted to natural numbers. Then \(2 \mathbin{\mathbin{\lvert<\rvert}}3\) but also \(-2 \mathbin{\mathbin{\lvert<\rvert}}3\), \(2 \mathbin{\mathbin{\lvert<\rvert}}{-3}\), and \(-2 \mathbin{\mathbin{\lvert<\rvert}}-3\): the sign of the integer is ignored! Notice, though, that \(2\) and \(-2\) are not related by \(\mathbin{\mathbin{\lvert<\rvert}}\) in either direction. Since every leftward chain of the \(\mathbin{\mathbin{\lvert<\rvert}}\) relation works its way toward \(0\), the center of the number line, we can see that all such chains must be finite.
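To make the finiteness of \(\mathbin{\lvert<\rvert}\)-chains concrete, here is a quick Python sketch (the names are mine): however you descend, the absolute value, a natural number, strictly shrinks at every step, so a chain starting from \(z\) has at most \(\lvert z\rvert + 1\) elements:

```python
def abs_lt(z1: int, z2: int) -> bool:
    """z1 |<| z2 if and only if |z1| < |z2|."""
    return abs(z1) < abs(z2)

def a_longest_chain(z: int) -> list[int]:
    """One maximal descending |<|-chain starting from z: shrink the absolute
    value by 1 each step (flipping signs freely, since |<| ignores them)."""
    chain = [z]
    while chain[-1] != 0:
        chain.append(-(abs(chain[-1]) - 1))  # each element is |<| the previous
    return chain

print(a_longest_chain(-3))                           # [-3, -2, -1, 0]
print(abs_lt(2, -3), abs_lt(-2, -3), abs_lt(2, -2))  # True True False
```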

Since “well-ordering” and “well-founded” both start with “well-”, you might suspect that there’s a relationship between them, and you’d be right. Well-orderedness is described in a variety of ways in the literature, but for our purposes the following is the most relevant formulation: a well-founded relation \(\sqsubset\) is a well-ordering if and only if it relates every distinct pair of elements, i.e. \(x_1 \neq x_2\) implies either \(x_1 \sqsubset x_2\) or \(x_2 \sqsubset x_1\). So based on what we’ve observed, \(<\) is a well-ordering, but \(\mathbin{\lvert<\rvert}\) is not (remember \(2\) and \(-2\)?).

To tie up a loose terminological end, consider the “-ordering” and “-founded” suffixes. Well-orderings must be transitive: if \(x_1 \sqsubset x_2\) and \(x_2 \sqsubset x_3\) then \(x_1 \sqsubset x_3\). The consequent must hold because if it didn’t, then totality would force \(x_3 \sqsubset x_1\), but that would introduce a three-element cycle \(\dots \sqsubset x_3 \sqsubset x_1 \sqsubset x_2 \sqsubset x_3\) and break well-foundedness. Then, given some well-ordering \(\sqsubset\), the relation \(\sqsubseteq\), which extends \(\sqsubset\) to be reflexive, is a total order, i.e., a partial order (reflexive, transitive, antisymmetric relation) that is also total. Hence the suffix “-ordering”. On the other hand, a well-founded relation does not have to be transitive, let alone total, so extending one in this manner is not even guaranteed to get you a partial order, let alone a total one. Hence the suffix “-founded”.

Proof by Infinite Descent, Generalized

Now that we have some grasp of well-foundedness, and why well-orderedness is a red herring, let’s use it to state the Principle of Proof by Infinite Descent precisely and generally.

Theorem: Principle of Infinite Descent
Let \(X\) be a set and \(\sqsubset\) a well-founded relation on \(X\). Furthermore let \(\Phi(x)\) be some property of elements \(x\) of \(X\). Then if for every \(x\) in \(X\), \(\Phi(x)\) implies \(\Phi(y)\) for some \(y \sqsubset x\), it follows that \(\Phi(x)\) does not hold for any \(x\) in \(X\).

Stated in formal logical notation: \[(\forall x\in X.\, \Phi(x) \implies \exists y \in X.\, y \sqsubset x \land \Phi(y)) \implies \forall x\in X.\, \neg\Phi(x).\]

The premise of the principle formalizes the idea that “no matter where you are (which \(x\)), you can still make a ‘descending’ step”, and the conclusion is that “well, if so, then your property can’t apply to any \(x\): it leads to contradiction.” Arguably a better name for the principle would be Proof by No Infinite Descent (thanks to Andrej for this observation!).

Our proof that \(\sqrt{2}\) is not rational implicitly applies this principle: Let \(X\) be the integers \(\mathbb{Z}\); for integers \(a,b\), let \(a \sqsubset b\) be \(a \mathbin{\lvert<\rvert} b\); and let the property \(\Phi(a)\) be “\(a^2 = 2b^2\) for some nonzero integer \(b\)” (formally, \(\exists b\in \mathbb{Z}.\, b \neq 0 \land a^2 = 2b^2\)). The nonzero restriction on \(b\) matters: without it, \(\Phi(0)\) would hold by taking \(b = 0\), and there is nothing \(\mathbin{\lvert<\rvert}\)-below \(0\) to descend to. With the restriction, \(\Phi(a)\) forces \(a \neq 0\), and taking the \(b\) that \(\Phi(a)\) provides, the proof uses division by \(2\) to produce a \(c \mathbin{\lvert<\rvert} a\) for which some nonzero \(d\) satisfies the equation. Having satisfied the premise of the principle, we use it to deduce that no \(a\) satisfies the property (though admittedly the prose proof shrouds this use of the principle in squishy “forever is impossible” terminology).

The intuition for the principle is pretty straightforward: if the premise holds, then in principle we could build an infinite descending \(\sqsubset\)-chain of elements that satisfy the property. But there are no such infinite chains, because well-foundedness says that all descending chains must be finite, so a contradiction must follow.

But let’s buttress this intuition with rigour. We can prove this principle as a consequence of:

The Principle of Well-founded Induction: Let \(X\) be a set, \(\sqsubset\) a well-founded relation on \(X\), and \(\Psi(x)\) some property of elements \(x\) of \(X\). Then if for every \(x\) in \(X\), \(\Psi(y)\) holding for every \(y \sqsubset x\) suffices to prove \(\Psi(x)\), it follows that \(\Psi(x)\) holds for every \(x\) in \(X\).

Stated in formal logical notation: \[(\forall x\in X.\, (\forall y \in X.\, y \sqsubset x \implies \Psi(y)) \implies \Psi(x)) \implies \forall x\in X.\, \Psi(x).\]

This principle is quite abstract, since it’s stated in terms of an arbitrary well-founded relation. To clarify, let’s instantiate it with particular well-founded relations that lead to principles with which we are familiar. Since we already observed that \(<\) is a well-founded relation on natural numbers (because all well-orderings are also well-founded), we can specialize well-founded induction to get what is often called strong mathematical induction or course-of-values induction on natural numbers: \[(\forall n\in \mathbb{N}.\, (\forall m \in \mathbb{N}.\, m < n \implies \Psi(m)) \implies \Psi(n)) \implies \forall n\in \mathbb{N}.\, \Psi(n).\]

The general structure of a proof by course-of-values induction is to assume some natural number \(n\), then prove that assuming that \(\Psi(m)\) holds for every number less than \(n\) suffices to prove \(\Psi(n)\). Achieving that suffices to prove that \(\Psi\) holds for every natural number \(n\). Observe that the assumption “\(\Psi(m)\) holds for every number less than \(n\)” is what gets called an “induction hypothesis”. In the common style of prose proofs, the “induction hypothesis” often gets evoked in the middle of an argument, and for a specific value, rather than explicitly assumed at the beginning of the argument in full generality. This always seemed to me to be some magical thing that got pulled out of the air in the middle of the proof, leaving me unsure about what kind of thing induction even was. But in fact the induction hypothesis is just an assumption that you make as you attempt to prove \(\Psi(n)\), and its justification is hidden in the proof of the principle of well-founded induction. Surprise: you can prove induction principles correct, and devise and prove correct new ones: they need not be taken as axioms!

Though students have often seen strong mathematical induction, they are more familiar with plain ole’ mathematical induction, where the “induction hypothesis” is to assume that a property holds for the natural number one less than the current one. We can get this by using the well-founded relation \(n_1 <_1 n_2\), which holds if and only if \(n_2 = n_1 + 1\). This relation is neither total nor an order, since it only ever relates abutting natural numbers. But, curiously enough, we can still plug it into the Principle of Proof by Infinite Descent if we want, and get something meaningful, albeit more restrictive than what \(<\) gives us.

Advertisement: I first learned about this perspective on induction from Glynn Winskel’s excellent textbook, The Formal Semantics of Programming Languages. I highly recommend it.

What even is a “base case”?

A common structure used in proofs that proceed by either mathematical induction or course-of-values induction, or at least in teaching such proofs, is to assume \(n\), assume the induction hypothesis, and then proceed by cases on whether \(n = 0\) or whether \(n > 0\). The first case gets called the “base case” and is proven outright, and the second case gets called the “inductive case” which exploits “the induction hypothesis” to complete that case. When using induction more generally, as is common in PL theory, this notion of base cases and induction cases gets emphasized as well. However, it’s worth looking back at the principle of well-founded induction and observing that nothing about that theorem forces a particular case analysis, or restricts the availability of the premise that gets called “the induction hypothesis”. The funny thing is that the “base case” also has a perfectly fine induction hypothesis, but typically it is useless because most of the time the proof in question is not equipped to conjure an \(m < n\) with which to exploit it.

This template for applying course-of-values induction is so common that often the split into base case vs. inductive case, where the base case does not appeal to induction, gets baked right into the explanation of the proof principle, as if it were part of the approach. I find this categorical specialization to be unfortunate for two reasons.

The first reason is that sometimes a proof needs a case split, but this particular case split is the wrong one: in that it does not help push the proof through. My favourite demonstration of this phenomenon is a proof that any two positive natural numbers have a greatest common divisor. This proof could be extended to integers, and thus apply to Andrej’s proof as discussed above, but doing so would deviate more from my main point than I would like. The proof involves a helper lemma followed by a main theorem.

Lemma. For all positive natural numbers \(a\) and for all positive natural numbers \(b\), if \(a \geq b\) then there exists a positive natural number \(c\) such that \(xc = a\) for some positive natural \(x\) and \(yc = b\) for some positive natural \(y\), and furthermore if \(d\) is a positive natural number that satisfies these properties, then \(c \geq d\).

This lemma can be proven by course-of-values induction. Then the statement of the main theorem has the same structure, but does not assume \(a \geq b\). It can be proven by splitting cases on whether \(a \geq b\) or \(b \geq a\) and then applying the lemma with the two numbers appropriately ordered (either order will work if the numbers are equal).

I’m only going to give a sketch of the lemma’s proof, since it suffices to make the relevant point. It proceeds by course-of-values induction on \(a\), where the relevant property of \(a\) is the substatement: “for all positive natural numbers \(b\) ...” (now you know why I explicitly separated the two variable quantifications in the prose statement of the lemma). After assuming the induction hypothesis, assuming \(b\), and assuming \(a \geq b\), we proceed by cases on whether \(a = b\) or \(a > b\). If \(a = b\) then we can show that the greatest common divisor is the number itself, so \(a\). If \(a > b\), then we can show that the greatest common divisor of \(a-b\) and \(b\) is also the greatest common divisor of \(a\) and \(b\). However, to appropriately apply the induction hypothesis, we must first determine which of \(a-b\) and \(b\) is larger: since \(b\) is positive, we know that \(a-b < a\), and we are in the case where \(b < a\), so either can play the role of “smaller \(a\)”; but to make sufficient progress with the induction hypothesis, we need “next \(b\)” to be less than or equal to “smaller \(a\)”, so we split cases on whether \(a-b < b\) or not, arbitrarily letting the alternate case also handle \(a-b = b\).
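Here is that case structure transliterated into a recursive Python sketch of mine (not code from the post): the recursion bottoms out exactly at the \(a = b\) case, and the descent is on the first argument, just as in the induction.

```python
def gcd_pos(a: int, b: int) -> int:
    """GCD of positive naturals, mirroring the lemma: requires a >= b >= 1."""
    assert a >= b >= 1
    if a == b:                      # the case needing no induction hypothesis
        return a
    if a - b < b:                   # reorder so the larger argument comes
        return gcd_pos(b, a - b)    # first, mirroring the final case split
    return gcd_pos(a - b, b)        # this branch also covers a - b == b

print(gcd_pos(12, 8), gcd_pos(9, 9), gcd_pos(35, 1))  # 4 9 1
```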

The interesting part of the above proof is that two cases are considered, and only one of them appeals to the induction hypothesis. But the case that does not is not an \(a=0\) case as per the default template. In fact \(a = 0\) could never apply since we assumed that \(a\) is positive.

For exactness’ sake, it’s worth pointing out that one can perfectly well devise a principle of course-of-values induction that applies to just the positive naturals, such that \(a = 1\) would be viewed as the “base case”. If we were to separate out that case, we could deduce that \(b = 1\), because it must be a positive natural, and it must be less than or equal to \(a\). So in practice, the \(a = 1\) case can get handled using the exact same reasoning as the other \(a = b\) cases, which happen to circumscribe those situations where no appeal to an “induction hypothesis” is required.

So the lesson to be drawn from my first gripe is that when performing induction, if we reconceive of “base cases” to mean those cases that do not appeal to an induction hypothesis, then such cases need not coincide with the cases that involve the minimal elements of the well-founded relation (or the unique minimum element of a well-ordered relation). For many proofs the default heuristic helps, but that default is not a universal law of induction.

No base case? No problem!

To reiterate, my first problem with how course-of-values induction is taught is that the default base-case/inductive-case split may not be the most fruitful one. The second problem I have is that you may not even need such a split. Usually it is assumed that there must be some “base case”, that does not get to exploit the induction hypothesis, for how could it ever apply? But in some proofs you can always exploit the induction hypothesis. This is exactly the case in the following proof of the Principle of Infinite Descent that uses the Principle of Well-founded Induction.

Theorem: Principle of Infinite Descent \[(\forall x\in X.\, \Phi(x) \implies \exists y \in X.\, y \sqsubset x \land \Phi(y)) \implies \forall x\in X.\, \neg\Phi(x).\]

Proof. Suppose \((\forall x\in X.\, \Phi(x) \implies \exists y \in X.\, y \sqsubset x \land \Phi(y))\). Now we prove \(\forall x\in X.\, \neg\Phi(x)\) by well-founded induction on \(x\in X\). We will use as our predicate \(\Psi(x) := \neg\Phi(x)\), which is synonymous with \(\Phi(x) \implies \bot\), i.e. \(\Phi(x)\) implies contradiction. Suppose \(x \in X\), and suppose \((\forall y \in X.\, y \sqsubset x \implies \neg\Phi(y))\). At this point it suffices to show \(\neg\Phi(x)\), for which it suffices to assume \(\Phi(x)\) and deduce a contradiction. So suppose \(\Phi(x)\). Now, applying our first assumption to this particular \(x\), we learn \(\Phi(x) \implies \exists y \in X.\, y \sqsubset x \land \Phi(y)\), and combining this with our most recent assumption, we learn that \(\exists y \in X.\, y \sqsubset x \land \Phi(y)\), so consider such a \(y \in X\). Since \(y \sqsubset x\), we can use the induction hypothesis to prove \(\neg\Phi(y)\). But we also have \(\Phi(y)\), to which we can apply \(\neg\Phi(y)\) to deduce a contradiction. QED.
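If you want a machine to check this argument, here is a minimal Lean 4 rendering (my own sketch; it assumes only `WellFounded.induction` from core Lean, and the theorem name is mine):

```lean
theorem no_infinite_descent {X : Type} {r : X → X → Prop}
    (wf : WellFounded r) (Φ : X → Prop)
    (step : ∀ x, Φ x → ∃ y, r y x ∧ Φ y) :
    ∀ x, ¬ Φ x := fun x =>
  -- Well-founded induction on x: ih is the "induction hypothesis",
  -- available with no case split and no base case.
  wf.induction x fun x ih hx =>
    let ⟨y, hyx, hy⟩ := step x hx  -- descend to some y ⊏ x with Φ y
    ih y hyx hy                    -- ih y hyx : ¬ Φ y, contradicting hy
```

Note that, just as in the prose proof, the induction hypothesis `ih` is simply an assumption handed to us by the induction principle; no case analysis on `x` appears anywhere.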

This proof of proof by infinite descent used well-founded induction, but did not need to consider distinct cases related to \(x \in X\). There is no “base case” here, in the sense that no part of the proof must be separately completed without the help of an induction hypothesis. Now, when the principle is specialized to natural numbers, one could introduce cases, breaking out the \(0\) case and arguing directly that there is no smaller number, without reference at all to \(\Phi(x)\). Or more nebulously, we could split on “minimal elements” versus “non-minimal elements”, but to what end?

Part way back around, without proof by contradiction

The following is a bonus: thanks to Andrej who made the observation that inspired it.

We’ve proven that if \(\sqsubset\) is well-founded, then the principle of infinite descent applies. What about the reverse? The answer is “it depends.” In particular, if you prefer to work without proof by contradiction (i.e. constructively), then you can only get part way there. But with some use of proof by contradiction, you can get all the way there. In particular, we can prove that for an arbitrary binary relation \(R\) on a set \(X\), if it satisfies the statement of the principle of infinite descent, then it has no infinite descending chains. Wait, isn’t that enough? Well, no, not in a world where you want to work constructively, without proof by contradiction. I sneakily defined well-foundedness as the assertion that all chains are finite, which is a stronger statement than the assertion that there can be no infinite descending chains, so long as you avoid proof by contradiction. The positive conception is nice because it works in both contexts, whereas “no infinite descending chains”, which is often taken as the definition of well-foundedness in classical logic, does not.

Ok on to the proof. First, let’s get precise about descending chains. Define a descending chain of a binary relation \(R\) to be a partial function \(f : \mathbb{N} \rightharpoonup X\) such that for all natural numbers \(n\), if \(f(n+1)\) is defined, then so is \(f(n)\), and furthermore \(f(n+1) \mathrel{R} f(n)\). Then an infinite descending chain is a descending chain that is defined for every natural number \(n\).

Now, we can prove that if \(R\) is an arbitrary binary relation (rather than a well-founded one) that nonetheless satisfies the letter of the principle of infinite descent, then \(R\) has no infinite descending chains.

Proof. Suppose \(R \subseteq X\times X\), and suppose that for any property \(\Phi(x)\): \[(\forall x\in X.\, \Phi(x) \implies \exists y \in X.\, y \mathrel{R} x \land \Phi(y)) \implies \forall x\in X.\, \neg\Phi(x).\] Then it suffices to assume \(f : \mathbb{N} \to X\) is an infinite descending chain and then deduce a contradiction. To do so, it further suffices to let \(\Phi(x) := \exists n\in\mathbb{N}.\,f(n)=x\) and prove the premise of the proof-by-infinite-descent-shaped assumption, because from that we can deduce that \(\forall x\in X.\,\neg\exists n\in \mathbb{N}.\,f(n)=x\) (i.e. \(f\) is totally undefined, the empty partial function), which combined with \(x = f(0)\) and \(n = 0\) suffices to deduce a contradiction.

Let’s prove the sufficient condition: suppose \(x \in X\) and \(\Phi(x)\), i.e. that \(f(n) = x\) for some \(n \in \mathbb{N}\). Then \(\Phi(f(n+1))\) holds, and by assumption \(f(n+1) \mathrel{R} f(n)\), which can be combined to prove the existential conclusion. QED.

Coda

According to John Stillwell’s Elements of Algebra (Section 2.10), Blaise Pascal was “the first to use induction in the ‘base case, induction step’ format” in a paper from 1654. But our modern understanding of induction has its origins in Richard Dedekind’s seminal 1888 book “Was sind und was sollen die Zahlen?” (sometimes translated “What are the numbers and what are they for?”). Many defaults in mathematics, be they notation, techniques, or explanations, have historical origins that precede a more sophisticated understanding of phenomena, and sometimes an impedance mismatch between the historical and the modern can limit our understanding (the treatment of variables and functions by mathematical texts is one of my standard whipping horses, but that’s a topic for another day). Those perspectives persist because they were useful, and are still useful today, but it also helps to consider how they might best be incorporated with those refined conceptions we’ve gained since their inception.

As for the use of proof by infinite descent versus a proof like Andrej’s that argues directly from minimality, one could reasonably argue that explicitly calling out the existence of a minimal case is more intuitively graspable for some arguments. It’s not clear to me whether arguing from the perspective of “there must be one or more minimal cases” versus “you can’t go down this way forever” is generally clearer, especially once you have come to understand proof by infinite descent. Furthermore, in the case of general well-foundedness, there may not be a single smallest element, but rather a (possibly infinite) number of minimal elements, in which case calling all of them out abstractly may be less compelling than calling out a singular minimum, as was the case (pun intended) for the natural numbers. Which argument is most enlightening probably depends on personal taste.

Acknowledgments

Many thanks to Andrej Bauer for feedback on this post! Thanks to folks on Hacker News who noted that the annoying proof did not address \(a=0\).


  1. For those interested in precise paper proofs, I highly recommend Leslie Lamport’s talk and paper on the topic.↩︎

Sunday 25 February 2024

Topology Episode Four: Subspaces

 \(\newcommand{\thetitle}{Topology as a theory of Touching} \newcommand{\touches}{\mathrel{\delta}} \newcommand{\ntouches}{\mathrel{\not\delta}} \newcommand{\setdiff}{\mathop{\backslash}} \newcommand{\U}{{\mathcal{U}}} \newcommand{\touchesy}{\mathrel{\delta_Y}}\)

Topological spaces are monoliths, but as with many mathematical phenomena, we can talk about fragments of them.  In this case the fragments of interest are \(\textit{subspaces}\), and there are plenty of them.  How many? Well, given a topological space \((\U,\touches)\), which is some “universe” set plus its touching relation, there is a subspace corresponding to each subset of the universe.  But the subset alone doesn't wholly determine the subspace: we also need to know its topological structure, and it would be nice to know “why that structure?”.

As usual, I was a bit boggled by the definition of subspaces from Munkres. The textbook says that, given a topological space, defined by a universe \(\U\) and a set of open sets \(O\), any subset \(Y \subseteq \U\) induces a subspace topology \((Y,O_Y)\), whose open sets \(O_Y\) can be defined as follows: \[ O_Y = \{R \cap Y \text{ for } R \in O\}  \] In words, the open sets of subspace \(Y\) are exactly those sets that can be formed by intersecting \(Y\) with some open set \(R\) of \(\U\).  Ooookay, based on the rules Munkres lays down for the open sets of a topological space, this sure seems to define a topological space over \(Y\):

  1. The empty set is still a member of \(O_Y\) since \(\emptyset \in O\) and  \(\emptyset \cap Y = \emptyset\). 
  2. The entire set \(Y\) is open since \(\U\) is open and \(\U \cap Y = Y\).
  3. \(O_Y\) is closed over arbitrary unions of its elements.  To see this, observe that in general, set intersection distributes through arbitrary set union, i.e., for any set of sets \(Q\) and any set \(A\), \((\bigcup_{q \in Q}q) \cap A = \bigcup_{q \in Q} (q \cap A)\).  It follows, then, that the union of some open sets in \(\U\), intersected with \(Y\), forms the same set as the union of the intersections of each relevant open set with \(Y\), each of which is an element of \(O_Y\).
  4. A similar argument shows that \(O_Y\) is closed over finite intersection of open sets.
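To see the construction in action, here is a tiny Python sanity check (the universe, topology, and subset are my own toy example, not from Munkres):

```python
def subspace_opens(opens, Y):
    """O_Y: intersect Y with every open set of the ambient space."""
    return {R & Y for R in opens}

U = frozenset({1, 2, 3})
O = {frozenset(), frozenset({1}), frozenset({1, 2}), U}  # a topology on U
Y = frozenset({2, 3})
print(subspace_opens(O, Y))
# {frozenset(), frozenset({2}), frozenset({2, 3})}: note that {2} is open
# in the subspace even though it is not open in U.
```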

Clearly this topological space \((Y,O_Y)\) is derived from \((\U,O)\).  But, as usual, even though I can \(\textit{prove}\) that the topology that we just constructed on \(Y\) is indeed a topological space, and \(Y\) is indeed a subset of \(\U\), and the construction clearly involved the topology imposed on \(\U\) by its open sets \(O\), I come away with very little intuition for why this might be the right thing to do.  Can we fix this?  I think we can! To make progress, we recast subspaces in terms of touching, and I'm pleased to report that the results are pretty satisfying.  I think there is more insight to uncover, though, but I'll save that for a future post.

Recall that a set is open if and only if every element of that set does not touch “the outside” (i.e. the complement of the set).  So then, given some \(Y \subseteq \U\), what kind of touching relation does the intersection construction above correspond to?  I'm happy to report that its description is just as compelling, if not more so: if \(\touches\) is the touching relation corresponding to open sets \(O\), then the touching relation \(\touchesy\) corresponding to open sets \(O_Y\) is just the restriction of \(\touches\) to exactly those pairs \(a \touches A\) such that \(a \in Y\) and \(A \subseteq Y\). In math: \[\forall a\in Y, A \subseteq Y.\, a \touchesy A \iff a \touches A.\]

I find this \(\textit{very}\) satisfying, because in the open set-based definition, the new open sets \(O_Y\) are related but \(\textit{not the same as}\) the open sets \(O\).  Some subsets of \(Y\) were \(\textit{not}\) open in \(\U\), but now are in \(Y\).  You have to do some work, and keep in your mind the original topological space to understand the new one.  In contrast, the touching relation for a subspace has \(\textit{exactly the same touching relationships}\) as the superspace. Nothing changed, we just restricted our perspective.

Ok, we're actually on the hook to prove that \(O_Y\) and \(\touchesy\) define the same topological space.  Honest confession: it took me a looong time to prove this to my satisfaction, and as is often the case, when I finally got one (of too many) proofs to go through, I was miffed that it took me so long (on the upside I learned \(\textit{a bunch}\) from getting multiple proofs to go through: each had a different structure which highlighted a different insight).

Here is the most expedient proof, and I think it is still insightful.  First, we flip to closed sets: \[ C_Y = \{Y \setdiff (R \cap Y) \text{ for } R \in O\} \]

Every closed set is the complement of some open set, so we build on the definition of \(O_Y\).  From here, we can apply some set-theoretic algebra, keeping in mind that \(Y\) is a subset of \(\U\): \[ Y \setdiff (R \cap Y) = (\U \cap Y) \setdiff (R \cap Y) = (\U \setdiff R) \cap Y \]

And since \(R\) is open in \(\U\), that means that \(\U \setdiff R\) is closed in \(\U\)!  So, taking \(C\) as the closed sets of \(\U\), \[  C_Y = \{R \cap Y \text{ for } R \in C\}. \]

It's kind of comforting that we get the closed sets of the subspace using the same construction as we used to get its open sets, but we still have this annoying property that some closed subsets of \(Y\) are not closed subsets of \(\U\).  This is where switching to reasoning about the \(\textit{closure}\) of an arbitrary subset of \(Y\) pays great dividends.  By definition of touching, if \(a \in Y\) and \(A \subseteq Y\), then \(a \touchesy A \iff a \in \mathop{\textit{closure}_Y}(A)\).  And since the closed sets of \(Y\) are exactly the closed sets of \(\U\) intersected with \(Y\), the smallest closed set of \(Y\) containing \(A\) is \(\mathop{\textit{closure}}(A) \cap Y\), so for \(a \in Y\) we get \(a \in \mathop{\textit{closure}_Y}(A) \iff a \in \mathop{\textit{closure}}(A)\).  To summarize, the subspace topology does not preserve open sets, nor does it preserve closed sets, but it \(\textit{does}\) preserve \(\textit{and reflect}\) touching.
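Here is the toy example from the earlier sketch, extended with a Python check of this closure claim (again my own sketch): for every \(A \subseteq Y\), the closure computed inside the subspace agrees with the ambient closure restricted to \(Y\).

```python
from functools import reduce
from itertools import combinations

def closure(A, closed_sets):
    """Smallest closed superset of A: intersect all closed sets containing A."""
    return reduce(lambda s, t: s & t, (C for C in closed_sets if A <= C))

U = frozenset({1, 2, 3})
O = {frozenset(), frozenset({1}), frozenset({1, 2}), U}
Y = frozenset({2, 3})
closed_U = {U - R for R in O}
closed_Y = {Y - (R & Y) for R in O}  # C_Y, as derived above

subsets_of_Y = [frozenset(c) for k in range(len(Y) + 1)
                for c in combinations(sorted(Y), k)]
assert all(closure(A, closed_Y) == closure(A, closed_U) & Y
           for A in subsets_of_Y)
print("closure_Y(A) == closure(A) ∩ Y for every A ⊆ Y")
```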

A Directed Graph Example

To get some sense of how subspaces work, let's see how directed graphs and topologies induced by them manifest subspacing.  For this example, we return to the “Fallen P” graph from my previous post on connectivity:

and we focus on Ahlborn's default topology, wherein a node \(a\) of the graph touches a set of nodes \(A\) if and only if \(a\) is reachable from \(A\).  So for instance, in the Fallen P graph, \(C\touches\{A,D\}\).

Adapting the idea of subspace to Ahlborn's default topology, a subspace is induced by some subset of the nodes.  However, directed graphs themselves have a notion of \(\textit{subgraph}\), which is some subset of the nodes of the original graph combined with some subset of the edges among the kept nodes.

A particularly interesting kind of subgraph is the \(\textit{induced subgraph}\), which is what you get when you keep certain nodes of the original graph, and then proceed to retain all of the edges of the original graph that span kept nodes.  For instance, if we keep nodes \(A,C,D\), having discarded only node \(B\), we get the following induced graph.

The only remaining edge in the graph connects \(C\) to \(D\) because every other edge involved \(B\) which has been voted out of the archipelago.

Things get interesting, however, when we consider the touching relation of the \(\textit{subspace}\) induced by \(A,C,D\), rather than the edges and paths of the similarly-induced \(\textit{subgraph}\).  Since all still-relevant touching relationships are preserved by taking a subspace, we \(\textit{still}\) have \(C\touches\{A,D\}\), which means intuitively that \(C\) is reachable from \(\{A,D\}\).  At the least, this means that the Ahlborn default topology induced by an induced subgraph \(\textit{need not}\) coincide with the subspace topology induced by considering the same nodes of the original graph and its topology.  Curious!
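Here is a small Python check of that claim. One caveat: the Fallen P images aren't reproduced in this text, so the edge set below is my reconstruction (it matches the open sets, closed sets, closures, and interiors listed in the connectivity post):

```python
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")}  # assumed Fallen P

def touches(a, A, edges):
    """Ahlborn default: a is in A or reachable from A."""
    reached = set(A)
    while True:
        new = {y for (x, y) in edges if x in reached} - reached
        if not new:
            return a in reached
        reached |= new

kept = {"A", "C", "D"}
induced = {(x, y) for (x, y) in EDGES if x in kept and y in kept}  # just C->D
print(touches("C", {"A", "D"}, induced))  # False: the subgraph forgot B's help
print(touches("C", {"A", "D"}, EDGES))    # True: the subspace keeps C δ {A,D}
```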


To be honest this is contrary to what I had originally guessed would happen. But on further consideration this makes sense.  In my post on connectivity, we observed that if you add new edges to a graph that represent every self-loop and path among nodes, i.e., ensure that there is an edge from each node to each \(\textit{reachable}\) node:

then the resulting graph, which I'll call the “Monster P” graph, induces the same Ahlborn default topology as the original.  So the topology does not distinguish between edges and paths.  There is no “metric” for connectivity: \(0\) to \(n\) “hops” along edges, for any \(n\), are indistinguishable.

Now consider the subgraph induced by \(A,C,D\) from  this monstrosity:


Since the “Monster P” graph reifies all instances of reachability as direct edges, the present induced subgraph preserves all of the reachability relationships from the “Fallen P” graph.  As such, its induced Ahlborn default space is the same as the subspace topology induced by \(A,C,D\) from the original “Fallen P”’s Ahlborn default topology.  So the sense in which the subspace construction preserves the topological structure of the retained nodes seems to be fundamentally about preserving reachability, with no conception of which actual edges of the original graph contribute to that reachability.  An interesting abstraction!


Saturday 13 May 2023

Topology Episode 3: Connectivity and Directed Graphs

\( \newcommand{\touches}{\mathrel{\delta}} \newcommand{\ntouches}{\mathrel{\not\delta}} \newcommand{\setdiff}{\mathop{\backslash}} \newcommand{\U}{\mathcal{U}} \newcommand{\inv}[1]{\U \setdiff #1} \newcommand{\R}{\mathbb{R}} \)

In our last episode, I introduced the idea of topological connectedness and related it to the mean value theorem from real analysis.  It's kind of cool that that theorem does not strictly depend on metric spaces, but rather depends on ideas of touching, in particular connectedness.

It makes a lot of sense that in a standard topology text, topological concepts are often demonstrated in the context of analysis, since historically point-set topology was a way of abstracting analysis, until at some point it became its own discipline.  But it can be nice to see some of these ideas at play in other applications.

This is a good opportunity to treat a structure that is commonly used in computer science and that most folks would consider to have "topological structure", but that we do not often describe in the language of point-set topology:  directed graphs.  If you are like me, then you have some intuition that nodes and edges in a directed graph exhibit a notion of "distanceless connectivity", in the sense that we do not necessarily associate a distance metric with each edge, but we do tend to think that two nodes that have an edge between them are "connected", and that a node that's a minimum of two hops away is also connected, but "further" than a node that's one hop away.   So how do these intuitions play out in topology?  I'll be honest: I started writing this post with intuitions that suggested formal conjectures that turned out to be wrong!  I'll try to swallow my pride and point out my errors along the way or in summary.  For me, engaging with the formalism of topology in terms of touching, and reckoning with my intuitions about "the topological structure of directed graphs", taught me a lot about topology itself and about what intuitions I have for identifying topological structure in graphs. And boy have I made a lot of mistakes along the way!

Connectivity, Again

First, recall the standard definition of connected space \(\U\): 

\[\begin{multline*} \forall A \subseteq \U.\, (\exists x \in \U.\,x \in A) \wedge (\exists x \in \U.\,x \in \inv{A}) \implies  \\ (\exists a \in A.\, a \touches (\U \setdiff A)) \vee (\exists x \in (\U \setdiff A).\, x \touches A).  \end{multline*} \]

The premise is about splitting the space into two non-empty parts: we don't really care about that here.  We instead care about the consequent, which imposes at least one touching constraint.  Topology books give a more compact definition of connectivity in terms of the impossibility of splitting the space into two disjoint non-empty open sets.  In my view this definition is elegant in expression but obscures the disjunction that naturally appears when you ask what is possible rather than what is impossible.  I find this framing informative.

And to be honest, I mis-translated it at first, using conjunction rather than disjunction, which demonstrates a shortcoming in my understanding!  The key is that in any split at most one of the sets can be open, which implies that touching need not be "two-way" across the divide.  I missed this fundamental asymmetry in connectedness, and it's easy to ignore this distinction in geometry (the study of "land measurement", where measurement determines proximity) and thus in real analysis, but one-way connectedness is important in a directed graph, where directed flow, not distance, determines connectivity.  In a space where touching captures graph structure, being a connected component ought not require that all nodes be directly connected to one another.  As it turns out, topological connectedness can account for that.  In my view, seeing how topological connectivity can reflect graph connectivity sparks insights: it can help you more clearly grasp some of the richness (and limitations) of this topological concept, and of topology itself.

Directed Graph Topologies

To help ground this discussion, let's consider a particular directed graph.  This one is tractable but not totally trivial:

The graph has 4 nodes and 4 edges, features a cycle, and also features a node to which one cannot return by following edges.  For lack of a better name, I'll call it "The Fallen P Graph" because it looks to me like the letter P has stumbled and fallen, but now opts to rest rather than serve its lexical master.  

Now for some topology!  A great resource for topologies on directed graphs is a technical report from 1964 by Thomas J. Ahlborn (I think that this was his MSc. thesis?), which proposes 3 different ways to assign topologies to directed graphs, though focusing primarily on one (which I'll refer to as "preferred" or "default") topology.  It was followed up by a paper from 1968 that proves more results about the preferred 1964 topology.  That Ahlborn could conceive of 3 topologies for directed graphs might hint that topological structure is not a singular thing: you can analyze different notions of proximity present in a single class of objects (in this case directed graphs).

I will start by recalling Ahlborn's first topological space, defined in terms of open sets, and then transform the definition into one phrased in terms of touching.  The strategy I use to transform a topological space defined in terms of open sets (the norm) into a topological space defined in terms of touching involves the following three observations:
  1. A set \(A\) is open if and only if the set \(\U \setdiff A\) is closed;
  2. The closure of a set \(B\) is the smallest (in terms of set containment) set \(C\) such that \(B \subseteq C\) and \(C\) is closed;
  3. An element \(a\) touches a set \(A\) if and only if \(a\) is an element of the closure of \(A\).
In general, this is kind of a pain, and getting it right requires being ruthlessly logical (says the guy who's definitely gotten it wrong by taking shortcuts!).  But I find the insight gained by characterizing touching to be psychologically helpful, even more so than thinking about the entire closure in aggregate.  Hopefully you'll see what I mean as we go along.

As a side note, I think it's useful to also keep in mind that a set \(A\) is open if and only if each element \(a\) of \(A\) does not touch the set complement (i.e., \(a \ntouches \U \setdiff A\)).   I don't find this helpful for deriving touching (at least doing so leads me more readily to error), but I do find it helpful for thinking about how open sets directly relate to touching.  Perhaps if I focused on the interior of a set and characterizing apartness (i.e. non-touching), this would be the better way to go.
 
For the following definitions of topologies on directed graphs, let \(\U\) be the set of all the nodes of a graph, \(a\) a single node, and \(A\) some (possibly empty) set of nodes. 


Definition 1a (Ahlborn's default/preferred).   A set of nodes \(A\) is open if and only if there is no edge from any node \(x \in \U \setdiff A\) to any node \(a \in  A\).  

Metaphorically, you can think of an open set as a castle on lock-down that ensures no entry from the outside world!  Let's consider the open sets of The Fallen P Graph.  Since the graph has 4 nodes, there are \(2^4 = 16\) subsets of nodes, including all of the nodes (\(\U\)) and none of them (\(\emptyset\)).  Which of the 16 are open? Conveniently 16 is a small-ish number (unlike the number of real numbers between 0 and 1, for instance), so we can just check all of them.  Because I'm a pedant, I wrote down all 16 subsets on paper and started checking them, one by one, but if I eyeballed it, I might have been able to just write them down (though to be honest, I probably would have forgotten \(\emptyset\) even though it's always open).  Here are the 3 (count 'em, three) open sets:  \(\emptyset\), \(\{A\}\), and \(\{A,B,C,D\}\).
Wow, that's...underwhelming.  Two of those sets (\(\U\) and \(\emptyset\)) are always open.  So we get one singleton as a prize: you can confirm that no edges from outside of  \(\{A\}\) hit the only node in the set.  Personally I don't find that this enumeration of open sets gives me all that much insight.
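If checking 16 subsets by hand isn't your idea of fun, Python can play the pedant instead. Since the images aren't reproduced in this text, take the edge set below as my reconstruction of The Fallen P (it is consistent with the open and closed sets reported in this post):

```python
from itertools import combinations

NODES = {"A", "B", "C", "D"}
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")}  # assumed Fallen P

def is_open(A):
    """Definition 1a: no edge from outside A into A."""
    return not any(x not in A and y in A for (x, y) in EDGES)

subsets = [set(c) for k in range(5) for c in combinations(sorted(NODES), k)]
print([A for A in subsets if is_open(A)])
# [set(), {'A'}, {'A', 'B', 'C', 'D'}] -- count 'em, three
```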

So instead let's get back to my favourite lens on topology: what does Ahlborn's preferred topology imply for touching?  Well, let's follow the steps above:

Step 1: 

If \(A\) is open then \(\U \setdiff A\) is closed.  How do we describe closed positively, rather than just "not open"?  Well, a set \(A\) is closed if every edge from a point in \(A\) arrives at a point in \(A\). Interesting: my intuitions about "graph topology" quite like this definition of closed sets.  Of course intuitions are a fickle guide, but this proffers a feeling of comfort, more so than the definition of open, despite the two being interchangeable as definitions of the topology.

Technically we have two proof obligations.  First we must prove that if \(A\) is open then \(\U \setdiff A\) has this property, which seems not bad since the latter set is everything else, so where the heck else is an edge going to land?  Second, we must also prove that if \(A\) satisfies this property, then \(\U \setdiff A\) is open:  also not bad, since we've ensured that \(A\) is everything else as far as \(\U \setdiff A\) is concerned, and since every edge from \(A\) lands in \(A\), none of them land in \(\U \setdiff A\) (isn't the law of excluded middle grand?).

Since The Fallen P has 3 open sets, it also has 3 closed sets: \(\emptyset\), \(\{B,C,D\}\), and  \(\{A,B,C,D\}\).  You can confirm that any edge that starts at a node in \(\{B,C,D\}\) ends in that same set.  Don't read too much into this example, though:  a more complex directed graph can produce some more interesting open/closed bifurcations of the set of all nodes \(\U\).

Step 2:

Here's where things get fun.  Suppose \(B\) is a set.  What is its closure, as in, what is the smallest set \(C\) such that every node in \(B\) is in \(C\) (that's the definition of set containment), and furthermore, every edge that starts in \(C\) ends in \(C\)?  I find it helpful here to think "algorithmically" for a second:  First, add all the nodes from \(B\) into our potential \(C\).  Next, suppose we find every edge that starts in our potential \(C\) and ends outside of potential \(C\), and just add those new endpoint nodes to \(C\).  Cool, now all the nodes from \(B\) are in \(C\), and every edge that starts in \(B\) ends in \(C\).  But what about these new nodes?  Some edges that start from new nodes may end in \(C\) but others may not.  So we should add new new nodes, based on following edges that take us outside of \(C\).  The same thing happens, so we need to add new new new nodes.  And so on.  Ideally you see where this is going:  the closure of \(B\) is the set of nodes that includes \(B\) itself and any node \(x\) that can be reached in any number of steps from a node in \(B\). Another way to put this latter part is to say that there is a path from some node in \(B\) to \(x\).  The whole thing can be succinctly summarized by saying \(x\) is reachable from \(B\).

So once we have our closed \(C\) we can be sure that every node \(x \in \U \setdiff C\) is not reachable from \(C\), so there is no single edge from \(C\) into \(\U \setdiff C\), so the latter is open.  This all seems to be working out.

Returning to The Fallen P, we can give some examples of closure.  First the boring ones: the closure of each closed set, like \(\{B,C,D\}\), is the set itself:  we need not add any extra nodes to this set to make it closed.  Another way to put it is that the set itself is the smallest set that contains it and is closed.   As for non-boring ones, there are some interesting phenomena.  The closure of \(\{B,D\}\) is \(\{B,C,D\}\) because, of the two closed sets that contain \(\{B,D\}\) (namely \(\{B,C,D\}\) and \(\{A,B,C,D\}\)), \(\{B,C,D\}\) is the smallest.  On the other hand, the closure of any set that contains \(A\) is \(\{A,B,C,D\}\) because the latter is the only closed set that contains the former.
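The "algorithmic" description above is short enough to just run. Here it is in Python, reproducing the examples just given (the edge set is again my reconstruction of The Fallen P):

```python
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")}  # assumed Fallen P

def closure(B):
    """B plus every node reachable from B: keep following edges out of C
    until nothing new gets added."""
    C = set(B)
    while True:
        new = {y for (x, y) in EDGES if x in C} - C
        if not new:
            return C
        C |= new

print(closure({"B", "C", "D"}))  # {'B', 'C', 'D'}: closed sets are boring
print(closure({"B", "D"}))       # {'B', 'C', 'D'}
print(closure({"A"}))            # {'A', 'B', 'C', 'D'}
```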

Step 3:

The hard work has been done, and now we can state our definition by determining what it means for a node to be an element of the closure of a set of nodes.  That gives us a new definition:

Definition 1b (Ahlborn's default/preferred).  \(a \touches A\) if and only if \(a \in A\) or there exists a path from some node in \(A\) to \(a\).

So in Ahlborn's topology, touching boils down to a node either being an element of a set or being reachable from the set.  Cool! Spelling this out was very helpful for me because it was not obvious to me that touching could partly be determined by nodes that are not themselves elements of \(A\).  Because of paths, a node could touch \(A\) not because of an edge from \(A\) to it, but rather because some other node not in \(A\) has an edge to it.  So it's useful to keep the idea of closure in mind when you think about touching, even though touching is the property that continuous functions preserve, not closure!  

This topology has some interesting features worth pondering. First, notice that \(a \touches \{a\}\) even if there is no edge from \(a\) to itself.  So touching has a fundamental "reflexivity" to it:  a graph with a self-loop has the same default topology as one without.  In particular, the following graph has the same Ahlborn-preferred topology as The Fallen P:


 
Second, since non-trivial touching is characterized by the mere existence of paths, having multiple paths to a node makes no difference in the topology imposed by the graph.  Furthermore, the touching relationship tells you nothing about how many edges one must traverse on the path from one node to another node, so there is no notion of "distance" either.   In particular, the following graph also has the same Ahlborn-preferred topology as The Fallen P:

In Fallen P, it takes "two hops" to get from A to C;  In this graph it only takes "one hop".  Topology don't give a hoot!

In fact, this particular topology can be determined from just the reflexive-transitive closure of edges.  In short, the Ahlborn-preferred topology abstracts the reachability properties of a graph from the rest of the directed graph structure.  It's important to remember that the notion of "reachability" here claims that a node is always reachable from itself, path or no path.

We've talked about open sets, closed sets, and closure in this topology.  I also find it useful to take a moment to consider two other common topological features:
  1. The interior of a set \(A\) is the largest (again by set containment) set \(B \subseteq A\) for which no edges start in \(\U \setdiff A\) and end in \(B\).  This looks a lot like the definition of an open set, and that's no coincidence because \(B\) is open.  So taking the interior is like pulling up the drawbridges and closing the gates:  you can hear the sound of battle outside the castle walls, being waged by those who were left on the outside  ("At the going down of the sun and in the morning, We will remember them").  So the interior of a set of nodes is those nodes that are not reachable from outside the set.  For example, in The Fallen P, the interior of \(\{A,B\}\) is \(\{A\}\).  
  2. The boundary of a set \(A\) is those elements of \(\U\) that touch both \(A\) and \(\U \setdiff A\).  In this topology, this is the set of nodes that are in or reachable from both the inside and the outside of the set.  For example, in the graph A --> B <-- C,  \(\{B\}\) is the boundary of both  \(\{A\}\)  and  \(\{C\}\). In The Fallen P,  \(\{B,C,D\}\) is the boundary of a number of subsets, including  \(\{A\}\): each of \(B\), \(C\), and \(D\) touches \(\{A\}\) via a path from \(A\), and touches \(\{B,C,D\}\) by membership. Can you name all the other subsets of \(\U\) for which  \(\{B,C,D\}\) is the boundary?
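Both constructions are one-liners on top of the closure function from above (same reconstructed edge set; the helper names are mine):

```python
NODES = {"A", "B", "C", "D"}
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")}  # assumed Fallen P

def closure(B):
    C = set(B)
    while True:
        new = {y for (x, y) in EDGES if x in C} - C
        if not new:
            return C
        C |= new

def interior(A):
    """Everything not in the closure of the complement."""
    return NODES - closure(NODES - set(A))

def boundary(A):
    """Nodes touching both A and its complement."""
    return closure(A) & closure(NODES - set(A))

print(interior({"A", "B"}))  # {'A'}
print(boundary({"A"}))       # {'B', 'C', 'D'}
```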

Finally, a useful exercise for you would be to revisit the touching axioms and confirm that they all hold for each of these definitions.  In fact the "transitivity" case of touching helps explain why the definition of touching for Ahlborn's preferred topology is in terms of paths rather than edges, so this is well worth considering. 

Connectivity

Ahlborn relates topological connectivity in his default topology to graph structure, connecting it to weak connectivity.  A directed graph is weakly connected if every node is reachable from every other node so long as you ignore the directionality of the arrows.  In other words, a directed graph is weakly connected if its corresponding undirected graph is connected.  He relates other graph properties to topology as well!  This is comforting and supports the idea that the topology does preserve some interesting properties of the graph, which we can then call "topological" properties.  But are these exactly all and only the topological properties of a directed graph?  Not so much!  That Ahlborn provides other topologies says as much...

Two More Topologies

Here I will just give the touching-based definitions of Ahlborn's other two topologies.

Definition 2 (Ahlborn's \(\gamma\)).  \(a \touches A\) if and only if \(a \in A\) or there exists a path from \(a\) to some node in \(A\).

Definition 3 (Ahlborn's \(\delta\)).   This one's a tad peculiar.  First define an anti-edge from \(a\) to \(b\) to be the absence of such an edge in a graph.  Second, define an anti-path from \(a\) to \(b\) to be a transitive sequence of anti-edges starting at \(a\) and ending at \(b\).  Then:
\(a \touches A\) if and only if \(a \in A\) or there exists an anti-path from some node in \(A\) to \(a\).

It's useful to note that given a graph \(G\), the \(\gamma\) topology is the same as the default topology applied to the reversed graph \(G'\) that arises by reversing all of the edges of \(G\).  So there is a duality between the default and \(\gamma\) topologies, induced by how you treat the directedness of edges. 

Similarly, given a graph \(G\), the \(\delta\) topology is the same as the default topology applied to the shadow graph \(G^*\) that arises by replacing all of the edges of \(G\) with all of the edges absent from \(G\).  This implies a different sort of duality between the default and \(\delta\) topologies, induced by the presence and absence of edges. 
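The first duality is easy to check exhaustively in Python (same reconstructed Fallen P edge set as in the earlier sketches; the helper names are mine): \(\gamma\)-touching on \(G\) coincides with default touching on the reversed graph.

```python
from itertools import combinations

NODES = {"A", "B", "C", "D"}
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")}  # assumed Fallen P

def reach(A, edges):
    """A plus everything reachable from A."""
    C = set(A)
    while True:
        new = {y for (x, y) in edges if x in C} - C
        if not new:
            return C
        C |= new

def touches_default(a, A, edges):  # a in A, or a reachable from A
    return a in reach(A, edges)

def touches_gamma(a, A, edges):    # a in A, or some node of A reachable from a
    successors = {y for (x, y) in edges if x == a}
    return a in A or bool(reach(successors, edges) & set(A))

reversed_edges = {(y, x) for (x, y) in EDGES}
subsets = [set(c) for k in range(5) for c in combinations(sorted(NODES), k)]
assert all(touches_gamma(a, A, EDGES) == touches_default(a, A, reversed_edges)
           for a in NODES for A in subsets)
print("γ on G == default on reversed G, for all 64 (node, subset) pairs")
```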
 
Let's see how these new topologies play out for Fallen P, repeated here for convenience:



For \(\gamma\), we can define touching by introducing a touching relation for every edge, i.e.: \[ A \touches \{B\},  B \touches \{C\},  C \touches \{D\},  D \touches \{C\}, \] and then expand the touched sets in every arbitrary way, and expand the touched nodes transitively!  Don't make me write out all the cases... In any case, this topology is the same as the default topology on "Reverse Fallen P":

For \(\delta\), we consider the anti-edges, but in a style analogous to the default topology, i.e.: \[ B \touches \{A\}, C \touches \{A\}, D \touches \{A\}, \ldots \] and expand accordingly. This topology is the same as the default topology on "Shadow Fallen P":


So many topologies!  Which one is the right one?  I expect that the answer, as usual, is "it depends": on which questions about a graph you care about, and which of them can be isolated in one of these topological structures.  I expect that there are other topologies that might be devised and applied to directed graphs.  For instance, why not have a topology assigned to the edges of a directed graph?

So there are lots of "topological" properties of a directed graph, in the sense that they can be viewed through the lens of topology, but it's not clear to me what their limits are or could be, either theoretically or pragmatically.

Examples/Exercises

Here are a few more simple graphs for you to ponder, enumerating open sets, closed sets, touching, boundaries, and what have you.  The parentheticals include graph properties defined by Ahlborn but not explained here: consult Ahlborn's tech report for details.

Example 1: A --> B

This graph has two nodes and one edge.  We can write out the cases of touching for the different topologies, omitting the cases that are true simply because touching extends set membership; the sketch after Example 3 below enumerates these cases mechanically.



Example 2: A <-- B --> C  (weakly connected but not unilaterally)

Example 3: A    B --> C (disconnected) 
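For a head start on these exercises, here's a sketch that enumerates the nontrivial touching pairs for all three example graphs under all three topologies, leaning on the dualities above to reduce \(\gamma\) and \(\delta\) to the default topology (helper names mine; self-(anti-)edges again ignored in the shadow graph):

    from itertools import combinations

    def powerset(u):
        u = list(u)
        return [frozenset(c) for r in range(len(u) + 1) for c in combinations(u, r)]

    def reach(edges, a):
        seen, frontier = set(), {a}
        while frontier:
            frontier = {y for (x, y) in edges if x in frontier} - seen
            seen |= frontier
        return seen

    def nontrivial(nodes, edges):
        # default topology: touching pairs (x, A) with x outside A
        return [(x, set(A)) for A in powerset(nodes) for x in nodes
                if x not in A and any(x in reach(edges, a) for a in A)]

    examples = [('Example 1', {'A', 'B'}, {('A', 'B')}),
                ('Example 2', {'A', 'B', 'C'}, {('B', 'A'), ('B', 'C')}),
                ('Example 3', {'A', 'B', 'C'}, {('B', 'C')})]

    for name, nodes, edges in examples:
        rev = {(b, a) for (a, b) in edges}
        shadow = {(a, b) for a in nodes for b in nodes
                  if a != b and (a, b) not in edges}
        print(name)
        print('  default:', nontrivial(nodes, edges))
        print('  gamma:  ', nontrivial(nodes, rev))
        print('  delta:  ', nontrivial(nodes, shadow))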


Directed Graphs of Topologies?

It's worth noting that we can ask the reverse question:  do topologies correspond to directed graphs?  Well, in general no, but I think that any arbitrary touching relation can be seen as yielding a special kind of hyper-graph, where one side of every edge is a singleton set (you'll have to decide the direction; it's the difference between the default and \(\gamma\) topologies)!  I would not be surprised if this perspective, generalized to arbitrary hypergraphs, might be related to proximity and near sets.
A topic to explore for another day.

Acknowledgments

I have to thank Michael Betancourt for inspiring these investigations and this post, and providing wonderful feedback and suggestions.  His efforts to explain some essential aspects of topology to statistical modeling students sparked a wide-ranging conversation that led me to ponder directed graphs (among other things).

History

May 13, 2023 - First Posted
May 16, 2023 - GraphViz'd the graphs
June 7, 2023 - Incorporated a running example, fixed broken definition of \(\delta\) topology.

Tuesday 25 October 2022

Topology as Touching, Episode 2: Connected Topological Spaces

\( \newcommand{\touches}{\mathrel{\delta}} \newcommand{\ntouches}{\mathrel{\not\delta}} \newcommand{\setdiff}{\mathop{\backslash}} \newcommand{\U}{\mathcal{U}} \newcommand{\inv}[1]{\U \setdiff #1} \newcommand{\R}{\mathbb{R}} \)

It's been years since my last (a.k.a. first) post, where I described a presentation of some basic concepts in point set topology in terms of a concept I called "touching", though I believe that it also shows up in the literature under the name nearness.   

Since writing that post, I've occasionally revisited this conception of topology, and re-read my old grad school textbook (Munkres' Topology, Second Edition), re-interpreting the text through the lens of touching.  I've found the experience quite enlightening: I feel like I keep acquiring more comfort and intuition around topology, more so than I would just reading things as-is.  Along the way I've written what looks like a bunch of notes, but they're in my chicken-scratch handwriting and relegated to a folder of notes on my bookshelf.  Rather than let those explorations yellow and decay, I'd like to share some of them here in case they're helpful to others.  Plus they give me a chance to clarify my thinking and maybe get some corrections or insights from others.

A word of warning: reading these notes is probably not the way to learn topology.  I'll only provide so much motivation, and things will not necessarily arise in the same order as Munkres' textbook, nor in an order that explains why a given concept or theorem arises.  I'll try to leave some breadcrumbs or pointers into Munkres at least.  Suggestions/questions/comments welcome! 

For a quick review, in the last post [link?] I introduced the idea of a topological space as a set \(\U\), the universe, paired with a binary relation \(x \touches A\), pronounced "\(x\) touches \(A\)", that relates an element \(x \in \U\) to a subset \(A \subseteq \U\) and satisfies some axioms:

  1. \(x \ntouches {\emptyset} \)
  2. If \(x \in A\) then \(x \touches A\)
  3. If \(x \touches A\) and \(A  = B \cup C\) then \(x \touches B\) or \(x \touches C\)
  4. If \(x \touches A\) and \(a \touches B\) for all \(a \in A\), then \(x \touches B\)

I gave some intuition for the axioms, which leans heavily on how we think about space.

Then I introduced the idea of continuous functions, a common topic in calculus, generalized here to this abstract notion of touching.  The notion of continuous function from calculus is just a special case of the topological one for a particular topological space defined over the set of real numbers, or pairs of real numbers in 2 dimensions, or triples of real numbers in 3 dimensions, and so on.  We used the 1-dimensional version in the last post, but it generalizes.  We can call this the standard topology over Euclidean space (which is named after Euclid, the fellow who documented the axioms of (Euclidean) geometry that have influenced mathematics for many, many centuries).

I also introduced some common topological concepts, which give some general, though abstract, definitions to terminology that we might know from geometry or from talking about the world around us.  Terms like interior, boundary, and friends are applied more broadly to worlds where touching is given a loose interpretation in terms of these axioms. 

To continue, let's introduce some more topological terminology/concepts, and definitions for those.  This stuff comes up a lot as the topic develops, and it's fun (if you're like me) to think about the intuition behind the definitions.  In fact, I recall having some fun figuring out how to translate more standard definitions into touching-based definitions that fit my intuitions well.  I'm a strong believer in the importance of intuition in mathematics.  Two characterizations of a concept may be equivalent as far as mathematics goes (in the sense that any statement that you can prove using one can also be proved using the other), but the effect of intuition on the human mind and how it helps you learn more mathematics or figure out how to apply a concept (or modify it to suit a different problem, or develop another concept by analogy) is priceless.   For me, as I've already said, the definitions in terms of open sets, which are standard, don't connect well with my intuitions, and talking to others I know I'm not alone.  Maybe these alternate definitions will help.

Connectedness

Here's another core concept: let's define what it means for a topological space \(\U\) to be connected. As Munkres explains at the beginning of Chapter 3, connectedness is a significant topological property of closed intervals of real numbers (when given the topology that we implicitly and intuitively associate with real numbers: see below).  A famous theorem in calculus, the intermediate value theorem, can be proven in an abstract way within topology, and helps clarify which properties of real numbers and continuous functions really make the theorem go through.  Remember that topology abstracts the concept of continuous functions away from numbers, so that it can be applied to other topological spaces, but some theorems that were originally considered in calculus can still be stated and proven.  So topology makes it possible to capture properties of mathematical objects and functions that are fundamentally about their topology (about the relevant notions of touching and the continuous functions that arise from them).  Some facts of calculus go beyond topology, for instance if they are fundamentally about distance (between points) and not just about touching.  This is the sense in which topology and calculus are not the same topic of study, but many theorems in calculus generalize to theorems about any topological space that shares enough in common with the real numbers given their standard topology.

Anyway, let me start to explain connectedness with an analogy: my childhood home.  Upon arriving home in a car, I would have to exit the garage, go around the side of the house and then enter through the back door, which led to the laundry room.  Once there, I could walk from any room of the house into any other room of the house except for the garage.  If I wanted to get to the garage, I'd have to go outside and then in through the garage door.  So in my mind, the garage was not "connected" to the inside of the rest of the house, even though the garage was attached by walls to the house.  You could say that here I'm focused on the "topology" of the "walkable space" of my home, rather than the "topology" of the blueprints, walls and all.

Now let's introduce the formal definition of connectedness and then relate it to my childhood home.  A space \(\U\) is connected if and only if:

\[\begin{multline*} \forall A \subseteq \U.\, (\exists x \in \U.\,x \in A) \wedge (\exists x \in \U.\,x \in \inv{A}) \implies  \\ (\exists a \in A.\, a \touches (\U \setdiff A)) \vee (\exists x \in (\U \setdiff A).\, x \touches A).  \end{multline*} \]

[ed, May 11, 2023: previously I got this definition wrong:  there was a \(\wedge\) where the \(\vee\) now is.  Ack!]

That's a mouthful!  What does it say?  Let's take it piece by piece and interpret it as literally as we can.  It says that the entire space \(\U\) is connected if every subset \(A\) that is neither empty (\(\exists x \in \U.\,x \in A\)) nor the entire universe (\(\exists x \in \U.\,x \in \inv{A}\))  has an element that touches the outside (\(\exists a \in A.\, a \touches (\U \setdiff A)\)) or is touched by some element from the outside (\(\exists x \in (\U \setdiff A).\, x \touches A\)).

Phew!  Now for a more colloquial interpretation.  No matter which way we split the universe into two non-empty sets, \(A\) and \(\inv{A}\), we can find an element of one of those sets that touches the other.  So we're taking this notion of touching as a means to get from one side of the universe to the other!
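For a finite universe we can transcribe that quantifier nest directly into code and let the machine grind through every split.  Here's a sketch; since the demo needs some computable touching relation, I use reachability in a small directed graph (a touching relation you can check satisfies the axioms; the pairing is my choice, made just for illustration):

    from itertools import combinations

    def reach(edges, a):
        seen, frontier = set(), {a}
        while frontier:
            frontier = {y for (x, y) in edges if x in frontier} - seen
            seen |= frontier
        return seen

    def connected(nodes, edges):
        t = lambda x, A: x in A or any(x in reach(edges, a) for a in A)
        nodes = frozenset(nodes)
        # every nonempty proper subset A must touch across the divide
        splits = (frozenset(c) for r in range(1, len(nodes))
                  for c in combinations(nodes, r))
        return all(any(t(a, nodes - A) for a in A) or
                   any(t(x, A) for x in nodes - A)
                   for A in splits)

    print(connected({'A', 'B', 'C'}, {('A', 'B'), ('C', 'B')}))  # True
    print(connected({'A', 'B', 'C'}, {('B', 'C')}))              # False: A is cut off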

Going back to my house, we can argue that my garage is not connected to the rest of the house, because if we take the walkable part of the garage as one set and the walkable part of the rest of the house as the other set (it does not matter which one is \(A\) and which one is its complement \(\inv{A}\)), then neither touches the other: there is no "point" of entry in one that touches anywhere in the other.

Things can get a bit more subtle though.  If I had lived in a strange castle with a trap door into the dungeon but no stairs for getting out,  then our definition of connectedness would say that the dungeon is connected to the rest of the castle, even though I could not freely move back and forth. 

Mind you, this example is just an analogy to get you thinking about the idea of connectedness, but we can also consider some actual mathematical examples.  

First, an example of a connected space is a closed range of numbers:

Let \( \U = [5,10] = \lbrace x \in \R \mid 5 \leq x \leq 10\rbrace \), and \(x \touches A\) if and only if \( \mathit{glb} \{ \lvert a - x \rvert \mid a \in A \} = 0. \) 

We saw a similar definition of touching in our last episode, which in words says that \(x\) touches \(A\) if the greatest lower bound on distance between \(x\) and any element of \(A\) is zero.   This notion of touching is super pervasive when dealing with numbers, be they rational or real, and it can be extended to tuples of numbers too with a little work. I'll call it the standard definition, though I'll only use it with subsets of real numbers for now.

Anyway, this space, which is a contiguous range of real numbers, is connected.  No matter how we split it in two, there will always be some element in each set that touches the other.  Proving this involves considering all possible splits, and showing that there must always be at least one element that touches across the divide.  And one can come up with a lot of crazy ways to allocate elements of [5,10] into two sets.

For instance, you could split the set into the (infinite) set of rational numbers (i.e. numbers that can be written as a ratio of integers) between 5 and 10 and the (infinite) set of irrational numbers (numbers that cannot be written as such ratios) between 5 and 10.  For this particular split, I would pick some rational number in the middle, say 7, and prove that it touches the set of irrational numbers between 5 and 7, and thus touches the set of irrational numbers between 5 and 10 (since adding more stuff to a touched set preserves touching).   But you want to prove this for all possible splits.  Check out Munkres for more on how to really do it: this discussion is just to get you thinking about what you would need to prove in general using one particular (nutty) split.

For our second example, consider the set  \(\U = [1,4] \cup [5,10]\), which combines two closed intervals: all the numbers from 1 to 4, and all the numbers from 5 to 10, and use the standard definition of touching for numbers.  It is much simpler to show that this set is not connected than it was to show that the previous example was connected:  we just need to exhibit one split and show that every element of each set fails to touch the other set.  We split this one set into the two closed intervals, and pick an element \(a \in [1,4]\).  Now we can show that \(a \ntouches [5,10]\):  assume that \(a \touches [5,10]\) and prove that this implies a contradiction (to be clear, this strategy is proof of a negation, not proof by contradiction):

1. Suppose \(a \in [1,4]\) and \(a \touches [5,10]\);
2. Then by definition of touching, 0 is the greatest lower bound of distances between \(a\) and any element of \([5,10]\);
3. But since 4 is the largest number that \(a\) could be, and 5 is the smallest element of \([5,10]\), every such distance is at least \(5-4 = 1\), so 1 is also a lower bound of those distances;
4. But then 1 is a lower bound that is greater than the greatest lower bound 0, and we deduce a contradiction!

We can play the same game in the other direction to complete the proof.
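The distances involved here are easy to compute exactly: for a point \(x\) and an interval with endpoints \(\mathit{lo}\) and \(\mathit{hi}\), the greatest lower bound of the distances is \(\max(\mathit{lo} - x,\, x - \mathit{hi},\, 0)\), whether or not the endpoints are included.  So for finite unions of intervals we can decide touching with no approximation at all.  A sketch (the helper names are mine):

    def glb_dist(x, lo, hi):
        # exact greatest lower bound of |a - x| over the interval from lo to hi
        return max(lo - x, x - hi, 0.0)

    def touches(x, intervals):
        return min(glb_dist(x, lo, hi) for lo, hi in intervals) == 0.0

    left, right = (1.0, 4.0), (5.0, 10.0)   # the split of [1,4] u [5,10]
    print(touches(4.0, [right]))   # False: the glb of distances is 1, not 0
    print(touches(5.0, [left]))    # False: the symmetric direction
    print(touches(5.0, [right]))   # True: 5 is a member, so it touches trivially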

There's plenty more to the concept of connectedness.  For instance, why is it that an entire space is connected, rather than some subset of a topological space?  Well, there is a way to take the former concept and use it to define the latter, but we don't have enough tools yet (in particular, how to define a subspace of a topological space); we can get there once we do :).

Parting Quiz Question:  take \(\U = [5,10]\) but take touching to be \(a \touches A\) if and only if \(a \in A\).  In other words, assume the discrete topology (introduced last post). Is this topological space connected? Why or why not?


First Posted October 25, 2022





Friday 22 June 2018

Topology as a theory of touching

\(
\newcommand{\touches}{\mathrel{\delta}}
\newcommand{\ntouches}{\mathrel{\not\delta}}
\newcommand{\setdiff}{\mathop{\backslash}}
\newcommand{\U}{{\mathcal{U}}}
\)

On the advice of a friend in grad school, I took a topology course, and to my surprise I have found many of the ideas there relevant to programming languages research.  For instance, Dana Scott's theory of domains was heavily rooted in ideas from topology.  In particular, Scott took the idea of limits very seriously in talking about the phenomenon of a computational process converging to an idealized result.  For related reasons, you can think about the theory of abstract interpretation using concepts from topology as well, where a lattice of abstract or concrete elements has a "spatial" interpretation associated to it:  the more precise your analysis results, the "closer" you are to some ground truth.

While some of the intuitions of topology as an abstract notion of space feel right, I always found the formal developments confusing.  A topology is typically axiomatized as a set of "open sets", which are kind of like open ranges \((n,m)\) of numbers, where \((n,m)\) refers to all of the real numbers \(x\) such that \(n < x < m\), so not including \(n\) or \(m\).

From there things get a bit confusing.  For example, an arbitrary union of open sets is sure to be open, but only a finite intersection of open sets is guaranteed to be open.  That's not particularly obvious to me!  Even worse for me is the definition of a continuous function from one topological space to another.  This is typically defined in terms of the inverse image of open sets being open.  If you've ever seen the \(\varepsilon\)-\(\delta\) definition of continuous functions in a calculus class, then this seems like some kind of generalization of the idea.  But I like to think of functions in terms of what properties they preserve in the forward direction, and topology is typically not presented that way.

Recently, though, I found out that there is another way of describing topological spaces, which does not appear much in the literature, but seems to fit my intuitions about topology as an abstract study of space.  The usual presentation is heavily influenced by accidents of history (I highly recommend this article for the details), but we can choose a different starting point and still arrive at the story that appears in the usual textbooks.  Naturally I have no plans to write a book about topology, but since others have been interested in this perspective, I'm sharing it here.

A bit of intuition: Zeno's Paradox

To motivate this presentation, I'll take inspiration from Zeno's Paradox of The Dichotomy.  Consider the set of rational numbers \(D = \{\frac{1}{n} \mid n \in \mathbb{N}, n \geq 1\}\) where \(\mathbb{N}\) refers to the natural numbers \(\{0,1,2,\dots\}\) (we leave out \(n = 0\), which has no reciprocal). Is \(0\) an element of that set?  Well, no, because there is no natural number \(n\) whose reciprocal \(\frac{1}{n}\) is equal to \(0\).

However, we can, and do, think about numbers as having some notion of spatiality to them:  some numbers are closer to one another than others.  Usually we formalize this as the "distance" between two numbers.  If we take this idea of space seriously, then \(0\) is not in the set \(D\) (i.e., \(0 \not\in D\)), but it's almost in it. In particular, you could say that \(0\) is touching \(D\).  We'll write this \(0 \touches D\), using the Greek letter "delta" \(\touches\) as our symbol because it's almost in, i.e. almost the Greek letter "epsilon" \(\in\) (groan!).  FWIW I didn't make up this choice of symbol, but I can't be sure about the original justification for it.

Anyway this is our intuition for touching, but can we be more precise?  Well, in the case of rational numbers and sets of rational numbers, sure! Let \(q \in \mathbb{Q}\) be a rational number and \(A \subseteq \mathbb{Q}\) be a set of rational numbers.  Then
\[ q \touches A \quad\text{if and only if}\quad \mathit{glb} \{ \lvert a - q \rvert \mid a \in A \} = 0. \]
Above \(\mathit{glb}\) is the greatest lower bound of a set, where a lower bound is any number that is less than (or equal to) everything in the set, and the greatest lower bound is...the greatest of the lower bounds.  I use \(\lvert e \rvert\) to refer to the absolute value of \(e\).

If we apply this definition to our earlier example, then we see that \(0 \touches D\), because \(0\) is less than every number \(\frac{1}{n}\), and if \(m\) is also a lower bound of \(D\)  (say, for example, \(m = -3\)), then \(m \leq 0\).  So \(0\) is right up against \(D\), but is not an element of it.  It is "arbitrarily close" to \(D\).  That's what the greatest-lower-bound implies.  We can't say the same for \(-3\), though, because there are lots of numbers that are bigger than -3 (e.g., -2.9999999), but not in the set \(D\).
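We can also watch this glb numerically.  The sketch below only samples a finite prefix of \(D\), so it can only suggest (not prove) that the glb is 0 for the point 0, but it does show the minimum distance shrinking, while for \(-3\) it stays bounded away from 0:

    # finite prefix of D = {1/n : n >= 1}; a numeric illustration, not a proof
    D = [1.0 / n for n in range(1, 100001)]

    def min_dist(x):
        return min(abs(a - x) for a in D)

    print(min_dist(0.0))    # 1e-05, and it shrinks as the prefix grows
    print(min_dist(-3.0))   # 3.00001: bounded below by 3, so no touching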

Notice that \(1 \in D\), and we'll say that it too is touching \(D\), but trivially, since it is contained.  This is a useful convention, and if need be, we can always talk about numbers that nontrivially touch a set.

Axioms of Touching

One of the neat things about topology is that it imbues sets with a notion of "space" without requiring a notion of distance between points.  For example, we can think of an undirected graph (i.e. a collection of nodes with edges between them) as having some spatial structure to it, even if we don't mean to take the lengths of the edges seriously, but just care about connectivity.  An online history of topology draws out this example in more detail.

The development of axioms for topology was a drawn out process, involving a number of mathematicians, but I'm going to skip all that and give an axiomatization of touching that it turns out is equivalent to what you would see in any introductory textbook.

To define a topological space, we assume some universe of discourse \( \U \), which is just some set, and then define touching as a relationship \(\touches\) between an element of the universe \(x \in \U\) and a subset of the universe \(A \subseteq \U.\) For succinctness and following common mathematical practice, each axiom implicitly quantifies all free set or point variables universally.

  1. \(x \ntouches {\emptyset} \)
  2. If \(x \in A\) then \(x \touches A\)
  3. If \(x \touches A\) and \(A  = B \cup C\) then \(x \touches B\) or \(x \touches C\)
  4. If \(x \touches A\) and \(a \touches B\) for all \(a \in A\), then \(x \touches B\)

Let's examine each of these axioms in turn:

  1. Funny things happen when you consider empty sets (\(\emptyset\) is the symbol for it).  But it does seem somewhat sensible to say that no element ever touches the empty set, because there's nothing inside of it to justify a notion of being close by.  I could almost imagine an alternative setup where every point touches the empty set, but that would play very poorly with axiom 4:  the two together would imply that every point touches every set!
  2. As alluded to earlier, you can think of set membership as a trivial instance of touching.  This notion also plays well with axiom 4:  if an element \(x\) touches a set \(A\), and then you add \(x\) to \(A\), then \(x\) also touches \(A \cup \{x\}\).  I'll come back to that again below.
  3. This one is a little funky, and we'll come back to it later.  It says roughly that if an element is touching a set and you break that set into two pieces, then the element has to touch at least one of those pieces.  To get a better sense of this, think of the various ways that you can break the set \(D\) above into two sets and convince yourself that \(0\) still touches one of those pieces.  You might consider repeating this process with the resulting piece that still touches, and see that it will still touch one of the new pieces! A thought to ponder:  since \(D\) is an infinite set, then splitting it into two will yield at least one infinite subset. The infinite subsets will still touch \(0\).  Can you roughly explain why?  
  4. The last axiom determines when touching is "transitive":  if \(x\) is arbitrarily close to a set \(A\) and every single element of \(A\) is arbitrarily close to a set \(B\), then it seems reasonable to assert that \(x\) is arbitrarily close to \(B\) as well, right?  Two arbitrarily-closes seem to make another arbitrarily-close!  To help ground this in a simple setting, let's observe that this axiom combined with axiom 2 imply the following (which I almost wish were its own axiom):
    If \(x \touches A\) and \(A \subseteq B\) then \(x \touches B \).  This is so because \(A \subseteq B\) means that \(a \in B\) for every element of A, and since \(a \in B\) implies \(a \touches B\)...voila!  This is an example of where treating set membership as a trivial case of touching seems to play out quite cleanly. It means that, at the least, touching is preserved by adding more stuff to the set.  This makes sense to me at least:  if I am touching your house, and you move extra furniture in through the back door, then I'm still touching your house!  

Seeing these for the first time, it's not totally clear that these axioms are all you need, but more than a century of investigation by a number of brilliant mathematicians seems to have led to this consensus.  Topological structures were defined by a social process of experimenting and coming up with something that is quite general and broadly applicable.  However, topology is not typically presented this way, but in a different, equivalent way, which I'll get to below.

Let's take a second to examine axiom 3 a bit more.  It says that I can split a set into two and still preserve touching.  If I apply it again, then I can split a set into three or four and still preserve touching (of one or more subset).  In fact, you can repeat this any finite number of times and preserve touching, because you are always allowed to apply an axiom a finite number of times in a proof (that's just how proofs work!).  But, and this is important, it does not let you split a set into an infinite number of subsets and be sure to preserve touching.  That could fail.

To see this in practice, suppose that we break the set \(D\) into a family of singleton sets \(\{\frac{1}{n}\}.\)  Then \(0 \ntouches \{\frac{1}{n}\}\) for any value of \(n\).  That is, 0 does not touch any non-zero rational number (treated as a singleton set).  However, any finite splitting of \(D\) would be okay.   So the restriction to finite splitting matches at least our particular example, which suggests that it ought to be a universal principle if we are to apply the theory widely.
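Continuing the numeric sketch from earlier: no singleton piece of \(D\) gets anywhere near 0, yet a finite split leaves pieces whose sampled distances keep shrinking.  For instance, splitting (a prefix of) \(D\) into three pieces by \(n \bmod 3\):

    # each singleton {1/n} sits at distance exactly 1/n from 0: never touching
    print(min(abs(1.0 / n) for n in range(1, 1000)))   # about 0.001, never 0

    # a finite split of a prefix of D into three pieces by n mod 3:
    # every piece still has sampled distances shrinking toward 0
    pieces = [[1.0 / n for n in range(1, 100001) if n % 3 == r] for r in range(3)]
    print([min(piece) for piece in pieces])   # three tiny minima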

Continuous Functions

This is the part that sold me on thinking about topology as a theory of touching!  If you've seen any abstract algebra, group theory, what have you, then you've seen structure-preserving functions as the heart of their study.  In topology, the key functions are the continuous functions.  You may have come across them in studying calculus or some other related subject, but when given a function from real numbers to real numbers (or some bounded interval of them), we think of drawing a line graph without having to ever pick up our pen from the graph paper.  There are no "jumps" in the graph.  We can think of this using our notion of space and closeness and touchingness:   on a real number line, we tend to think more specifically of points "touching" the point right next to them, though there really is no such thing as "next point".  To fix this perspective we can instead go with the idea of a point touching sets of points, rather than a single point, as we axiomatized above!  For example, if \(r\) is a real number (i.e., in the set \(\mathbb{R}\)), then  \(r \touches \{s \in \mathbb{R} | s > r\}\) and \(r \touches \{s \in \mathbb{R} | s < r\}\), and so many other sets, using the analogous definition of touching for numbers as in our analysis of Zeno's Paradox.  So how does the idea of "drawing a line without picking up your pen" transfer into this world?  Well, if we think of lifting the pen as a "failure of touching", then  a continuous function will have the property that if we move left-to-right along the x-axis, so always moving among numbers that touch, then the y-axis motion will also move among numbers that touch!  Let's formalize this.

A function \(f : \mathbb{R} \to \mathbb{R}\) is continuous if and only if \(r \touches A\) implies 
\(f(r) \touches \{f(a) \mid a \in A\}\).  For succinctness, we will use the abbreviation \(f(A)\) for \(\{f(a) \mid a \in A\}\).

Let's consider a few example functions on reals, and talk through them.

\(f(n) = 7\).  This is easily judged continuous, because given any number \(n\), and any set \(A\) that \(n\) touches,  \(f(n) = 7\), and  \(f(A) = \{7\}\)  (\(f(A) \neq \emptyset\) since \(A\) cannot be empty if \(n\) touches it, by axiom 1).  In a way, \(7 \touches \{7\}\) tells the whole story!

\(f(n) = n\).  This one is also simple, because \(f(n) = n \touches A = f(A)\).  So making no change whatsoever is continuous.

Here might be a good time to give a warning.  On both sides of the function \(f\), we are assuming the same notion of touching.  This need not be the case:  there are many possible notions of touching that can be assigned to a set.  Each combination of a set like \(\mathbb{R}\) and a definition of touching for it \(\delta\) produces a different topological space.  Since we tend to associate numbers with their natural ordering as numbers, we also typically assume that the topology we care about is related to that ordering.  To tickle your brain, though, here's another topology that we could associate with the real numbers:  define \(n \touches A\) if and only if \(n \in A\).  This definition satisfies the axioms above, so it counts as a notion of touching, but it equates touching with set membership!  This topology, which can be applied to any set, is called the discrete topology, and it's basically the most boring one.  Now a question for you:  if we use the discrete topology for the result space of functions on real numbers, then which functions are continuous?  Hmm...

\(f(n) = 2n\).  Here we are "stretching" the real line to be "twice as long" (take that explanation with a grain of salt, it's just meant to provide intuition).  Yet still the function is continuous.  I'll let you work out the details to see why.

\(f(n) = \lvert n \rvert\).  This too is continuous!  This one is interesting to me, because you can think of it as the two-dimensional version of folding a piece of paper in half:  all of the negative numbers get "folded up" to be positive numbers, while \(0\) gets to be itself.  What's interesting about it is that many of the numbers get moved, but they do so "in lock-step" such that anything that touches before absolute value still touches afterwards.  If you draw this as a graph on graph paper, you work your way down in a straight line, but then make an abrupt turn at \(0\).  Nonetheless, you don't need to pick up your pen from the paper, so it's still continuous.   Let's consider a specific case of touching here:  Let \(A = \{n \mid -1 \leq n \lt 0\} \cup \{n \mid 0 \lt n \leq 1\}\).  We can show that \(0 \touches A\), using our earlier definition of touching for numbers and sets of numbers.  But notice as well that \(0 \touches f(A) = \{n \mid 0 \lt n \leq 1\}\).  So the original set got "squished in half", but the resulting set still touches \(0\) nontrivially (i.e., without containing it).

\(f(n) = \begin{cases} 0 & n \leq 0 \\ 1 & n \gt 0. \end{cases}\)
This is our first non-continuous function.  If you drew the graph you would see the problem:  it is a lot like the constant function, but it changes constants at \(0\)!  Formally, \(0 \touches \{n \mid n \gt 0\}\) but \(f(0) = 0 \ntouches \{1\} = f(\{n \mid n \gt 0\})\).  Since the function fails to preserve touching in at least one case, it is not continuous (continuous functions always preserve touching).  The output got "ripped apart" at \(0\).
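We can probe touch-preservation numerically, too.  The sketch below approximates a set by dense samples and the glb by a minimum over them, so it only suggests the verdicts rather than proving them, but it cleanly separates the absolute value function from the step function at the interesting point \(0\):

    # approximate "x touches A" by the least distance to a dense sample of A
    def samples(lo, hi, n=100000):
        step = (hi - lo) / n
        return [lo + step * (i + 1) for i in range(n)]   # excludes lo itself

    A = samples(0.0, 1.0)                  # stands in for {n | 0 < n <= 1}
    min_dist = lambda x, S: min(abs(a - x) for a in S)

    f_abs = abs
    f_step = lambda n: 0.0 if n <= 0 else 1.0

    print(min_dist(0.0, A))                               # ~1e-05: 0 touches A
    print(min_dist(f_abs(0.0), [f_abs(a) for a in A]))    # ~1e-05: touch preserved
    print(min_dist(f_step(0.0), [f_step(a) for a in A]))  # 1.0: the touch is ripped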

The rough intuition, then, is that continuous functions can "stretch" and even "fold" their input topological space when mapping it into the output space, but cannot "rip" the space at all.  Careful though:  looking at this using real numbers can preserve our geometric intuitions, but some topological spaces have peculiar notions of space, where notions like "stretch" don't really make sense.  However "ripping" typically does:  it just means violating the preservation of touching.

Some Topological Concepts

So now that we have a notion of a topology (a "universe" set along with a notion of touching) and the idea of continuous (i.e. touch-preserving) functions from one topological space to another, we can lay out some common properties of sets within a single topological space.  This can help us see how intuitions from geometry can be described purely in terms of elements touching sets. Kind of cool!

boundary of a set

When thinking geometrically, we are pretty comfortable with the idea of the boundary of a shape:  it's all stuff that sits on the border between the inside and outside of the shape.  You might scratch your navel a bit and wonder whether the boundary must be part of the shape or part of the surrounding environment though:  many border disputes between neighbors essentially boil down to this.

Given our notion of touching though, we can define a natural notion of boundaries for sets!  Suppose \(\mathcal{U}\) is our universe of discourse (so the pair \(\langle \mathcal{U},\delta \rangle\) defines a topological space).  If \(A\) is some arbitrary subset of \(\mathcal{U}\), then the boundary of \(A\) is the set \[\mathit{boundary}(A) = \{x \in \mathcal{U} \mid x \touches A \;\text{and}\; x \touches (\mathcal{U} \setdiff A)\}.\] Here,  \(\mathcal{U} \setdiff A\) refers to the set difference between \(\mathcal{U}\) and \(A\), i.e. the set of all elements of \(\mathcal{U}\) that are not also elements of \(A\).

So the boundary of a set is the set of all elements that simultaneously touch the set itself as well as the surroundings.  Here's a few simple examples in the real numbers.\[ \begin{align*} & \;\mathit{boundary}(\{n \mid 5 \lt n \le 6\}) \\ =& \; \mathit{boundary}(\{n \mid 5 \le n \lt 6\}) \\ =&  \;\mathit{boundary}(\{n \mid 5 \lt n \lt 6\}) \\= & \; \mathit{boundary}(\{n \mid 5 \le n \le 6\})  \\ = & \; \{5,6\}.\end{align*} \]

As we can see above, the notion of boundary is in general agnostic about which side the boundary points belong to, but we can still make that distinction using set operations, e.g. the boundary points of \(A\) that are in \(A\) are \(\mathit{boundary}(A) \cap A\).
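Using the exact interval-distance trick (the glb of \(\lvert a - x \rvert\) over an interval with endpoints \(\mathit{lo}, \mathit{hi}\) is \(\max(\mathit{lo} - x, x - \mathit{hi}, 0)\), endpoints included or not), we can spot-check boundary membership for the open range example; the helper names are mine:

    def glb_dist(x, lo, hi):
        return max(lo - x, x - hi, 0.0)

    INF = float('inf')

    def on_boundary(x):
        # touches {n | 5 < n < 6} and touches its complement (-inf,5] u [6,inf)
        t_set = glb_dist(x, 5.0, 6.0) == 0.0
        t_comp = min(glb_dist(x, -INF, 5.0), glb_dist(x, 6.0, INF)) == 0.0
        return t_set and t_comp

    print([x for x in (4.9, 5.0, 5.5, 6.0, 6.1) if on_boundary(x)])  # [5.0, 6.0]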

interior of a set

In direct contrast to the boundary elements of a set (which may or may not be members of the set), we may want to consider the elements that are in the set and very much NOT on the boundary.  These are called the interior points of a set.  \[\mathit{interior}(A) = \{a \in A \mid \text{for all}\; X \subseteq \U,\ \text{if}\; a \touches X \;\text{then}\; X \cap A \neq \emptyset \}.\]
Thus the interior of \(A\) consists of those points of A that can only be touched by sets that are already at least partly "in" A.  Such points can only be touched by "crossing the boundary" and entering the set.  Thus, we can also characterize it as the set minus its boundary,  \[\mathit{interior}(A) = A \setdiff \mathit{boundary}(A).\] Well, that seems like the right idea to me!

closure of a set

The closure of a set is just the set plus its boundary, \[\mathit{closure}(A) = A \cup \mathit{boundary}(A).\]  You can think of this as adding to the set everything that touches it, and thereby "closing" out the great beyond.  Another characterization, which is quite useful to think about, is that the closure of a set \(A\) is just the set of all points that touch it \[\mathit{closure}(A) = \{x \mid x \touches A\}.\]  I'll leave it to you to show that these are equivalent.

To better understand the relationship between boundaries, interiors, and closures, it's useful to observe and understand these additional related facts:\[\begin{gather*} \mathit{closure}(A) = \mathit{interior}(A) \cup \mathit{boundary}(A) \\ \mathit{interior}(A) = \mathit{closure}(A) \setdiff \mathit{boundary}(A) \\\mathit{interior}(A) \cap \mathit{boundary}(A) = \emptyset. \end{gather*}\]
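These identities are easy to sanity-check by brute force on a tiny space.  Here's a sketch with a two-point universe and a touching relation of my own devising (b additionally touches any set containing a; one can check that it satisfies the four axioms):

    from itertools import combinations

    U = frozenset({'a', 'b'})

    def touches(x, A):
        # x touches A iff x is in A, or x is b and a is in A
        return x in A or (x == 'b' and 'a' in A)

    def boundary(A): return {x for x in U if touches(x, A) and touches(x, U - A)}
    def interior(A): return set(A) - boundary(A)
    def closure(A):  return {x for x in U if touches(x, A)}

    for r in range(len(U) + 1):
        for A in map(frozenset, combinations(U, r)):
            assert closure(A) == interior(A) | boundary(A)
            assert interior(A) == closure(A) - boundary(A)
            assert not (interior(A) & boundary(A))
    print('all three identities hold on this toy space')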

isolated point in a set

The concept of an isolated point captures a member of a set that is basically nowhere near the rest of the set: \(x\) is an isolated point of \(A\) if \(x \in A\) but \(x \ntouches (A \setdiff \{x\})\).

limit point of a set

A related concept to that of an isolated point is that of a limit point: \(x\) is a limit point of \(A\) if  \(x \touches (A \setdiff \{x\})\).  Unlike an isolated point, which must be a member of the set, a limit point can be a member of the set or not.  So you can divide any set into its isolated points and the limit points that it contains (though there may also be limit points that it does not contain).

open set

An open set is a set that is its own interior, i.e. \(\mathit{interior}(A) = A.\)  Somehow, every point in the set is in the interior: none of the set's points lie on the boundary!  An example of such a set is our earlier example \(\{n \mid 5 \lt n \lt 6\}\), which conveniently enough is called an open range \((5,6)\)!  Consider any point in that range, say \(5.1\) for example.  Well, \(5.01\) is closer to the border than \(5.1\), and \(5.001\) is even closer than that, and ... you get the point...the lower boundary of this set \(5\) sits outside it.  Same for the upper boundary \(6\).

closed set

In contrast, a closed set is one that is its own closure, i.e. \(\mathit{closure}(A) = A.\)  Our earlier example of a closed range \(\{n \mid 5 \le n \le 6\}\), also often written \([5,6],\) is an example of (perhaps fairer to say "the inspiration for the concept of") a closed set.  Be warned though:  the notion of closed set and open set are not opposites!  In some topological spaces there exist sets that are both open and closed at the same time (they are often called clopen because...why not?!?).  But in our typical geometrically-motivated spaces like the real numbers, they are different things.
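To see a clopen set concretely, take the discrete topology from earlier, where touching is just membership; there, every subset is both open and closed at once.  A quick sketch:

    from itertools import combinations

    U = frozenset({'x', 'y', 'z'})
    touches = lambda p, A: p in A          # the discrete topology

    def closure(A):  return {p for p in U if touches(p, A)}
    def boundary(A): return {p for p in U if touches(p, A) and touches(p, U - A)}
    def interior(A): return set(A) - boundary(A)

    subsets = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
    print(all(interior(A) == set(A) == closure(A) for A in subsets))  # True: clopen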

Topology Via Open Sets

Typically, the notion of topological space is given in terms of a universe and a set of subsets of the space, which are taken to be the open sets.  These open sets must include the empty set and the universe, and be closed under arbitrary union and finite intersection.  Simple enough definition, but I personally find it hard to grasp this intuitively.  Then a continuous function is roughly any function where an open set in the result space maps back to an open set in the source space.  I'm not going into much detail here because you can find this story in most topology textbooks.

It turns out that both definitions of topological space, in terms of touching and in terms of open sets, end up being equivalent.  All the definitions can be mapped back and forth.  You can even give axioms for topological spaces in terms of closed sets, or in terms of the closure operator.  So why do we use open sets?  Let me give a historical answer and a pragmatic answer.  According to Moore's fascinating history of the concepts involved, we can blame Bourbaki for opting not to show all of the variations, but instead focusing on the open set formulation.  Now, pragmatically speaking, it turns out that it's pretty straightforward to build up a definition of a topological space in terms of open sets by using the concepts of basis and sub-basis, which I won't define here, but which seem nice to work with. I imagine that defining "touching" for a particular topological space may be a harder place to start from (though mind you, you can always translate the notion of open sets to a notion of touching!).  Since I'm not an expert on these topics, you should take this pragmatic reason with a grain of salt; it's mostly speculation on my part.


This post is inspired by an answer from mathoverflow as well as its reference to this textbook.  I was further inspired to lay this out after a conversation with Norman Ramsey and David A. Turner, who were interested to know that you could reconsider topology this way.

Thanks for corrections from Jeremy Siek, Ben Greenman, and Marianna Rapoport!
