I’m probably the wrong person to listen to when it comes to anything about that extension of abstract algebra we call category theory. I’ve been trying to learn about it for literally decades, and much of that time has involved an uncomfortable feeling that something fundamental about the material is just bouncing off of me. Contrast this with many of my colleagues who find great comfort in conceiving of mathematical concepts in the terminology of this area, and feel unsure whether they understand some concept before it has been rephrased in category-theoretic terms. Unfortunately these explanations (which scroll past me on social media and group conversations and Q&A at seminars, and...and...) don’t seem to be at the right level or angle to scaffold me up further into the topic. I experience this regularly and in some ways I am intensely jealous. Something about this doesn’t click for me: many things don’t click.
On the other hand, this feeling of being lost has been my friend and companion for many years about many topics, some of which I feel far more comfortable with now, so I continue to seek the right metaphor, the right story, that resonates with me and helps me continue to make forward progress, and teach concepts in a way that seems to work for others besides myself. Hence the story you are about to embark on.
I understand enough to know that category theory is useful as an organizing principle, and I must admit that those aspects of it that have breached my blood-brain barrier have been useful for conceptualizing in the abstract. When explaining mathematical concepts, you will not see me drawing commuting diagrams to explain concepts (except when they are literally rudimentary category-theoretic concepts), but you will see me verbally or formulaically appeal to categorical concepts like “universal properties”, albeit in more concrete terms than that. So I persist, years later, in making progress, and perhaps some of my recent sense of progress will help someone else.
Given the above, take the below with the appropriate grains of the appropriate salts!
My Problem
During one of my bouts of “let’s try to learn some category theory again”, I was poring over Eilenberg and Mac Lane’s General Theory Of Natural Equivalences, the classic paper that makes the case for natural equivalence as an important and hitherto invisible concept. Along the way, the paper introduces natural transformations, which are the basis for defining natural equivalences.
The paper starts with an observation about vector spaces and constructions on vector spaces: schematic methods for producing one kind of vector space from another. They observe some peculiar and compelling relationships between not the spaces themselves, but rather the constructions themselves.
A discussion of the “simultaneous” or “natural” character of the isomorphism
clearly involves a simultaneous consideration of all spaces L and all transformations connecting them; this entails a simultaneous consideration of the conjugate spaces and the induced transformations connecting them [emphasis mine]. 1
The relationship is captured in the form of schematic relationships that appeal to algebraic structure-preserving functions among the constructed spaces. I was unfamiliar with the concepts, but I could at least see how the authors recognized an interesting pattern of relationships among kinds of vector spaces, how in certain circumstances those relationships were highly constrained, forced to satisfy a kind of uniformity across pairs of vector spaces related to one another by this notion of “constructions”. They argue that such constrained relationships should be recognized, and show that fundamental to this recognition is how they can be represented using the language of vector space algebra, albeit clumsily and somewhat implicitly. This relationship goes beyond vector spaces: it arises among many varieties of mathematical concepts. But it seems to be this specific application that motivated the idea of “naturality”, whose formalization called for the definition of functors which in turn required the definition of categories. 2 As I understand it, capturing naturality was the focus, and the rest were necessary components. I have seen Mac Lane rumoured to have said as much, but have yet to track down the source.
So imagine my frustration that naturality, the most important of these concepts, is the one I understand the least. What is being transformed? How exactly is that transformation “natural”? I can see, by example, that there is a pattern to the relationships that are represented by natural transformations and natural equivalences, and by the continued effectiveness of its use that there is something worth articulating here. But I have never felt a visceral grasp for naturality.
Often this naturality is explained as...well natural (pun intended), and that you need to see many examples to get a feel for it. In programming languages theory, my research area, many folks will say to me that “natural transformations are like parametric polymorphic functions.” Indeed, parametric polymorphic functions (don’t worry if you don’t know what these are) are an important concept in PL, and they can be modeled as natural transformations. So this is an important exemplar. But this is still like telling me “A tree is like a Douglas Fir”: the example is the cue to understand the concept, but in an essential way this feels backwards and unfortunate. Indeed many concepts from nature yield messy taxonomies that can only be understood in terms of history and the web of instances, but we’re talking about pure mathematics here!
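For readers who have not met the slogan before, here is a minimal Python sketch of what it usually means. The helper names (`fmap`, `reverse`, `safe_head`, `commutes_with`) are made up for illustration, and Python does not enforce parametricity, so this only spot-checks the equation on sample inputs rather than proving anything.

```python
def fmap(f, xs):
    """Apply f to every item of a list (the list 'construction' acting on a function)."""
    return [f(x) for x in xs]

def reverse(xs):
    """Never inspects the items, only rearranges them."""
    return xs[::-1]

def safe_head(xs):
    """Keep at most the first item; again indifferent to what the items are."""
    return xs[:1]

def commutes_with(transform, f, xs):
    """Check the naturality equation on one input: transform then map == map then transform."""
    return transform(fmap(f, xs)) == fmap(f, transform(xs))

def negate(x):
    return -x

# Functions that ignore what the items are pass the check...
ok_reverse = commutes_with(reverse, str, [1, 2, 3])
ok_head = commutes_with(safe_head, len, ["a", "bb", "ccc"])
# ...while sorting looks at the items themselves, so the equation fails here.
ok_sorted = commutes_with(sorted, negate, [1, 2, 3])
```

On these inputs `reverse` and `safe_head` satisfy the equation, whereas `sorted` does not, since negating the items flips their order.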
Another common effort to help explain natural transformations relates them to the concept of homotopy in algebraic topology. When I sat in on Steve Awodey’s (awesome) category theory class in my post-doc days, he too followed the introduction of the standard definition with the homotopy-looking characterization. Having previously learned about homotopies, I found this definition somewhat more comforting than the original, and working through its implications was quite satisfying (and heavily influenced what follows). I could get a sense of “continuous deformation-ishness” from this characterization, but I still could not see/feel the full force of “simultaneity” hidden in the definition.
I’ve wanted a story that I could find more compelling, and couched in terms of the abstraction itself, rather than PL theory or topology. Alongside good examples, I hope for this to help me better understand and explain naturality, rather than simply get used to it.
I think I finally found such a story, one that tickles my happy place. But it forced me to do some violence to the common presentation of categories, functors, and natural transformations. The result is an equivalent formalization of the concepts that better suits my present mental model. This kind of move is rather audacious, and quite frankly may lead to dead ends or problematic views that ultimately cause more conceptual harm than clarity. On the other hand, another perspective may draw out new or different connections, so I offer it for your chewing enjoyment.
So what follows is my own current personal answer to the title: “what is so natural about natural transformations?”. The following will be quite abstract, because it focuses simply on the abstract definitions and abstract properties of category-theoretic concepts. Perhaps in revision I’ll add examples, because that may help explain how I connect the concrete problems addressed by this abstraction to the abstraction itself. So this won’t make much sense if you don’t already have some background in the elementary basics of categories.
What’s a Category?
Despite the importance of natural transformations, the order in which the concepts are taught tends to follow the dependency order: categories then functors then natural transformations. The early concepts are needed to define the later ones. So let’s start with a definition of “category” and go from there. As you’ll see, my definition of a category is probably not yours, but it’s equivalent. Furthermore, at its core, it’s not actually mine: Saunders Mac Lane gives it as a brief aside in his famous book Categories for the Working Mathematician. But you’ll see that I’ve changed the terminology and notation for my own personal story-telling purposes: I find that the nuances of informal words used formally heavily influence my thinking about formalities (weak Sapir-Whorf anyone?), so rethinking the words we use for concepts helps me internalize them. 3 I apologize for the confusion this will inevitably cause to some readers. Just remember: my reason for going into a different definition of categories from the norm is that it leads quite directly to my answer to the question posed up-front. In fact, following this thread is how I came to discover this answer. Ok, buckle up!
Informally speaking, a category is:
a collection of “stuff”;
one binary relation among stuff; and
three operators on stuff.
I will call an item of stuff an element: a “real” category theorist would call them “arrows” or “morphisms”. I’m not using those names because those words don’t resonate with me or my inner story about what categories are, though I can see how they served Eilenberg and Mac Lane’s original examples well. History has demonstrated how much more widely these concepts can be applied, so I’m going for that level of generality.
Our one (binary) relation among elements is identity. Given
two elements, or more technically expressions that describe elements,
they are either identical, meaning they both describe the same
element or not, i.e.
Now for the operations. All three operations take elements as inputs
and (maybe) produce elements as output. The first two operations are
unary, and total. Associated with every element
Elements, identity, and left and right elements suffice to introduce
some of the properties required of
The third, and last, operation is binary and
partial. To every two elements
This raises an important (to me) conceptual-terminological point. I
have strongly mixed feelings about the use of the word “composite” here,
but I don’t see the possibility of a better word. In particular, the
composite element
My view of category theory is that it foregrounds a common and critical notion of compositional surrogative reasoning, in the sense that we reason about mathematical objects (or phenomena) by manipulating (textual) stand-ins for those actual objects. The first sense in which this is true is the difference between expressions in the language of category theory, and the elements of the category (the elements and the relationships represented by operators and relations) themselves. We are playing shadow puppets, using a highly expressive “API”, but don’t confuse the API with its implementation!
Thanks to the composition operator, one can reason about properties
of some notion (like the element
Despite everything I said about my not-real-composition perspective
on composition, I still find it useful to think of these properties in
real-composition terms. For instance, if we visually think of elements
as Lego blocks, laid on one side, then the left (right) element can be
viewed as describing the connection interface of the left (right) side
of an element. Then we can imagine hooking the right of
The first two properties completely determine the left and right
elements of composites:
We have more relationships that connect left and right elements to
composites. The names left and right come from the relationship between
these elements and composition, partly because of the above property,
but also because of the following ones:
Finally, the associations among elements that are described by
composition carry a strong coherence property, which can be described
succinctly, but I will spell it out in a way that better matches my
thinking. Suppose three elements,
All of the above makes a category! Those who are used to category theory have surely noticed a few peculiarities of this definition. First, there’s the different naming to spark different intuitions. Terms like “arrows” or “morphisms” have a connotation of going from one place to another, or performing a transformation, whereas “elements” have a connotation of simply “being” and having no internal structure. Elements don’t “do” anything, or contain anything: they merely observe external relationships with other elements. Second, we’ve gotten rid of “objects” as classifiers of compositions. Unit elements can do that just fine, and lead to a very different feel.
As mentioned, Mac Lane notes this definition, but goes with the traditional one: 4
Since the objects of a metacategory correspond exactly to its identity arrows, it is technically possible to dispense altogether with the objects and deal only with the arrows (CfTWM, p. 9).
Indeed objects give a nice connection to sets and arrows to functions in categories associated with set-theoretic structures. But for understanding category theory in a way that is removed from its set-facing origins, and the mental biases that it brings (especially for an unabashed fan of set theory) I like to suppress those associations, and I feel that objects are a cognitive distractor: what fundamentally matters is the relationships among elements, in light of the structure mandated by the properties of left and right unit elements and composition.
In my mind, category theory is fundamentally about abstract interfaces among abstract elements, and one ought not worry about the idea, or even existence, of truly composite structures (the implementations). Focus on external relationships as the essence of categorical structure, rather than the internal composition of entities via construction of “objects”: that’s what set theory is for!
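To make the objects-free definition a little more tangible, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the example category (the ordering 0 ≤ 1 ≤ 2, with one element per pair `(i, j)` where `i <= j`), the names `left`, `right`, and `compose`, and the convention that a composite exists exactly when the right of the first element matches the left of the second. The laws checked are the usual arrows-only axioms as I understand them, which may not match the exact list given above term for term.

```python
from itertools import product

# Hypothetical example category: one element (i, j) for each i <= j in {0, 1, 2}.
ELEMENTS = [(i, j) for i in range(3) for j in range(3) if i <= j]

def left(a):
    """Left unit element of a (unary, total)."""
    return (a[0], a[0])

def right(a):
    """Right unit element of a (unary, total)."""
    return (a[1], a[1])

def compose(a, b):
    """Composite of a and b, or None when it is undefined (binary, partial)."""
    return (a[0], b[1]) if right(a) == left(b) else None

def check_category_laws(elements):
    for a in elements:
        # Unit elements are their own left and right units.
        assert left(left(a)) == left(a) == right(left(a))
        assert left(right(a)) == right(a) == right(right(a))
        # Composing with a unit on the matching side changes nothing.
        assert compose(left(a), a) == a
        assert compose(a, right(a)) == a
    for a, b in product(elements, repeat=2):
        ab = compose(a, b)
        # The composite exists exactly when the interfaces match...
        assert (ab is not None) == (right(a) == left(b))
        # ...and its own left and right are then fully determined.
        if ab is not None:
            assert left(ab) == left(a) and right(ab) == right(b)
    for a, b, c in product(elements, repeat=3):
        ab, bc = compose(a, b), compose(b, c)
        # Coherence: either way of grouping three composable elements agrees.
        if ab is not None and bc is not None:
            assert compose(ab, c) == compose(a, compose(b, c))
    return True

LAWS_HOLD = check_category_laws(ELEMENTS)
```

Note that nothing here mentions objects: the pairs `(i, i)` simply turn out to be the elements that equal their own left and right.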
What’s a Functor?
We have now re-imagined categories as bags of inter-related elements
with fundamental composition properties, described entirely in terms of
elements. Let’s move on to our first kind of relationship between
categories. Suppose we have two categories,
A functor
We demand more from our functors than totality or functionality: a
functor must also engage with the structure that the two categories
possess:
From a purely technical standpoint, I much prefer this definition of functors to the standard definition, which requires thinking about mappings of two different kinds of things, and some unpleasant interactions and notational oddities that ensue. If you know, you know!
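As a rough illustration of a functor as a single mapping on elements, here is a Python sketch over a toy category (the ordering 0 ≤ 1 ≤ 2, one element per pair `(i, j)` with `i <= j`). The category, the candidate mappings `clamp` and `pin_left_end`, and the checker `is_functor` are all hypothetical, and the three conditions checked (preserving left units, right units, and defined composites) are my reading of what structure preservation amounts to in this setting.

```python
from itertools import product

# Hypothetical toy category, used as both source and target.
ELEMENTS = [(i, j) for i in range(3) for j in range(3) if i <= j]

def left(a):
    return (a[0], a[0])

def right(a):
    return (a[1], a[1])

def compose(a, b):
    """Composite of a and b, or None when undefined."""
    return (a[0], b[1]) if right(a) == left(b) else None

def clamp(a):
    """Candidate functor: cap both ends of an element at 1."""
    return (min(a[0], 1), min(a[1], 1))

def pin_left_end(a):
    """Non-example: a total function on elements that ignores the structure."""
    return (0, a[1])

def is_functor(f, elements):
    """Check that f preserves left units, right units, and defined composites."""
    for a in elements:
        if f(left(a)) != left(f(a)) or f(right(a)) != right(f(a)):
            return False
    for a, b in product(elements, repeat=2):
        ab = compose(a, b)
        if ab is not None and f(ab) != compose(f(a), f(b)):
            return False
    return True

CLAMP_IS_FUNCTOR = is_functor(clamp, ELEMENTS)
PIN_IS_FUNCTOR = is_functor(pin_left_end, ELEMENTS)
```

Here `clamp` passes, while `pin_left_end` fails already on left units: it sends the unit `(1, 1)` to `(0, 1)`, which is not a unit at all.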
What’s a Natural Transformation, and What’s So “Natural” About It?
Ok, we’ve got categories, and we’ve got functors: it’s time for the main event! So what is a natural transformation? As with the last sections, I am simply going to give an abstract definition, without concern about its utility, only that it be equivalent to the standard one. Here’s where things get different.
Suppose you have two categories,
By my definition, a natural transformation is a fundamentally
different kind of mapping from elements of
A natural transformation
Now here’s the punchline. Suppose
This property deserves some pondering to consider how wild and
constraining it is! Consider our element
The naturality of natural transformations is often
described according to a form of “simultaneous local coherence”
properties in terms of only the analogue of unit elements (so-called
“objects”). With some reasoning, we can show that a natural
transformation of the kind I describe above can be fully determined by
its treatment of unit elements, using a few “local” relationships about
rotating which unit element gets mapped by
Naturality touches every element of the source category, and every description of every element of the source category. I think the path equation in particular underscores the extent to which a natural transformation is like a self-supporting arch: all of the pieces must fit together like clockwork to support the property. There can exist multiple natural transformations subject to two functors, but each one manifests a category-wide equilibrium that, at least intuitively speaking, would be difficult to imagine perturbing. Arches are beautiful and fascinating naturally-occurring phenomena. Their balance defies our intuitions and inspires humans to recreate that beauty in their own inventions. The same seems to be true for natural transformations: their structure was at first discovered, rather than explicitly developed, but now they are devised with full awareness by informed minds.
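Here is a small Python sketch of a natural transformation as a mapping on all elements rather than only on units. The setting is entirely assumed for illustration: the toy category is the ordering 0 ≤ 1 ≤ 2, `F` caps both ends of an element at 1, `G` is the identity functor, and `alpha` is a candidate transformation between them. The equations checked follow the usual arrows-only presentation, in which the image of a composite can be computed by applying `F` to the first part or `G` to the second; I am assuming this matches the formulation above.

```python
from itertools import product

# Hypothetical toy category: one element (i, j) for each i <= j in {0, 1, 2}.
ELEMENTS = [(i, j) for i in range(3) for j in range(3) if i <= j]

def left(a):
    return (a[0], a[0])

def right(a):
    return (a[1], a[1])

def compose(a, b):
    """Composite of a and b, or None when undefined."""
    return (a[0], b[1]) if right(a) == left(b) else None

def F(a):
    """Functor that caps both ends of an element at 1."""
    return (min(a[0], 1), min(a[1], 1))

def G(a):
    """Identity functor."""
    return a

def alpha(a):
    """Candidate transformation from F to G, defined on every element."""
    return (min(a[0], 1), a[1])

def is_natural(t, f, g, elements):
    for a in elements:
        # The image starts where f sends the left unit and ends where g sends the right unit.
        if left(t(a)) != f(left(a)) or right(t(a)) != g(right(a)):
            return False
    for a, b in product(elements, repeat=2):
        ab = compose(a, b)
        if ab is None:
            continue
        # Any split of a composite yields the same image.
        if not (t(ab) == compose(f(a), t(b)) == compose(t(a), g(b))):
            return False
    return True

ALPHA_IS_NATURAL = is_natural(alpha, F, G, ELEMENTS)
IDENTITY_IS_NATURAL = is_natural(G, F, G, ELEMENTS)
```

The familiar components are recovered by restricting `alpha` to unit elements, for example `alpha((2, 2)) == (1, 2)`. Because this toy category has at most one element with a given left and right, the equations hold almost automatically once the endpoints line up; a richer category would constrain `alpha` far more.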
A reasonable question to ask: is this global formulation of natural transformations useful? To be honest I don’t know. The original motivations for natural transformations (natural equivalence) were fundamentally based on abstracting constructions on sets, and this same facet manifests in discussions of parametric polymorphic functions. Perhaps parametric polymorphic functions can be reconceived as a different form of “liftings-plus-embeddings” applied to arbitrary functions, subject to two (functorial) type constructors, rather than as a “generic function” applied to individual types (i.e., sets of values, independent of all possible mappings from them). This conception might make parametric reasoning properties more explicit. But I haven’t given this much thought (yet, I hope). I can only speculate at this point.
Thanks to Yanze Li for typo fixes!
In this quote, “transformation” refers to structure-preserving mappings between vector spaces, not natural transformations.↩︎
Well actually, Eilenberg and Mac Lane have a 1942 article that identifies the concept of naturality, where they focus on group theory. My bad!↩︎
Lucky for me I’m not alone. Here are the statisticians Bruno de Finetti and L. J. Savage on this matter: “Every progress in scientific thought involves struggle against the distortion of outlook generated by the habits of ordinary language, the deficiencies and captiousness of which must be denounced, combatted, and overcome, case by case.”↩︎
Freyd and Scedrov also begin their book Categories, Allegories with an analogous definition of categories, though on a cursory look, they seem to revert to a standard characterization shortly thereafter (thanks to Andrej Bauer for that information!).↩︎