## A Brief Introduction to Categories, Part 3: Natural Transformations and Equivalences

This post is a continuation of a mini-series on category theory, starting with this post. As before, knowing the background for every single example is not required for understanding the overall concepts.

## Introduction

Previously, we have defined categories and the morphisms between them. In the case that one has two parallel functors $F,G:\mathcal C \to \mathcal D$, between two categories $\mathcal C$ and $\mathcal D$, it is frequently the case that for any object $x \in \mathrm{Obj}(\mathcal C)$, one has a morphism $F(x) \to G(x)$. In this situation, the notion of a natural transformation conceptualizes what it means for this collection of morphisms to vary “naturally” with the object $x$, leading to a concept essential to all of category theory. Using this concept, we can define a coarser notion of equivalences than the obvious one of isomorphic categories.

## Natural Transformations

Definition C3.1 Let $\mathcal C$ and $\mathcal D$ be categories and let $F,G:\mathcal C \to \mathcal D$ be functors. Then a natural transformation $\eta:F \Rightarrow G$ consists of the following data:

• For every object $x \in \mathcal C$ a morphism in $\mathcal D$, $\eta_x:F(x) \to G(x)$ called the component of $\eta$ at $x$.

Such that the following condition holds:

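• For all objects $x,y \in \mathcal C$ and every morphism $f:x \to y$, the following diagram commutes:
$$\begin{array}{ccc} F(x) & \xrightarrow{\;F(f)\;} & F(y) \\ \downarrow{\scriptstyle \eta_x} & & \downarrow{\scriptstyle \eta_y} \\ G(x) & \xrightarrow{\;G(f)\;} & G(y) \end{array}$$
which means that $\eta_y \circ F(f)=G(f) \circ \eta_x$.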

One can think of the naturality condition as saying that $\eta$ “interpolates” between $F$ and $G$.
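As a rough illustration of the naturality condition, here is a small Python sketch. It is not part of the formal development: Python functions stand in for morphisms, `list_map` and `optional_map` are hypothetical names for the action of a "list" functor $F$ and an "optional value" functor $G$ on morphisms, and `None` is assumed never to occur as an actual value. The family `head` passes the naturality check on the samples, while `smallest` fails it.

```python
def list_map(f):
    """Action of the list functor F on a morphism f."""
    return lambda xs: [f(x) for x in xs]

def optional_map(f):
    """Action of the 'optional' functor G on f; None means 'no value'."""
    return lambda m: None if m is None else f(m)

def head(xs):
    """Candidate component: the first element, or None for an empty list."""
    return xs[0] if xs else None

def smallest(xs):
    """Another candidate component: the minimum, or None for an empty list."""
    return min(xs) if xs else None

def naturality_holds(eta, f, samples):
    """Check eta . F(f) == G(f) . eta on the given sample lists only."""
    return all(eta(list_map(f)(xs)) == optional_map(f)(eta(xs)) for xs in samples)

samples = [[], [1], [2, 3, 5], [-4, 0]]

print(naturality_holds(head, lambda n: n * n, samples))   # True
print(naturality_holds(head, str, samples))               # True
print(naturality_holds(smallest, lambda n: -n, samples))  # False
```

A finite check like this can only refute naturality, not prove it. The failure of `smallest` reflects that taking a minimum depends on the order of the elements, which an arbitrary map $f$ need not preserve.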

Definition C3.2 If all the components of a natural transformation $\eta:F \Rightarrow G$ are isomorphisms, $\eta$ is called a natural isomorphism and $F$ and $G$ are called naturally isomorphic.

Example C3.3 Let $G$ be a group and $X$ and $Y$ be $G$-sets, considered as functors $G \to \mathbf{Set}$. Then a natural transformation $X \Rightarrow Y$ has only one component, as $G$ has only one object, so it is given by a map of the underlying sets $f:X \to Y$ satisfying a naturality condition. This condition requires that we have for any $g \in G$ (i.e. for any morphism in $G$) $f \circ g = g \circ f$, (by abuse of notation, we write $g$ for its action on $X$ and $Y$), so that natural transformations correspond precisely to $G$-equivariant maps.
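The equivariance condition can be tested by brute force on a small example. The following sketch makes its own choices, none of which come from the text above: the group is $\mathbb Z/4$, acting on $X=\{0,1,2,3\}$ by translation modulo $4$ and on $Y=\{0,1\}$ by translation modulo $2$.

```python
GROUP = range(4)   # elements of Z/4
X = range(4)       # underlying set of the first G-set

def act_X(g, x):
    """Z/4 acting on X by translation mod 4."""
    return (g + x) % 4

def act_Y(g, y):
    """Z/4 acting on Y = {0, 1} by translation mod 2."""
    return (g + y) % 2

def is_equivariant(f):
    """Check f(g . x) == g . f(x) for all g in the group and x in X."""
    return all(f(act_X(g, x)) == act_Y(g, f(x)) for g in GROUP for x in X)

def parity(x):
    """Reduction mod 2; compatible with both translations."""
    return x % 2

def upper_half(x):
    """Indicator of {2, 3}; a map of sets that ignores the action."""
    return 0 if x < 2 else 1

print(is_equivariant(parity))      # True
print(is_equivariant(upper_half))  # False
```

Only `parity` is a natural transformation between the two functors $G \to \mathbf{Set}$; `upper_half` is a perfectly good map of sets that fails the single naturality square for $g=1$.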

Example C3.4 Reasoning as in the last example, one concludes that if we have a group $G$ and a field $k$, then natural transformations between two representations, considered as functors $G \to k\mathrm{-Vect}$, correspond to morphisms of representations.

Example C3.5 Let $\mathbf{CRing}$ be the category of commutative and unital rings with ring homomorphisms and $\mathbf{Mon}$ be the category of monoids. Fix $n \in \mathbb N$. Let $F:\mathbf{CRing} \to \mathbf{Mon}$ be the functor that sends a commutative ring to the multiplicative monoid of the matrix ring: $F(R)=(M_n(R), \cdot)$. Every ring homomorphism $f: R\to S$ induces a monoid homomorphism $F(f): M_n(R) \to M_n(S)$ of multiplicative monoids by applying $f$ to every entry (this is in fact a ring homomorphism of the matrix rings). Let $G:\mathbf{CRing} \to \mathbf{Mon}$ be the “forgetful” functor that forgets about addition and sends a commutative ring to its multiplicative monoid. Every ring homomorphism respects multiplication and the unit element, so that it induces a monoid homomorphism on the multiplicative monoids. The determinant is multiplicative and sends the identity matrix to $1$, so it defines a monoid homomorphism $\det_R:F(R) \to G(R)$ for every commutative ring $R$. Moreover, it is defined by a polynomial in the matrix entries with coefficients in $\Bbb Z$. Hence it doesn’t matter if we apply the ring homomorphism $f:R \to S$ entrywise to a square matrix and then take the determinant or if we take the determinant first and then apply $f$ to the result. In other words, $\det_S \circ F(f)=G(f) \circ \det_R$, stating that the determinant is natural in $R$.
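One instance of this naturality square can be checked numerically. The sketch below assumes the ring homomorphism is reduction $\mathbb Z \to \mathbb Z/5$ and writes the square as $\det_S \circ F(f) = G(f) \circ \det_R$, where $\det_R$ denotes the determinant on $M_n(R)$. The matrix and the modulus are arbitrary choices made for illustration.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (exact integer arithmetic)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

K = 5  # modulus; the ring homomorphism f is reduction Z -> Z/K

def f(a):
    """The ring homomorphism Z -> Z/K, with classes represented by 0..K-1."""
    return a % K

def F_of_f(m):
    """F(f): apply f to every entry of the matrix."""
    return [[f(a) for a in row] for row in m]

def det_mod(m):
    """Determinant computed in Z/K for a matrix of representatives."""
    return det(m) % K

M = [[2, -1, 7],
     [4, 3, 0],
     [5, 6, -8]]

lhs = det_mod(F_of_f(M))  # reduce the entries first, then take the determinant
rhs = f(det(M))           # take the determinant first, then reduce
print(lhs == rhs)         # True
```

Both paths around the square give the same class modulo $5$, as the polynomial nature of the determinant predicts.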

## Natural Transformations as Homotopies

Reminder/Definition Consider two topological spaces $X$ and $Y$ and let $f:X \to Y$ and $g:X \to Y$ be continuous maps, then a homotopy $H$ between $f$ and $g$ is a continuous map $H:[0,1] \times X \to Y$ such that $H(0,x)=f(x)$ for all $x \in X$ and $H(1,x)=g(x)$ for all $x \in X$. If there exists such a homotopy, $f$ and $g$ are called homotopic, denoted by $f \simeq g$.

One thinks of a homotopy as a continuous deformation or an interpolation of $f$ to $g$, and of the first parameter from $[0,1]$ as a time parameter. The definition states that $H$ behaves on $\{0\} \times X \cong X$ like $f$ and on $\{1\} \times X \cong X$ like $g$. Continuity implies that it is a continuous deformation of $f$ to $g$.

If we think of categories as the objects of study in category theory and functors as the morphisms between them, then natural transformations interpolate between morphisms between the objects. This situation might be reminiscent of the situation in topology: objects in topology are topological spaces, the morphisms between them are continuous functions and homotopies interpolate between continuous functions. We will see in this section that this analogy can be made even closer than this vague similarity by defining the analogue of products and the unit interval for categories.

Definition C3.6 The arrow category or interval category $I$ is the following category: we have two objects $\mathrm{Obj}(I)=\{0,1\}$ and three morphisms: the two identities and a unique morphism $\iota:0 \to 1$. One easily observes that there is only one possible way to define composition which makes this into a category.

This category plays a role for natural transformations analogous to the role of the unit interval for homotopies.

Definition C3.7 Let $\mathcal C$ and $\mathcal D$ be categories, then the product category $\mathcal{C} \times \mathcal{D}$ consists of the following data:

• $\mathrm{Obj}(\mathcal C \times \mathcal D)=\mathrm{Obj}(\mathcal C) \times \mathrm{Obj}(\mathcal D)$
• For $X,X' \in \mathcal C$ and $Y,Y' \in \mathcal D$, we have $\mathrm{Hom}_{\mathcal C \times \mathcal D}((X,Y),(X',Y'))=\mathrm{Hom}_{\mathcal C}(X,X') \times \mathrm{Hom}_{\mathcal D}(Y,Y')$
• Composition is defined componentwise.

Using these definitions, we can define the analogue of a homotopy: let $\mathcal C$ and $\mathcal D$ be categories and let $F:\mathcal C \to \mathcal D$ and $G:\mathcal C \to \mathcal D$ be functors. Then a “categorical homotopy” from $F$ to $G$ is a functor $H:I \times \mathcal C \to \mathcal D$ such that $H(0,x)=F(x)$ and $H(1,x)=G(x)$ for all $x \in \mathrm{Obj}(\mathcal C)$, and such that for all morphisms $f:x \to x'$ in $\mathcal C$, we have $H(\mathrm{id}_0,f)=F(f)$ and $H(\mathrm{id}_1,f)=G(f)$. This means that on the subcategory $\{0\} \times \mathcal C \cong \mathcal C$, $H$ behaves like $F$ and on the subcategory $\{1\} \times \mathcal C \cong \mathcal C$, $H$ behaves like $G$.

Proposition C3.8 Let $\mathcal C,\mathcal D$ be categories and let $F,G:\mathcal C \to \mathcal D$ be functors, then natural transformations $\eta:F \Rightarrow G$ correspond bijectively to categorical homotopies $H:I \times \mathcal C \to \mathcal D$ from $F$ to $G$.

Proof Let $\eta:F \Rightarrow G$ be a natural transformation, then we define a categorical homotopy $H_\eta: I \times \mathcal C \to \mathcal D$ as follows:

• For all $x \in \mathrm{Obj}(\mathcal C)$, let $H_\eta(0,x)=F(x)$ and $H_\eta(1,x)=G(x)$.
• For all morphisms $f:x \to y$ in $\mathcal C$, let $H_{\eta}( \mathrm{id}_0,f)=F(f)$ and let $H_{\eta}(\mathrm{id}_1,f)=G(f)$ and let $H_{\eta}(\iota,f)=G(f) \circ \eta_x=\eta_y \circ F(f)$ (the last equality holds by naturality of $\eta$.)

Let us check that this is indeed a functor. It clearly respects identities. The only non-obvious cases for compatibility with composition involve $\iota$ in the first component. We have for morphisms $f:x \to y$ and $g:y \to z$ in $\mathcal C$,
$H_{\eta}((\iota,g) \circ (\mathrm{id}_0,f))=H_{\eta}(\iota,g \circ f)= G(g \circ f) \circ \eta_x= G(g) \circ G(f) \circ \eta_x=(G(g) \circ \eta_y) \circ F(f)=H_{\eta}(\iota,g) \circ H_{\eta}(\mathrm{id}_0,f)$
In the other case we have
$H_{\eta}((\mathrm{id}_1,g) \circ (\iota,f))=H_{\eta}(\iota,g \circ f)=\eta_z \circ F(g \circ f)=\eta_z \circ F(g) \circ F(f)= G(g) \circ (\eta_y \circ F(f))=H_{\eta}(\mathrm{id}_1,g) \circ H_{\eta}(\iota,f)$
so that $H_{\eta}:I \times \mathcal C \to \mathcal D$ is indeed a functor. It is clear from the construction that $H_{\eta}$ is a categorical homotopy from $F$ to $G$, completing one half of the correspondence. For the other direction, let $H$ be a categorical homotopy from $F$ to $G$. To obtain a natural transformation, set $\eta_x=H(\iota,\mathrm{id}_x)$ for all $x \in \mathrm{Obj}(\mathcal C)$. Naturality now follows from functoriality of $H$, as for any morphism $f:x \to y$ in $\mathcal C$, we have:
$G(f) \circ \eta_x = H(\mathrm{id}_1,f) \circ H(\iota,\mathrm{id}_x)=H((\mathrm{id}_1,f) \circ (\iota,\mathrm{id}_x)) = H(\iota,f)=H((\iota,\mathrm{id}_y) \circ (\mathrm{id}_0,f))=H(\iota,\mathrm{id}_y) \circ H(\mathrm{id}_0,f)=\eta_y \circ F(f)$
Using the same computations, one checks that those two constructions are inverse to each other.
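The construction of $H_\eta$ in the proof can be mimicked in Python as a sanity check. This is only a sketch under assumptions of our own: functions stand in for morphisms, the strings `"id0"`, `"id1"` and `"iota"` are hypothetical labels for the three morphisms of $I$, and $F$, $G$, $\eta$ are taken to be a list functor, an "optional" functor and the first-element map. Equality of functions is only tested on a few sample inputs.

```python
def compose(g, f):
    """The composite g after f."""
    return lambda x: g(f(x))

def identity(x):
    return x

def F(f):
    """List functor on morphisms."""
    return lambda xs: [f(x) for x in xs]

def G(f):
    """'Optional' functor on morphisms; None means 'no value'."""
    return lambda m: None if m is None else f(m)

def eta(xs):
    """Component of the natural transformation F => G: first element or None."""
    return xs[0] if xs else None

def H(arrow, f):
    """H_eta on a morphism (arrow, f) of I x C, following the proof."""
    if arrow == "id0":
        return F(f)
    if arrow == "id1":
        return G(f)
    if arrow == "iota":
        return compose(G(f), eta)
    raise ValueError("unknown morphism of I: " + arrow)

SAMPLES = [[], [3], [1, 2]]

def agree(p, q):
    """Compare two functions on the sample inputs only."""
    return all(p(xs) == q(xs) for xs in SAMPLES)

f = lambda n: n + 1
g = lambda n: 2 * n

# (iota, g) after (id0, f) equals (iota, g after f) in I x C
case_one = agree(H("iota", compose(g, f)), compose(H("iota", g), H("id0", f)))
# (id1, g) after (iota, f) equals (iota, g after f) in I x C
case_two = agree(H("iota", compose(g, f)), compose(H("id1", g), H("iota", f)))
# the other direction of the correspondence: eta_x = H(iota, id_x)
recovered = agree(H("iota", identity), eta)

print(case_one, case_two, recovered)  # True True True
```

The two composition checks correspond to the two non-obvious cases treated in the proof, and the last check recovers $\eta$ from $H_\eta$.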

## Equivalences of Categories and Related Properties

There’s an obvious notion of isomorphisms for categories: a functor $F:\mathcal C \to \mathcal D$ is an isomorphism if there is a functor $G:\mathcal D \to \mathcal C$ such that $G\circ F$ and $F \circ G$ are the identity functors on $\mathcal C$ and $\mathcal D$, respectively. For many purposes, however, this notion is too strict and a coarser notion is much more useful. We can let the analogy between natural transformations and homotopies guide us.

Reminder/Definition A continuous map $f:X \to Y$ between topological spaces $X$ and $Y$ is called a homotopy equivalence if there is a continuous map $g:Y \to X$ such that $g \circ f$ and $f \circ g$ are homotopic to the identities on $X$ and $Y$, respectively.

Example The $n$-dimensional sphere is homotopy equivalent to $\mathbb{R}^{n+1} \setminus \{0\}$.
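For concreteness, one possible choice of maps and homotopy in this example (a standard one, spelled out here as a worked sketch) is the inclusion $f$ and the normalization $g$:

$$f:S^n \to \mathbb{R}^{n+1} \setminus \{0\},\quad f(x)=x, \qquad g:\mathbb{R}^{n+1} \setminus \{0\} \to S^n,\quad g(y)=\frac{y}{\lVert y \rVert}.$$

Since $\lVert x \rVert = 1$ for $x \in S^n$, we have $g \circ f = \mathrm{id}_{S^n}$ on the nose. In the other direction, a straight-line homotopy does the job:

$$H(t,y)=(1-t)\,y+t\,\frac{y}{\lVert y \rVert}=\Big((1-t)+\frac{t}{\lVert y \rVert}\Big)\,y, \qquad H(0,y)=y,\quad H(1,y)=f(g(y)).$$

The scalar factor is strictly positive for all $t \in [0,1]$, so $H$ never takes the value $0$ and is a homotopy from the identity on $\mathbb{R}^{n+1} \setminus \{0\}$ to $f \circ g$.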

This leads “naturally” to the following definition:

Definition C3.9 A functor $F:\mathcal C \to \mathcal D$ between two categories $\mathcal C$ and $\mathcal D$ is called an equivalence of categories if there is a functor $G:\mathcal D \to \mathcal C$ such that $G \circ F$ and $F \circ G$ are naturally isomorphic to the identity functor on $\mathcal C$ and on $\mathcal D$, respectively.

It is sometimes easier to verify some sufficient (and necessary) conditions for a functor to be an equivalence than to use the definition directly by constructing a functor in the other direction. We shall now define those properties, which are important in their own right.

Definitions C3.10 A functor $F:\mathcal C \to \mathcal D$ between two categories $\mathcal C$ and $\mathcal D$ is called

• faithful, if for all $x,y \in \mathrm{Obj}(\mathcal C)$, the induced map $\mathrm{Hom}_{\mathcal C}(x,y) \to \mathrm{Hom}_{\mathcal D}(F(x),F(y))$ is injective.
• full, if for all $x,y \in \mathrm{Obj}(\mathcal C)$, the induced map $\mathrm{Hom}_{\mathcal C}(x,y) \to \mathrm{Hom}_{\mathcal D}(F(x),F(y))$ is surjective.
• fully faithful, if for all $x,y \in \mathrm{Obj}(\mathcal C)$, the induced map $\mathrm{Hom}_{\mathcal C}(x,y) \to \mathrm{Hom}_{\mathcal D}(F(x),F(y))$ is bijective.
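To get a feeling for these properties, consider groups as one-object categories, so that a functor is a group homomorphism and there is a single Hom-set to inspect. The following sketch uses $\mathbb Z/4$ and $\mathbb Z/2$ as examples chosen for illustration, and the helper names are hypothetical. It checks injectivity and surjectivity of the induced map directly.

```python
Z2 = [0, 1]        # morphisms of the one-object category Z/2
Z4 = [0, 1, 2, 3]  # morphisms of the one-object category Z/4

def is_faithful(source_hom, on_morphisms):
    """Injectivity of the induced map on the single Hom-set."""
    images = [on_morphisms(m) for m in source_hom]
    return len(set(images)) == len(images)

def is_full(source_hom, target_hom, on_morphisms):
    """Surjectivity of the induced map on the single Hom-set."""
    return {on_morphisms(m) for m in source_hom} == set(target_hom)

def reduce_mod_2(a):
    """The homomorphism Z/4 -> Z/2 given by reduction mod 2."""
    return a % 2

def double(a):
    """The homomorphism Z/2 -> Z/4 given by a |-> 2a."""
    return (2 * a) % 4

print(is_full(Z4, Z2, reduce_mod_2), is_faithful(Z4, reduce_mod_2))  # True False
print(is_full(Z2, Z4, double), is_faithful(Z2, double))              # False True
```

So reduction modulo $2$ is full but not faithful, while doubling is faithful but not full. For categories with several objects, the same test would have to be run on every Hom-set separately.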

We state and prove a useful property of fully faithful functors:

Lemma C3.11 Let $F:\mathcal C \to \mathcal D$ be a fully faithful functor, then $F$ reflects isomorphisms in the following sense: if $x,y \in \mathrm{Obj}(\mathcal C)$ are two objects such that $F(x)$ and $F(y)$ are isomorphic in $\mathcal D$, then $x$ and $y$ are isomorphic in $\mathcal C$.

Proof Take an isomorphism $f':F(x) \to F(y)$ with inverse $g':F(y) \to F(x)$, then there are some $f:x \to y, g:y \to x$ such that $F(f)=f'$ and $F(g)=g'$, as the induced maps on $\mathrm{Hom}$-sets are surjective. We get $F(f \circ g)=F(f) \circ F(g)=f' \circ g'=\mathrm{id}_{F(y)}=F(\mathrm{id}_y)$, so $f \circ g = \mathrm{id}_y$ as $F$ is faithful. By symmetry, $g \circ f = \mathrm{id}_x$, so $f$ is an isomorphism with inverse $g$.

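Lemma C3.12 Natural isomorphisms of functors preserve fullness, faithfulness and full faithfulness: if $F$ and $G$ are naturally isomorphic functors, then $F$ has one of these properties if and only if $G$ does.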

Proof Let $F,G: \mathcal C \to \mathcal D$ be functors and let $\eta:F \Rightarrow G$ be a natural isomorphism. One immediately sees that for any objects $x,y \in \mathrm{Obj}(\mathcal C)$ the map $\phi:\mathrm{Hom}_{\mathcal D}(F(x),F(y)) \to \mathrm{Hom}_{\mathcal D}(G(x),G(y)), \phi(h)=\eta_y \circ h \circ \eta_x^{-1}$ is a bijection with inverse given by $h \mapsto \eta_y^{-1} \circ h \circ \eta_x$.
By naturality, we have for any morphism $f:x \to y$ in $\mathcal C$ that $G(f) \circ \eta_x = \eta_y \circ F(f)$, which implies that $G(f)= \eta_y \circ F(f) \circ \eta_x^{-1}=\phi(F(f))$. From this equation, we conclude that the maps induced by $F$ and $G$, respectively, on Hom-sets are related by the bijection $\phi$, hence one is injective, surjective or bijective if and only if the other one is.

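Lemma C3.13 If $F:\mathcal C \to \mathcal D$ and $G:\mathcal D \to \mathcal E$ are functors such that $G \circ F$ is faithful, then $F$ is faithful. If $G \circ F$ is full and $G$ is faithful, then $F$ is full. If $G$ is fully faithful, then $G \circ F$ is full/faithful/fully faithful if and only if $F$ is.

Proof These assertions follow immediately from the corresponding elementary statements about injective, surjective and bijective maps, applied to the induced maps on Hom-sets. (E.g. if $g \circ f$ is surjective and $g$ is injective, then $f$ is surjective.) Note that fullness of $G \circ F$ alone does not imply that $G$ is full, as it only gives information about the Hom-sets between objects in the image of $F$.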

Corollary C3.14 Any equivalence of categories is fully faithful.

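Proof Let $F:\mathcal C \to \mathcal D$ be a functor and $G:\mathcal D \to \mathcal C$ be a functor such that $G \circ F$ is naturally isomorphic to the identity functor on $\mathcal C$ and $F \circ G$ is naturally isomorphic to the identity functor on $\mathcal D$. By lemma C3.12, $G \circ F$ and $F \circ G$ are fully faithful, as the identity functors are evidently fully faithful. By lemma C3.13, faithfulness of $G \circ F$ implies that $F$ is faithful, and likewise faithfulness of $F \circ G$ implies that $G$ is faithful. It remains to show that $F$ is full: let $h:F(x) \to F(y)$ be a morphism in $\mathcal D$. As $G \circ F$ is full, there is a morphism $f:x \to y$ in $\mathcal C$ with $G(F(f))=G(h)$, and faithfulness of $G$ yields $F(f)=h$. Hence $F$ is both full and faithful, i.e. fully faithful.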

Definition C3.15 A functor $F:\mathcal C \to \mathcal D$ is called essentially surjective if for every object $x \in \mathrm{Obj}(\mathcal D)$, there is an object $y \in \mathrm{Obj}(\mathcal C)$ such that $F(y)$ is isomorphic to $x$.

Lemma C3.16 An equivalence of categories is essentially surjective.

Proof Let $F$ be a functor $\mathcal C \to \mathcal D$ and $G$ be a functor $\mathcal D \to \mathcal C$ such that $F \circ G$ is naturally isomorphic via $\eta$ to the identity functor on $\mathcal D$. Then for any object $x \in \mathrm{Obj}(\mathcal D)$, we have an isomorphism $\eta_x:F(G(x)) \to x$, showing that $F$ is essentially surjective.

At this point, we have collected enough necessary properties of equivalences of categories to obtain a characterization that doesn’t rely on the existence of another functor, which constitutes the main result of this post.

Theorem C3.17 A functor is an equivalence of categories if and only if it is fully faithful and essentially surjective.

Proof Lemma C3.16 and corollary C3.14 furnish one half of the proof.
For the other half, let $F:\mathcal C \to \mathcal D$ be a fully faithful and essentially surjective functor. For any object $x \in \mathrm{Obj}(\mathcal D)$, choose an object $G(x) \in \mathrm{Obj}(\mathcal C)$ such that $F(G(x))$ is isomorphic to $x$ (which is possible, as $F$ is essentially surjective). Choose an isomorphism $\eta_x:x \to F(G(x))$. Let $f:x \to y$ be a morphism in $\mathcal D$. Then $\eta_y \circ f \circ \eta_x^{-1}$ is a morphism $F(G(x)) \to F(G(y))$. By full faithfulness of $F$, there is a unique morphism $G(f):G(x) \to G(y)$ such that $F(G(f))=\eta_y \circ f \circ \eta_x^{-1}$.
We need to check that these assignments make $G$ into a functor: let $f:x \to y$ and $g:y \to z$ be morphisms in $\mathcal D$, then from the definition of $G$, we obtain $F(G(g \circ f))=\eta_z \circ g \circ f \circ \eta_x^{-1}=(\eta_z \circ g \circ \eta_y^{-1}) \circ (\eta_y \circ f \circ \eta_x^{-1})=F(G(g)) \circ F(G(f))=F(G(g) \circ G(f))$, so that by faithfulness of $F$, $G(g \circ f)=G(g) \circ G(f)$. As for identities, we get for any object $x \in \mathrm{Obj}(\mathcal D)$, $F(G(\mathrm{id}_x))=\eta_x \circ \mathrm{id}_x \circ \eta_x^{-1}=\mathrm{id}_{F(G(x))}=F(\mathrm{id}_{G(x)})$ so that by faithfulness, $G(\mathrm{id}_x)=\mathrm{id}_{G(x)}$.
We now need to show that $G \circ F$ and $F \circ G$ are naturally isomorphic to the respective identities: The defining equation $F(G(f))=\eta_y \circ f \circ \eta_x^{-1}$ yields $F(G(f)) \circ \eta_x= \eta_y \circ f=\eta_y \circ \mathrm{Id}_{\mathcal D}(f)$ so that $\eta$ is a natural isomorphism from the identity functor on $\mathcal D$ to $F \circ G$.
To finish the proof, we have to show that $G \circ F$ is naturally isomorphic to the identity functor on $\mathcal C$. Since $\eta$ is a natural isomorphism from the identity functor on $\mathcal D$ to $F \circ G$, we get that $F$ is naturally isomorphic to $F \circ G \circ F$: for any object $x \in \mathcal C$ we have an isomorphism $\eta_{F(x)}:F(x) \to F(G(F(x)))$ such that for any morphism $f:x \to y$ in $\mathcal C$, we have $F(G(F(f))) \circ \eta_{F(x)}= \eta_{F(y)}\circ F(f)$. Now as $F$ is fully faithful, there exists for each object $x \in \mathcal C$ a unique morphism $\vartheta_x: x \to G(F(x))$ such that $F(\vartheta_x)=\eta_{F(x)}$. To show that $\vartheta$ is natural, insert the definition of $\vartheta$ into the previous equation, so that we get $F(G(F(f))) \circ F(\vartheta_x)=F(\vartheta_y) \circ F(f)$, which implies by functoriality $F(G(F(f)) \circ \vartheta_x)=F(\vartheta_y \circ f)$, so that by faithfulness, we get $G(F(f)) \circ \vartheta_x = \vartheta_y \circ f$, which means precisely that $\vartheta$ is a natural transformation from the identity functor on $\mathcal C$ to $G \circ F$. As $F(\vartheta_x)$ is an isomorphism for each $x$ and $F$ is fully faithful, one concludes that $\vartheta_x$ is an isomorphism. (If $F$ is a fully faithful functor and $F(f)$ is an isomorphism, then $f$ is an isomorphism, cf. the proof of C3.11.) This shows that $G \circ F$ is naturally isomorphic to the identity functor on $\mathcal C$, which concludes the proof and also this post.
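A tiny example shows how much coarser equivalence is than isomorphism. The sketch below is an illustration with made-up names, not taken from the proof: it encodes the terminal category (one object `*`) and the category `J` with two objects `a`, `b` and mutually inverse arrows `u`, `v`, and checks that the functor picking out `a` is fully faithful and essentially surjective, although it misses the object `b`.

```python
# Terminal category: one object "*" and only its identity.
T_objects = ["*"]
T_hom = {("*", "*"): ["id_*"]}

# Category J: two objects and mutually inverse arrows u: a -> b, v: b -> a.
J_objects = ["a", "b"]
J_hom = {("a", "a"): ["id_a"], ("a", "b"): ["u"],
         ("b", "a"): ["v"], ("b", "b"): ["id_b"]}
J_identity = {"a": "id_a", "b": "id_b"}
J_table = {("v", "u"): "id_a", ("u", "v"): "id_b"}  # (g, f) -> g after f

def J_compose(g, f):
    """Composite g after f in J, assuming the pair is composable."""
    if f in J_identity.values():
        return g
    if g in J_identity.values():
        return f
    return J_table[(g, f)]

def isomorphic_in_J(x, y):
    """True if some f: x -> y has a two-sided inverse g: y -> x."""
    return any(J_compose(g, f) == J_identity[x] and J_compose(f, g) == J_identity[y]
               for f in J_hom[(x, y)] for g in J_hom[(y, x)])

# The functor F: T -> J that picks out the object a.
F_obj = {"*": "a"}
F_mor = {"id_*": "id_a"}

def fully_faithful():
    """Each induced map on Hom-sets hits every target morphism exactly once."""
    return all(
        sorted(F_mor[m] for m in T_hom[(x, y)]) == sorted(J_hom[(F_obj[x], F_obj[y])])
        for x in T_objects for y in T_objects)

def essentially_surjective():
    """Every object of J is isomorphic to some F(c)."""
    return all(any(isomorphic_in_J(F_obj[c], d) for c in T_objects) for d in J_objects)

surjective_on_objects = set(F_obj.values()) == set(J_objects)

print(fully_faithful(), essentially_surjective(), surjective_on_objects)  # True True False
```

By Theorem C3.17 the functor is an equivalence, yet the two categories cannot be isomorphic since they have different numbers of objects.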

(Exercises and remarks for the stray logician or set theorist:
The above proof doesn’t work in ZFC, why? Give an example of an underlying set theory as a meta-theory such that the proof does indeed work.
Prove that the above proof does work in ZFC if one restricts the statement to small categories and that the statement for small categories is equivalent to choice over ZF.
In practice, set-theoretic issues are often ignored by those doing category theory, or one uses Tarski-Grothendieck set theory.)