A Brief Introduction to Categories, Part 1: Categories

The purpose of this post is to establish some basic language of categories. This can be read independently of the ongoing series on representation theory, although concepts and results from this post will be used in the representation theory series. So with respect to the representation theory series, one can think of this post (and the follow-up posts) as the analog of an appendix.

The notions of category theory can be encountered in many different mathematical contexts. To illustrate the concepts, a range of examples will be given, but I will go quickly over standard examples; for common examples treated in greater detail and number, see the references below. It is not necessary to know the background of every single example to grasp the concepts.

To readers who wish for a more detailed introduction, I recommend Tom Leinster’s “Basic Category Theory” (available as a free pdf, but also published in book form) or, in case you know German, Martin Brandenburg’s “Einführung in die Kategorientheorie”. If you simply cannot get enough, Francis Borceux’s three-volume Handbook of Categorical Algebra is the right place to get lost in the realm of categories and functors.

The views expressed on category theory are naturally (no pun intended) my own highly subjective ones and others may differ, especially those with a categorical disapproval of categorical things.

Introduction

The meta-mathematical notions of category theory provide a unifying framework for a lot of different mathematical theories. (Merely having a framework, however, can’t always replace delving into the specifics of the particular objects that one is interested in.) We can not only translate statements into categorical language, but we can also use categorical language to make things precise that might otherwise be vague, e.g. what it means for two different kinds of mathematical structures to be “equivalent” or what it means for a map to be “natural”.

Thinking categorically, we can economize on routine verifications by doing them in the utmost generality. More importantly, category theory can expose formal similarities between seemingly disconnected phenomena. These analogies might only become apparent through the high level of abstraction category-theoretic concepts provide, allowing one to transfer intuition, techniques and concepts in ways not available otherwise, at least not as easily or as precisely-defined.

Peter Freyd once wrote that “Perhaps the purpose of categorical algebra is to show that which is trivial is trivially trivial.”
Peter May commented: “I prefer an update of that quote: ‘Perhaps the purpose of categorical algebra is to show that which is formal is formally formal.’ It is by now abundantly clear that mathematics can be formal without being trivial.”

Now then! After this informal introduction, let us delve into the formal details of why that which is formal is formally formal.

Categories

Definition C1.1 A category $\mathcal{C}$ consists of the following data:

• A class of objects $\mathrm{Obj}(\mathcal{C})$
• For every pair of objects $X,Y \in \mathrm{Obj}(\mathcal{C})$, a set of morphisms $\mathrm{Hom}_{\mathcal{C}}(X,Y)$
• For every triple of objects $X,Y,Z \in \mathrm{Obj}(\mathcal{C})$, a map $\mathrm{Hom}_{\mathcal{C}}(Y,Z) \times \mathrm{Hom}_{\mathcal{C}}(X,Y) \to \mathrm{Hom}_{\mathcal{C}}(X,Z)$ called composition and denoted by $(f,g) \mapsto f \circ g$

Such that the following conditions hold:

• For any object $X \in \mathrm{Obj}(\mathcal{C})$, there exists an identity $\mathrm{id}_X \in \mathrm{Hom}_{\mathcal C}(X,X)$ such that for all objects $Y,Z \in \mathrm{Obj}(\mathcal{C})$ and all morphisms $f \in \mathrm{Hom}_{\mathcal{C}}(Y,X), g \in \mathrm{Hom}_{\mathcal{C}}(X,Z)$ we have $\mathrm{id}_X \circ f= f$ and $g \circ \mathrm{id}_X = g$
• Composition is associative, that means for all objects $X,Y,Z,W \in \mathrm{Obj}(\mathcal{C})$ and $f \in \mathrm{Hom}_{\mathcal{C}}(X,Y), g \in \mathrm{Hom}_{\mathcal{C}}(Y,Z), h \in \mathrm{Hom}_{\mathcal{C}}(Z,W)$, we have $h \circ (g \circ f) = (h \circ g) \circ f$
• $\mathrm{Hom}_{\mathcal{C}}(X,Y) \cap \mathrm{Hom}_{\mathcal{C}}(X',Y') = \varnothing$ unless $X=X'$ and $Y=Y'$

Remark The last condition is just a technical one that is not very interesting and is mostly omitted from verifications that things are categories. You can always replace the Hom-sets by disjoint copies by set-theoretic juggling, so it’s not much of an issue. The advantage of this convention is that source and target of a morphism are uniquely determined.

One should think of a category as a big directed graph in which objects are represented by nodes and morphisms are represented by arrows and where we have a special operation called composition that takes two consecutive arrows and produces a third arrow representing going along one arrow first and then along the other one.
Compared with set theory, which can be called “materialistic” in the sense that it is about what objects are actually made of, category theory can be considered “behavioristic” in the sense that it is more about how objects behave, i.e. relate to each other. Relations between objects are conceptualized by the arrows, also known as morphisms. But for morphisms, just like for the objects themselves, it’s more about how they relate to each other and less about what they actually represent. The relations between morphisms are given by composition.
From a linguistic point of view, category theory is purely syntax, whereas semantics come into play only in the interpretation of particular examples.

Example C1.2 We can form the category of sets $\mathbf{Set}$, where $\mathrm{Obj}(\mathbf{Set})$ is the class of all sets and for $X,Y \in \mathrm{Obj}(\mathbf{Set})$, $\mathrm{Hom}_{\mathbf{Set}}(X,Y)$ is the set of all maps from $X$ to $Y$, and the composition of morphisms is just the usual composition of maps. Many other examples (but not all of them) can be formed as “subcategories” of this example in the sense that the objects are sets with additional structure and morphisms are mappings which preserve that structure in some way, such as topological spaces and continuous maps, or algebraic structures of a specific type and homomorphisms, or smooth manifolds and smooth maps, or complex manifolds and holomorphic maps, or varieties over an algebraically closed field and regular maps, etc.

Example C1.3 There is another category with the class of all sets as objects, but with more morphisms. Recall that a mapping is a particular kind of relation. $\mathbf{Rel}$ is the category consisting of all sets as objects, and for two objects $X,Y \in \mathrm{Obj}(\mathbf{Rel})$, $\mathrm{Hom}_{\mathbf{Rel}}(X,Y)$ is the set of all relations from $X$ to $Y$, i.e. subsets $R \subset X \times Y$. Given three sets $X,Y,Z$ and relations $R \subset X \times Y, S \subset Y \times Z$, we can form the composite relation $S \circ R := \{(x,z) \in X \times Z \mid \exists y \in Y: ((x,y) \in R \land (y,z) \in S) \}$. (If $R$ and $S$ are functions, this reduces to regular function composition.) With this composition, $\mathbf{Rel}$ satisfies the axioms for a category.
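The composition of relations can be checked mechanically. Here is a minimal Python sketch (the function names and the small sample relations are my own choices, not from the text) verifying the identity and associativity axioms of Definition C1.1 for a few concrete relations:

```python
def compose(S, R):
    """Composite of relations R ⊂ X×Y and S ⊂ Y×Z, as in Example C1.3."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def identity(X):
    """The identity relation on X: the diagonal {(x, x)}."""
    return {(x, x) for x in X}

X, Y, Z, W = {1, 2}, {'a', 'b'}, {True, False}, {0}
R = {(1, 'a'), (2, 'a'), (2, 'b')}   # R ⊂ X×Y
S = {('a', True), ('b', False)}      # S ⊂ Y×Z
T = {(True, 0)}                      # T ⊂ Z×W

# Associativity and the identity axioms from Definition C1.1:
assert compose(T, compose(S, R)) == compose(compose(T, S), R)
assert compose(S, identity(Y)) == S == compose(identity(Z), S)

print(sorted(compose(S, R)))  # [(1, True), (2, False), (2, True)]
```

Note that $R$ above is not a function ($2$ is related to both $'a'$ and $'b'$), which is exactly the extra freedom $\mathbf{Rel}$ has over $\mathbf{Set}$.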

Example C1.4 In the previous examples of categories, the objects were given by the class of all sets. But there’s also the possibility to fix a particular set $X$ and consider $X$ as a category $\mathcal{C}$ in the following way: $\mathrm{Obj}(\mathcal{C})=X$ and the only morphisms are the identity morphisms required by the definition of categories. Due to the striking similarity with discrete topological spaces (compare for instance the sets of morphisms $\mathrm{Hom}_{\mathcal{C}}(x,y)=\varnothing$ if $x \neq y$ and $\mathrm{Hom}_{\mathcal{C}}(x,x)=\{\mathrm{id}_x\}$ with the discrete metric), this construction is called the discrete category on the set $X$.

Example C1.5 Instead of only allowing the minimum number of morphisms, we can take a set $X$ and consider categories with $X$ as their class of objects that have at most one morphism between any two objects. In such a case, the only interesting information is whether there is a morphism between two objects or not. Consequently, such a category corresponds to a relation $R \subset X \times X$. But the requirements for compositions and identities force a relation to have some properties if it is supposed to define a category in this manner: the existence of identity morphisms in the supposed category is equivalent to the relation being reflexive, and the existence of compositions translates to transitivity. A reflexive and transitive relation is called a “preorder”. For instance, all equivalence relations and all partial orderings are preorders. An example of a preorder that is neither can be formed by taking a ring, say, an integral domain $A$ such as $\mathbb{Z}$, and considering the divisibility relation on elements of $A$. Here the failure to be a partial order is due to the existence of non-trivial units and the failure to be an equivalence relation is due to the existence of non-invertible elements.
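The divisibility example can be checked on a small sample. A short Python sketch (the helper `divides` and the sample range are my own choices; I adopt the convention $0 \mid 0$):

```python
def divides(a, b):
    """a | b in the integers, with the convention that 0 | 0."""
    return b % a == 0 if a != 0 else b == 0

sample = range(-6, 7)

# Reflexivity and transitivity: divisibility is a preorder (Example C1.5).
assert all(divides(a, a) for a in sample)
assert all(divides(a, c)
           for a in sample for b in sample for c in sample
           if divides(a, b) and divides(b, c))

# Not antisymmetric (so not a partial order): the unit -1 gives 1 | -1 and -1 | 1.
assert divides(1, -1) and divides(-1, 1) and 1 != -1
# Not symmetric (so not an equivalence relation): 2 | 4 but 4 does not divide 2.
assert divides(2, 4) and not divides(4, 2)
print("divisibility on Z is a preorder, but neither a partial order nor an equivalence")
```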

Definition C1.6 Speaking of invertible things, just like for an element in a ring, we can define when a morphism in a category is invertible. Let $f:X \to Y$ be a morphism in a category. Then $f$ is called invertible, or an isomorphism, if there is a morphism $g:Y \to X$ such that $g \circ f = \mathrm{id}_X$ and $f \circ g = \mathrm{id}_Y$. If there is such a morphism, then $X$ and $Y$ are called isomorphic. Note that the notations $f: X \to Y$ and $g \circ f$, albeit reminiscent of the situation in the category $\mathbf{Set}$, don’t necessarily refer to mappings or function composition, but to the particular morphisms and composition in an arbitrary category.

Example C1.7 Other than restricting the size of the Hom-sets as in example C1.5, one can also restrict the size of the class of objects. For this, let’s consider categories with only one object. In general, if we consider composition as a function on all morphisms of a category, it is only partially defined, since source and target need to match up correctly. But this issue doesn’t arise if we have only one object. Let $\mathcal C$ be a category with one object $x$ and let $\mathrm{Hom}_{\mathcal C}(x,x)=M$. Then composition is a (totally defined) mapping $M \times M \to M$, i.e. a binary operation.
The existence of the identity $\mathrm{id}_x:=e$ implies that $e$ is a unit with respect to this binary operation and the associativity of the composition implies that the binary operation is, well, associative. Thus composition $\circ:M \times M \to M$ makes $M$ into a monoid.
Conversely, given a monoid $M$, we can turn $M$ into a one-object category by taking a proxy object $x$ with no particular meaning, setting $\mathrm{Hom}(x,x)=M$ and defining the composition via the monoid operation. These constructions are inverse to each other, so that monoids correspond precisely to one-object categories.
A particularly frequently occurring and much beloved class of monoids consists of groups, i.e. monoids in which every element is invertible. By a pleasant convergence of terminology, an element in a monoid being invertible (in the classical sense) is the same as it being an invertible morphism if we consider the monoid as a one-object category. Thus we can say that a monoid is a group if and only if every morphism is an isomorphism. This leads to the following notion:
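To illustrate the monoid-as-category viewpoint concretely, here is a small Python sketch using the multiplicative monoid $\mathbb{Z}/6$ (my choice of example): composition is multiplication mod 6, the identity morphism is $1$, and the isomorphisms are exactly the units.

```python
# The multiplicative monoid Z/6 viewed as Hom(x, x) of a one-object category:
# composition is multiplication mod 6, the identity morphism is 1.
M = range(6)

def compose(f, g):
    return (f * g) % 6

e = 1

# The category axioms of Definition C1.1:
assert all(compose(e, f) == f == compose(f, e) for f in M)
assert all(compose(h, compose(g, f)) == compose(compose(h, g), f)
           for f in M for g in M for h in M)

# A morphism f is an isomorphism iff some g satisfies g∘f = e = f∘g (C1.6);
# in this one-object category these are exactly the units of Z/6.
isos = [f for f in M if any(compose(g, f) == e == compose(f, g) for g in M)]
print(isos)  # [1, 5]
```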

Groupoids

Definition C1.8 A groupoid is a category such that every morphism is an isomorphism.

If we think of groups as encoding the symmetries of one object, then groupoids can capture the symmetries of a configuration with many objects.

Example C1.9 Let $R$ be a preorder on a set $X$, then we can form a category from these data as described in example C1.5. One can ask when one obtains a groupoid in this way. It turns out that the corresponding category is a groupoid if and only if $R$ is symmetric, i.e. an equivalence relation.

Remark From this observation we can extract an intuition for general groupoids: a groupoid is like a set with an equivalence relation, except that two objects can be equivalent in more than one way and we’re keeping track of different ways of being equivalent. From this viewpoint a group, being a one-object groupoid, is like the various ways a particular object is equivalent to itself, following the narrative that groups encode the symmetries of an object.

Example C1.10 Let $G$ be a group acting on a set $X$, then one can form a groupoid encoding the group action as follows: Let $X // G$ be the category having the elements of $X$ as its objects and, for every $x \in X$ and $g \in G$, a morphism $x \to gx$, so that $\mathrm{Hom}_{X // G}(x,y)$ is in bijection with $\{g \in G:gx=y\}$; the composite of the morphism $x \to gx$ given by $g$ with the morphism $gx \to hgx$ given by $h$ is the morphism $x \to hgx$ given by $hg$. This groupoid played an important role in a previous post.

Example C1.11 Let $\mathcal C$ be any category. Then there is a maximal subcategory that is a groupoid: simply take the subcategory of $\mathcal C$ with the same objects as $\mathcal C$ and all the isomorphisms of $\mathcal C$ as morphisms. This is called the core of the category $\mathcal C$ and it is a groupoid. If $\mathcal C$ has only one object, i.e. is a monoid, then the core of $\mathcal C$ is just the group of invertible elements, also known as units.
As an exercise, one can check that the categories $\mathbf{Set}$ and $\mathbf{Rel}$ (cf. C1.2 and C1.3) have the same core, i.e. any invertible relation is already a function and in fact a bijection.
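The exercise can be brute-forced for a small set. A Python sketch (my own, checking all $2^4$ relations on a 2-element set) confirming that the invertible relations are exactly the bijections:

```python
from itertools import product

X = [0, 1]
pairs = list(product(X, X))

def compose(S, R):
    """Composition of relations, as in Rel (Example C1.3)."""
    return frozenset((x, z) for (x, y1) in R for (y2, z) in S if y1 == y2)

identity = frozenset((x, x) for x in X)

# All 2^4 relations on X, and those that are invertible in Rel (cf. C1.6):
relations = [frozenset(p for p, bit in zip(pairs, bits) if bit)
             for bits in product([0, 1], repeat=len(pairs))]
invertible = [R for R in relations
              if any(compose(S, R) == identity == compose(R, S)
                     for S in relations)]

def is_bijection(R):
    # Each x relates to exactly one y, and each y is hit exactly once.
    return (all(sum(1 for y in X if (x, y) in R) == 1 for x in X)
            and all(sum(1 for x in X if (x, y) in R) == 1 for y in X))

assert all(is_bijection(R) for R in invertible)
print(len(invertible))  # 2 = 2!, exactly the bijections of a 2-element set
```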

Example C1.12 Let $X$ be a topological space. Then the fundamental groupoid $\Pi_1(X)$ has as its objects all elements of $X$. For $x,y \in X$, $\mathrm{Hom}_{\Pi_1(X)}(x,y)$ consists of all homotopy classes (relative to the endpoints) of paths from $x$ to $y$. Composition is given by concatenation of paths. $\Pi_1(X)$ contains important homotopy-theoretic information about $X$, such as the path-components and the fundamental group of each path-component. For more information on this very geometric construction and groupoids in algebraic topology, I heartily recommend Ronald Brown’s “Topology and Groupoids”.

This concludes the first part of the mini-series on introductory categories. Next time we’ll look into functors.

An Introduction to Character Theory

This is a continuation of the ongoing series on representation theory, see Overview of blog posts for previous posts on that subject.

Let $G$ be a group and let $k$ be a field. We begin with some remarks on dual spaces and tensor products of representations.

Lemma/Definition 5.1 If $V$ is a $k[G]$-module, then the dual space $V^*=\mathrm{Hom}_k(V,k)$ is a $k[G]$-module via $g \cdot f(v) := f(g^{-1}v)$. This is called the dual representation.

Proof This can be checked via an explicit (but boring) computation, so let’s give a more conceptual proof (feel free to ignore it if you’re not into categories): The representation $V$ can be considered as a functor $\rho_V:G \to k\textrm{-}\mathrm{Vect}$, where we regard $G$ as a one-object category. The inversion map $G \to G, g \mapsto g^{-1}$ is an anti-isomorphism, i.e. an isomorphism $G^{op} \to G$, since $(gh)^{-1}=h^{-1}g^{-1}$; thus inversion gives a functor $\iota:G^{op} \to G$. Now if we compose the functors $\mathrm{Hom}_k(-,k) \circ \rho_V \circ \iota$, we get a covariant functor $G \to k\textrm{-}\mathrm{Vect}$, since we composed two contravariant functors and one covariant one. This is exactly the dual representation defined in the lemma.

Lemma/Definition 5.2 If $V$ and $W$ are two $k[G]$-modules, then $V \otimes_k W$ is a $k[G]$-module via the “diagonal action” given on elementary tensors by $g\cdot (v \otimes w):=gv \otimes gw$.

Proof As before, this is a straightforward computation. For a more conceptual proof, one can use that the diagonal map $G \to G \times G, g \mapsto (g,g)$ induces a ring homomorphism $k[G] \to k[G\times G]\cong k[G] \otimes_k k[G]$ given on basis elements by $g \mapsto g \otimes g$ (this is called “comultiplication” in the setting of Hopf algebras) and that $V \otimes_k W$ is canonically a $k[G] \otimes_k k[G]$-module (where the first copy of $k[G]$ acts on $V$ and the other one on $W$), so one can restrict scalars along the comultiplication $k[G] \to k[G] \otimes_k k[G]$.

Lemma/Definition 5.3 If $V$ and $W$ are two $k[G]$-modules, then $\mathrm{Hom}_k(V,W)$ is a $k[G]$-module where $g \cdot f$ is defined via $(g \cdot f)(v) = gf(g^{-1}v)$.

Proof This is a straightforward computation. One can also give a conceptual proof similar to the ones before, using the maps from the two previous proofs.

Lemma 5.4 Let $V$ and $W$ be $k[G]$-modules, then we have $\mathrm{Hom}_k(V,W)^G=\mathrm{Hom}_{k[G]}(V,W)$.

Proof Clear from the definition of the $k[G]$-module structure on $\mathrm{Hom}_k(V,W)$: a linear map $f$ satisfies $g \cdot f = f$ for all $g \in G$ iff $gf(g^{-1}v)=f(v)$ for all $g \in G, v \in V$, which is equivalent to $f(gv)=gf(v)$ for all $g,v$, i.e. to $f$ being $k[G]$-linear.

Lemma 5.5 Let $V$ and $W$ be two finite-dimensional $k[G]$-modules, then the map $V^* \otimes_k W \to \mathrm{Hom}_k(V,W)$ given on elementary tensors by $\xi\otimes w \mapsto (v \mapsto \xi(v)w)$ is an isomorphism of $k[G]$-modules.

Proof That this is an isomorphism of $k$-vector spaces is known from linear algebra. One checks that it is $G$-equivariant.

For the rest of this post, let $G$ be a finite group and let $k$ be a field of characteristic $0$. (All representations shall have coefficients in $k$)

By Maschke’s theorem, every representation of $G$ with coefficients in $k$ can be decomposed as a direct sum of irreducible representations. If $V$ is an irreducible representation and $W$ is a finite-dimensional representation, then we get by Schur’s lemma that the number of times $V$ appears in the decomposition of $W$ is equal to $\mathrm{dim}_k\mathrm{Hom}_{k[G]}(V,W)/ \mathrm{dim}_k \mathrm{End}_{k[G]}(V)$ (cf. 3.19).
Let us revisit the averaging method from the proof of Maschke’s theorem to find another expression for this dimension.

Lemma 5.6 Let $V$ be a $k[G]$-module, then the map $\mathrm{avg}: V \to V^G$, $v \mapsto \frac{1}{|G|} \sum_{g \in G} gv$ is a linear projection.

Proof To see that $\mathrm{avg}$ indeed lands in $V^G$, note that for $h \in G$, we get $h \cdot \frac{1}{|G|}\sum_{g \in G}gv=\frac{1}{|G|}\sum_{g \in G} hgv= \frac{1}{|G|}\sum_{g \in G} gv$, since the map $G \to G, g \mapsto hg$ is a bijection. That $\mathrm{avg}$ is linear and restricts to the identity on $V^G$ is clear.
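A concrete instance of Lemma 5.6, sketched with numpy under my own choice of example: the cyclic group $C_3$ acting on $k^3$ by cyclically permuting coordinates. The averaging operator is idempotent, fixes the constant vectors, and its trace equals $\dim V^G$.

```python
import numpy as np

# C3 acting on k^3 by cyclically permuting coordinates: the group is {I, P, P²}.
P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])
group = [np.eye(3), P, P @ P]

avg = sum(group) / len(group)

# avg is a projection onto the fixed vectors (Lemma 5.6):
assert np.allclose(avg @ avg, avg)
fixed = np.array([1.0, 1.0, 1.0])  # the C3-fixed vectors are the constant ones
assert np.allclose(avg @ fixed, fixed)

# Its trace is the dimension of V^G — the observation that drives Lemma 5.7.
print(np.trace(avg))  # 1.0
```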

Lemma 5.7 Let $(V,\rho_V)$ and $(W,\rho_W)$ be finite-dimensional, then $\mathrm{dim}_k \mathrm{Hom}_{k[G]}(V,W)=\frac{1}{|G|}\sum_{g \in G}\mathrm{Tr}(\rho_V(g^{-1})) \mathrm{Tr}(\rho_W(g))$

Proof Since the map $\mathrm{avg}:\mathrm{Hom}_k(V,W) \to \mathrm{Hom}_k(V,W)^G = \mathrm{Hom}_{k[G]}(V,W)$, $f \mapsto (v \mapsto \frac{1}{|G|}\sum_{g \in G}\rho_W(g)f(\rho_V(g^{-1})v))$ is a projection onto the subspace $\mathrm{Hom}_k(V,W)^G$ (by 5.6), we get that $\mathrm{dim}_k \mathrm{Hom}_{k[G]}(V,W) = \mathrm{Tr}(\mathrm{avg})$, as the trace of a projection equals the dimension of its image (here we use $\mathrm{char}(k)=0$). Using the isomorphism $V^* \otimes_k W \cong \mathrm{Hom}_k(V,W)$ (by 5.5), we get that $\mathrm{Tr}(\mathrm{avg})=\frac{1}{|G|}\sum_{g\in G}\mathrm{Tr}(\rho_V^*(g) \otimes \rho_W(g))$. Using properties of traces and the definition of the dual representation, this equals $\frac{1}{|G|}\sum_{g\in G} \mathrm{Tr}(\rho_V(g^{-1}))\mathrm{Tr}(\rho_W(g))$.

Let us ponder for a moment what we have shown so far. We know that every finite-dimensional representation $(W,\rho_W)$ of $G$ may be decomposed as a direct sum of irreducible submodules. For each irreducible representation $(V,\rho_V)$ of $G$, we can compute the multiplicity of $(V,\rho_V)$ in the decomposition of $W$ if we know $\mathrm{dim}_k \mathrm{Hom}_{k[G]}(V,W)$ and $\mathrm{dim}_k \mathrm{End}_{k[G]}(V)$ (cf. the discussion preceding 5.6). But now 5.7 gives us an expression for these dimensions that only involves the traces $\mathrm{Tr}(\rho_W(g))$ and $\mathrm{Tr}(\rho_V(g))$. It thus makes sense to give a special name to the function $G \to k, g \mapsto \mathrm{Tr}(\rho_W(g))$.

Definition 5.8 Let $(V,\rho_V)$ be a finite-dimensional representation of $G$, then the function $G \to k, g \mapsto \mathrm{Tr}(\rho_V(g))$ is called the character of $(V,\rho_V)$ and is denoted by $\chi_V$. (Note that the trace of a matrix is invariant under conjugation, so the character doesn’t depend on a choice of basis for $V$.)
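As a numerical sanity check of Lemma 5.7, one can compute these trace sums for a small example. The sketch below (my own choice: the permutation representation of $S_3$ on $k^3$, whose character counts fixed points) recovers the multiplicity of the trivial representation and the total dimension of the endomorphism algebra:

```python
from itertools import permutations

# The symmetric group S3 as permutation tuples acting on {0, 1, 2}.
G = list(permutations(range(3)))

def inverse(p):
    q = [0] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return tuple(q)

# Character of the permutation representation: the trace of the permutation
# matrix of g, i.e. the number of fixed points of g.
def chi_perm(p):
    return float(sum(1 for i, j in enumerate(p) if i == j))

def chi_triv(p):
    return 1.0  # character of the trivial representation

# Lemma 5.7: dim Hom_{k[G]}(V, W) = (1/|G|) Σ_g Tr ρ_V(g⁻¹) Tr ρ_W(g)
def dim_hom(chi_V, chi_W):
    return sum(chi_V(inverse(g)) * chi_W(g) for g in G) / len(G)

print(dim_hom(chi_triv, chi_perm))  # 1.0: the trivial rep appears once
print(dim_hom(chi_perm, chi_perm))  # 2.0: permutation rep ≅ trivial ⊕ standard
```

The value $2$ reflects the decomposition of the permutation representation into the trivial and the (irreducible) standard representation.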

Using the above discussion, we obtain the following surprising corollary:

Corollary 5.9 A finite-dimensional representation of $G$ is determined up to isomorphism by its character: if two finite-dimensional representations of $G$ have the same character, then they are isomorphic.

This concludes a short intro and motivation for character theory; we will continue the study in future posts.

Semisimplicity and Representations, Part 2

This entry is a direct continuation of this post and is part of a larger series on representation theory that starts with that post. In this blog post, we continue our study of semisimple rings and bring it to a conclusion by proving a classification. Then we apply this insight into the structure of semisimple algebras to the group algebra to get more information on irreducible representations.

Endomorphism Rings of Semisimple Modules

The goal of the following lemmas and corollaries is to compute the endomorphism ring of a semisimple module that is a finite direct sum of simple modules, the motivation being that this applies to finite-dimensional representations in the semisimple case (cf. 3.20) and to a semisimple ring as a module over itself (cf. 3.24).
The idea behind the proof is simply to “multiply out” the hom set that gives us endomorphisms by pulling out (finite) direct sums from both arguments and then use Schur’s lemma to understand the hom sets between simple modules.

Definition 4.1 For a ring $R$ and a natural number $n$, $M_n(R)$ denotes the matrix ring with entries in $R$ with the usual formulas for addition and multiplication.

The following lemma is a generalization of a well-known fact from linear algebra:

Lemma 4.2 Let $S$ be a right module over a ring $R$, then $\mathrm{End}_R(S^n) \cong M_n(\mathrm{End}_R(S))$ as rings.

Proof Let’s index the direct summands in $S^n = \bigoplus_{i=1}^n S_i$ (so $S_i=S$ for all $i$, but we’re keeping track of which component is which).
As abelian groups, the isomorphism is clear, since we have $\mathrm{End}_R(S^n) = \mathrm{Hom}_R( \bigoplus_{i=1}^n S_i, \bigoplus_{j=1}^n S_j) = \bigoplus_{i=1}^n\bigoplus_{j=1}^n \mathrm{Hom}_R(S_i,S_j)$.
Now each summand $\mathrm{Hom}_R(S_i,S_j)$ equals $\mathrm{End}_R(S)$, at least as an abelian group, which we can use to identify elements of $\bigoplus_{i=1}^n\bigoplus_{j=1}^n \mathrm{Hom}_R(S_i,S_j)$ with elements of $M_n(\mathrm{End}_R(S))$ by sending the component living in $\mathrm{Hom}_R(S_i,S_j)$ to the entry at the position $(i,j)$. The verification that this respects multiplication amounts to the same calculation that shows that composition of linear maps corresponds to matrix multiplication.

This gives a description of the endomorphism rings of finite direct sums of simple modules, where each simple submodule is in the same isomorphism class. To properly “multiply out” our endomorphism rings, we just need another application of Schur’s lemma:

Lemma 4.3 Let $M$ and $N$ be modules over a ring $R$ and suppose that $\mathrm{Hom}_R(M,N) = 0 = \mathrm{Hom}_R(N,M)$, then we have an isomorphism of rings $\mathrm{End}_R(M \oplus N) \cong \mathrm{End}_R(M) \times \mathrm{End}_R(N)$.
If $R$ is a $K$-algebra for some field $K$, then this is an isomorphism of $K$-algebras.

Proof We always have an injection $\mathrm{End}_R(M) \times \mathrm{End}_R(N) \to \mathrm{End}_R(M \oplus N)$ given by $(\varphi,\psi) \mapsto \varphi \oplus \psi$. Here $\varphi \oplus \psi$ means that we apply $\varphi$ on the first, and $\psi$ on the second summand. Since addition and composition of such endomorphisms can be carried out component-wise, this is a ring homomorphism, and if everything is a $K$-vector space, it is also $K$-linear. The image of that map is the set of endomorphisms of $M \oplus N$ that map $M$ and $N$ to themselves. Given our assumptions, if we have any endomorphism of $M \oplus N$, restricting it to $M$ and composing with the projection $M \oplus N \to N$ gives a homomorphism $M \to N$, which is zero by assumption. This shows that the endomorphism maps $M$ into itself, and by the same argument $N$; thus the map is surjective.

Corollary 4.4 Let $M$ be a right module over a ring $R$ that is a finite sum of simple submodules $M=\bigoplus_{i=1}^k S_i^{n_i}$ such that the $S_i$ are pairwise non-isomorphic. Let $D_i=\mathrm{End}_R(S_i)$ be the endomorphism rings, then $\mathrm{End}_R(M) \cong \prod_{i=1}^k M_{n_i}(D_i)$

Proof By Schur’s lemma, there are no nonzero homomorphisms between semisimple modules that share no isomorphism class of simple submodule, so we can apply 4.3 $k$ times to get that $\mathrm{End}_R(M) \cong \prod_{i=1}^k \mathrm{End}_R(S_i^{n_i})$. Now apply 4.2 to each factor.

Using this, we get a partial converse to Schur’s lemma:

Corollary 4.5 Let $M$ be a module that is a finite sum of simple submodules, then $M$ is simple if and only if the endomorphism ring of $M$ is a division ring.

Proof One direction is just Schur’s lemma. The other direction follows from 4.4 by noting that the only way that $\prod_{i=1}^k M_{n_i}(D_i)$ is a division ring is if $k=1=n_1$.

Enter the Matrix Ring

Corollary 4.4 already gives us a description of the endomorphism rings of modules which are finite sums of simple submodules. In this description, matrix rings over division rings appear. This means that, to deepen our understanding of those endomorphism rings, we should study matrix rings.

We begin by studying their ideals:

Lemma 4.6 Let $R$ be a ring and $n \in \Bbb N$, then the map from two-sided ideals of $R$ to two-sided ideals of $M_n(R)$, sending $I$ to $M_n(I)$ is a bijection. (Here $M_n(I)$ is the subset of all elements in $M_n(R)$ where each entry is contained in $I$)

Proof It’s clear that $M_n(I)$ is a two-sided ideal in $M_n(R)$ when $I$ is a two-sided ideal in $R$. It’s also clear that the map $I \mapsto M_n(I)$ is injective (we can recover $I$ from $M_n(I)$ just by looking at the set of elements of $R$ that appear as entries of the matrices in $M_n(I)$).
For surjectivity, let $E_{ij}$ be the matrix that is zero everywhere except at $(i,j)$, where it is $1$. Then if $J \subset M_n(R)$ is a two-sided ideal and $A \in J$, we compute that $E_{11}AE_{11}$ is the matrix that agrees with $A$ at the place $(1,1)$ and is zero everywhere else. A calculation shows that the set $I$ of elements of $R$ that appear as the entry at the place $(1,1)$ of some matrix in $J$ is a two-sided ideal in $R$. Then $J \supset M_n(I)$, because we can permute the entries of a matrix that has only one non-zero entry by multiplying from the left and right by permutation matrices (and then by taking sums, we get that every element of $M_n(I)$ is contained in $J$).
On the other hand, $J \subset M_n(I)$: by first multiplying a matrix from the left and right by permutation matrices and then multiplying by $E_{11}$ from the left and right, we see that we could have also defined $I$ as the set of elements which appear as any matrix entry of an element of $J$. Thus $J \subset M_n(I)$.
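The two matrix manipulations used in the proof can be seen concretely. A small numpy sketch for $n=2$ (my own illustration, over the integers):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
E11 = np.array([[1, 0],
                [0, 0]])

# E11·A·E11 keeps exactly the (1,1) entry of A, as in the proof of Lemma 4.6:
print(E11 @ A @ E11)  # [[1 0], [0 0]]

# Multiplying by permutation matrices on both sides moves that entry around:
P = np.array([[0, 1],
              [1, 0]])  # the transposition swapping the two coordinates
print(P @ E11 @ A @ E11 @ P)  # [[0 0], [0 1]]: the entry 1 now sits at (2,2)
```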

This lemma implies in particular that if $D$ is a division ring, then $M_n(D)$ has no proper non-zero two-sided ideals. We can use this together with another property that holds in general for finite-dimensional modules:

Lemma 4.7 Let $M$ be a non-zero module over a ring $R$ and suppose that $R$ contains a division ring $D$ such that $M$ is a finite-dimensional $D$-vector space. (We don’t require that $R$ is a $D$-algebra, i.e. that $D$ is commutative and contained in the center of $R$.) Then $M$ contains a simple submodule.

Proof Start with any non-zero submodule of $M$, e.g. $M$ itself. If it is not simple, choose a proper non-zero submodule; if that submodule is not simple, choose a proper non-zero submodule of it, and so on.
Since the dimension over $D$ has to decrease at every step, this process has to terminate, which gives us a simple submodule.

Lemma 4.8 Let $R$ be a ring that has no proper nonzero two-sided ideals and that has a minimal right ideal $I$. Then $R$ is semisimple and every simple module over $R$ is isomorphic to $I$.

Proof Let $R$ be such a ring and let $I$ be a minimal right ideal, then we can make a two-sided ideal out of $I$ by taking the product $RI=\sum_{r \in R} rI$.
Since $R$ doesn’t contain proper nonzero two-sided ideals and $RI \neq 0$, we get that $R=RI=\sum_{r \in R} rI$. For each fixed $r$, $rI$ is the image of the simple module $I$ under the right $R$-linear map $x \mapsto rx$, so it is either isomorphic to $I$ (and hence simple) or zero by Schur’s lemma.
The implication (2.) $\Rightarrow$ (3.) in 3.14 (compare the proof or the remark after the proof) gives us that there is a subset $T \subset R$ such that $R=\bigoplus_{r \in T} rI$. Thus $R$ is semisimple. 3.25 implies that every simple module is isomorphic to a direct summand in the sum $\bigoplus_{r \in T} rI$, but every summand of that sum is isomorphic to $I$, which implies that every simple module is isomorphic to $I$.

A natural question is how to recover $R$ from $M_n(R)$. The following lemma provides this, if we also have the module $R^n$:

Lemma 4.9 Let $R$ be a ring and consider $R^n$ as the space of row vectors, which is a right $M_n(R)$-module. Then we get that $\mathrm{End}_{M_n(R)}(R^n)=R$. If we think of $\mathrm{End}_{M_n(R)}(R^n)$ as a subring of $\mathrm{End}_{R}(R^n)=M_n(R)$, then $\mathrm{End}_{M_n(R)}(R^n)$ is given by all scalar multiples of the identity.

Proof Let $E_{i,j}$ be the matrix that has zeroes everywhere except a $1$ at $(i,j)$, and let $e_i$ be the vector in $R^n$ that has zeroes everywhere except a $1$ in the $i$-th component. For any $\sigma \in S_n$, let $P_\sigma$ be the permutation matrix associated to $\sigma$, such that applying that matrix corresponds to permuting the entries with $\sigma$.
Then for any $\varphi \in \mathrm{End}_{M_n(R)}(R^n)$, we get that $\varphi(e_1)=\varphi(e_1 E_{1,1})=\varphi(e_1)E_{1,1}$. Now multiplying with $E_{1,1}$ corresponds to making all entries zero except the first entry, which is left untouched. Thus $\varphi(e_1)$ is a multiple of $e_1$, so we can find a unique $\lambda \in R$ such that $\varphi(e_1)=\lambda e_1$.
For every $i$ let $\tau_{1,i} \in S_n$ be the transposition that switches $1$ and $i$, then we have $\varphi(e_i)=\varphi(e_1 P_{\tau_{1,i}})=\varphi(e_1) P_{\tau_{1,i}} = \lambda e_1 P_{\tau_{1,i}} = \lambda e_i$.
Since the $e_i$ form a basis, we have seen that $\varphi$ is just scalar multiplication (from the left) with $\lambda$, or as a matrix $\lambda \cdot \mathrm{Id}$, where $\mathrm{Id}$ denotes the identity matrix.
This proves $\mathrm{End}_{M_n(R)}(R^n)=R$.
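Over a field, the lemma says in particular that the matrices commuting with all of $M_n(R)$ are exactly the scalar multiples of the identity. A small numpy check for $n=3$ over the reals (my own illustration; it suffices to test commutation against the elementary matrices, which span $M_n(R)$):

```python
import numpy as np

n = 3

# Elementary matrices E_ij span M_n(R) additively, so commuting with all of
# them is the same as commuting with every matrix.
def E(i, j):
    M = np.zeros((n, n))
    M[i, j] = 1.0
    return M

def commutes_with_all(B):
    return all(np.allclose(B @ E(i, j), E(i, j) @ B)
               for i in range(n) for j in range(n))

# Scalar multiples of the identity commute with everything ...
assert commutes_with_all(2.5 * np.eye(n))
# ... and a typical non-scalar matrix does not:
assert not commutes_with_all(np.arange(float(n * n)).reshape(n, n))
print("the center of M_3 consists of the scalar matrices, as in Lemma 4.9")
```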

Putting the last few lemmas together, we obtain the main result on modules over matrix rings over a division algebra:

Lemma 4.10 Let $D$ be a division algebra and $n \in \mathbb{N}$, then $M_n(D)$ is semisimple, the unique simple right module over $M_n(D)$ is $D^n$, and $\mathrm{End}_{M_n(D)}(D^n) = D$. (Here we’re regarding $D^n$ as row vectors so that the right $M_n(D)$-action makes sense.)

Proof 4.6 implies that $M_n(D)$ has no nonzero proper two-sided ideals and 4.7 implies that it has a simple submodule, so that 4.8 applies and $M_n(D)$ is semisimple with only one isomorphism class of simple modules.
The statement that $D^n$ is a simple $M_n(D)$-module is just linear algebra:
Given any non-zero vector $v$ in $D^n$ and any $w \in D^n$, there’s a linear transformation (i.e. a matrix) that sends $v$ to $w$. Thus $D^n$ is a simple module over $M_n(D)$, as every non-zero submodule is the whole module. The part about the endomorphism ring follows from 4.9.

Now we come to the main theorem about semisimple rings that classifies them and also relates their ring-theoretic properties to their modules.

Theorem 4.11 (Artin-Wedderburn) A ring $R$ is semisimple iff there exist natural numbers $k, n_1, \dots, n_k$ and division rings $D_1, \dots, D_k$ such that $\displaystyle R \cong \prod_{i=1}^k M_{n_i}(D_i)$.
Here the $k$ is the number of simple modules $M_1, \dots M_k$ up to isomorphism, $D_i$ is the endomorphism ring of $M_i$ and $\mathrm{dim}_{D_i}(M_i)=n_i$. In particular, $k$ is uniquely determined and the $n_i$ and $D_i$ are unique up to isomorphism and permutation of the factors.

Proof Note that for every ring $R$, considered as a right module over itself, we have an isomorphism of rings $R \cong \mathrm{End}_R(R)$ by applying 4.9 with $n=1$. Let $R=\bigoplus_{i=1}^k M_i^{n_i}$ be a decomposition of $R$ into simple right submodules such that the $M_i$ are pairwise non-isomorphic. Let $D_i=\mathrm{End}_R(M_i)$.
By 4.4, we have an isomorphism $\mathrm{End}_R(R) \cong \prod_{i=1}^k M_{n_i}(D_i)$. Here $k$ is uniquely determined as number of simple modules over $R$ up to isomorphism. The $D_i$ together with the $n_i$ are uniquely determined as the endomorphism rings of the simple modules and the dimension of the simple modules over their endomorphism ring (cf. lemma 4.10 for modules over matrix rings over a division algebra and 2.5 for modules over a product).
For the reverse direction, note that matrix rings over division rings are semisimple by 4.10 and 3.15 implies that finite products of semisimple rings are semisimple.
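
To see the theorem in action, here is a small sanity check in pure Python (my own illustration) for the group algebra $\mathbb{C}[S_3]$: over $\mathbb{C}$ all the $D_i$ equal $\mathbb{C}$ (cf. 3.29), the decomposition is the standard one $\mathbb{C} \times \mathbb{C} \times M_2(\mathbb{C})$, and the number of factors matches the number of conjugacy classes (cf. 4.24 below), which we count by brute force.

```python
import itertools

def compose(p, q):
    # composition of permutations as tuples: (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    q = [0] * len(p)
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

S3 = list(itertools.permutations(range(3)))
assert len(S3) == 6

def conj_class(g):
    return frozenset(compose(h, compose(g, inverse(h))) for h in S3)

classes = {conj_class(g) for g in S3}
k = len(classes)                       # number of simple C[S_3]-modules
assert k == 3                          # {e}, transpositions, 3-cycles
assert 1**2 + 1**2 + 2**2 == len(S3)   # dim C[S_3] = sum of n_i^2
```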

After having finally proved the main theorem for semisimple rings, we can give lots of applications:

Corollary 4.12 Semisimplicity for rings is left-right symmetric, i.e. if $R$ is semisimple, then so is its opposite ring $R^{op}$.

Proof Using 4.11, it’s enough to show that $M_n(D)^{op}\cong M_n(D^{op})$ since the opposite ring of a division ring will be a division ring as well.
An isomorphism $M_n(D)^{op} \to M_n(D^{op})$ is given by transposition $A \mapsto A^{T}$.

Corollary 4.13 Let $R$ be a semisimple ring, then the following are equivalent:

1. $R$ is commutative
2. For all simple $R$-modules $M$, the endomorphism ring $D=\mathrm{End}_R(M)$ is a field and $\mathrm{dim}_D(M)=1$.

Proof We apply 4.11: $\prod_{i=1}^k M_{n_i}(D_i)$ is commutative iff all $n_i$ are $1$ and all $D_i$ are commutative, i.e. fields.

Corollary 4.14 Let $G$ be a finite abelian group, then every irreducible real representation of $G$ is at most 2-dimensional.

Proof Let $G$ be a finite abelian group. Since $\mathbb{C}$ is the algebraic closure of $\mathbb{R}$ and has dimension 2 over $\mathbb{R}$, we get that all finite field extensions of $\mathbb{R}$ have dimension at most 2. By 4.13 all simple $\Bbb{R}[G]$-modules are one-dimensional over their endomorphism rings, which are finite field extensions of $\mathbb{R}$ by 3.27 and 4.13, so the endomorphism rings are at most two-dimensional, thus all simple $\Bbb{R}[G]$-modules have dimension at most 2 over $\mathbb{R}$.

Corollary 4.15 Let $G$ be a finite group and $K$ be an algebraically closed field such that the characteristic of $K$ doesn’t divide the order of $G$, then the following are equivalent:

1. $G$ is abelian
2. All irreducible representations of $G$ are one-dimensional
3. The number of irreducible representations is equal to the order of $G$

Proof The equivalence of (1) and (2) follows from 4.13, since 3.27 and 3.29 together imply that the endomorphism ring of any simple $K[G]$-module is $K$.
The equivalence of (2) and (3) follows from 3.30.

Reminder For a group $G$, the abelianization $G^{ab}$ is the largest commutative quotient of $G$. Explicitly, it is given by the quotient by the subgroup generated by all commutators. It has the universal property that every morphism from $G$ to an abelian group factors uniquely through the map $G \to G^{ab}$.

Corollary 4.16 Let $G$ be a finite group and $K$ be an algebraically closed field such that the characteristic of $K$ doesn’t divide the order of $G$, then the number of one-dimensional representations of $G$ up to isomorphism is equal to the order of the abelianization $|G^{ab}|$

Proof One-dimensional representations are homomorphisms $G \to \mathrm{GL}_1(K) = K^\times$.
Since $K^\times$ is commutative, these correspond to homomorphisms $G^{ab} \to \mathrm{GL}_1(K)$, i.e. one-dimensional representations of $G^{ab}$. One-dimensional representations are automatically irreducible and 4.15 tells us that since $G^{ab}$ is abelian, there are exactly $|G^{ab}|$ up to isomorphism.
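
For a concrete example (a brute-force sketch of my own, not from the text): for $G=S_3$ the commutator subgroup is $A_3$, so $|G^{ab}|=2$, matching the two one-dimensional complex representations (trivial and sign).

```python
import itertools

def compose(p, q):
    # composition of permutations as tuples: (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    q = [0] * len(p)
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

S3 = list(itertools.permutations(range(3)))

# all commutators g h g^-1 h^-1
commutators = {compose(g, compose(h, compose(inverse(g), inverse(h))))
               for g in S3 for h in S3}

# close up under products to get the commutator subgroup
subgroup = set(commutators)
changed = True
while changed:
    changed = False
    for x, y in itertools.product(list(subgroup), repeat=2):
        z = compose(x, y)
        if z not in subgroup:
            subgroup.add(z)
            changed = True

assert len(subgroup) == 3              # the commutator subgroup is A_3
assert len(S3) // len(subgroup) == 2   # |S_3^{ab}| = 2
```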

The Center of the Group Algebra

The last corollary gives a partial answer to the question of the number of irreducible representations of a group, by characterizing the number of one-dimensional representations in terms of the group theory of $G$. In this section, we give a group-theoretic characterization of the number of all irreducible representations (given suitable assumptions on the field).

Definition 4.17 If $R$ is a ring, then the center $Z(R)$ is defined as $Z(R)=\{z \in R \mid \forall r \in R: zr=rz \}$. It is a commutative subring of $R$.

Lemma 4.18 If $R$ is a ring and $n \in \mathbb{N}$, then $Z(M_n(R))$ consists of those diagonal matrices whose diagonal entries are all equal to the same value, which lies in $Z(R)$. So we have an isomorphism of rings $Z(R) \cong Z(M_n(R))$.

Proof We regard $R^n$ as row vectors, which is naturally a right $M_n(R)$-module. Then for $z \in Z(M_n(R))$, the map $R^n \to R^n, v \mapsto vz$ is $M_n(R)$-linear, because for $M \in M_n(R)$, we have $vMz=vzM$.
By 4.9 $z$ is a diagonal matrix where all diagonal entries are the same. Clearly the diagonal entry must lie in $Z(R)$, as $z$ commutes with other such diagonal matrices.

Lemma 4.19 If $R$ and $S$ are rings, then $Z(R \times S)=Z(R) \times Z(S)$.

Proof This is an easy computation. Multiplication in $R \times S$ is defined component-wise.

Corollary 4.20 If $R$ is a semisimple ring with the Artin-Wedderburn decomposition $R \cong \prod_{i=1}^k M_{n_i}(D_i)$, then $Z(R) \cong \prod_{i=1}^k Z(D_i)$.

Corollary 4.21 If $A$ is a finite-dimensional semisimple algebra over a field $K$, then $\mathrm{dim}_K(Z(A)) \geq k$ where $k$ is the number of simple $A$-modules up to isomorphism. We have equality if $K$ is algebraically closed.

Proof By 4.11, the number of simple $A$-modules up to isomorphism is $k$ if $A \cong \prod_{i=1}^k M_{n_i}(D_i)$. By 4.20, we have $Z(A) \cong \prod_{i=1}^k Z(D_i)$. Each factor has dimension at least $1$, which shows the inequality.
If $K$ is algebraically closed, then $D_i=K$ for all $i$ by 3.29 which shows that equality holds.

Corollary 4.22 If $A$ is a finite-dimensional semisimple real algebra and $k$ is the number of simple $A$-modules up to isomorphism, then $2k \geq \mathrm{dim}_{\mathbb{R}}(Z(A)) \geq k$.

Proof Argue as in the proof of 4.21 and use the fact that the center $Z(D)$ of a finite-dimensional division algebra $D$ over $\mathbb{R}$ has to be isomorphic to $\mathbb{R}$ or $\mathbb{C}$, since it is a finite field extension of $\mathbb{R}$, so the dimension is at most two.

These corollaries relate the number of simple modules of a semisimple algebra and their center. Thus the next thing to do is to study the center of group algebras.

Reminder If $G$ is a group, then for $g \in G$, the conjugacy class of $g$ consists of all elements in $G$ that are conjugate to $g$. In other words, if we consider the action $G \times G \to G, (h,g) \mapsto hgh^{-1}$, then this is the orbit of $g$ under this action. The different conjugacy classes are the minimal subsets of $G$ that are closed under conjugation and form a partition of $G$.

Lemma 4.23 Let $K$ be a field and let $G$ be a group (we don’t need it to be finite). Then $Z(K[G])$ has a $K$-basis given by the set of elements of the form $\sum_{g \in C} g$ where $C$ varies over all conjugacy classes of $G$ with finitely many elements.

Proof Since $K[G]$ is a $K$-algebra with a $K$-basis given by $G$, we can check if an element is in the center just by checking if it commutes with every element of $G$. Let $x=\sum_{g \in G} \lambda_g g$ be an element of $K[G]$ (so all but finitely many $\lambda_g$ are zero). Then $x$ is in the center of $K[G]$ if and only if for all $h \in G$, we have $xh=hx \Leftrightarrow x=h^{-1}xh$.
Writing out $x$, this equation becomes $\sum_{g \in G} \lambda_g g = h^{-1}(\sum_{g \in G} \lambda_g g)h$ $=\sum_{g \in G} \lambda_g h^{-1}gh=\sum_{g \in G} \lambda_{hgh^{-1}} g$. Here we used that $g \mapsto hgh^{-1}$ is a bijection.
Comparing coefficients, this is equivalent to $\lambda_g = \lambda_{hgh^{-1}}$ for all $g$.
If this holds for all $g$, then the (finite) set $X$ of elements $g$ such that $\lambda_g \neq 0$ is closed under conjugation, and any two conjugate elements of $X$ have equal coefficients. Take a system of representatives for $X/G$ where $G$ acts on $X$ by conjugation. Then by the above considerations, we get that $x=\sum_{g \in G} \lambda_g g=\sum_{g \in X} \lambda_g g= \sum_{g \in X/G} \lambda_g \sum_{h \in C(g)} h$, where $C(g)$ is the conjugacy class of $g$. This shows that the set of elements of the form $\sum_{g \in C} g$, where $C$ is a finite conjugacy class, is a generating system for $Z(K[G])$. It is linearly independent because distinct conjugacy classes are disjoint.
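
The lemma can be checked by brute force for a small group; the following sketch (my own illustration, with integer coefficients) verifies that the class sums of $S_3$ commute with every group element in the group algebra.

```python
import itertools
from collections import defaultdict

def compose(p, q):
    # composition of permutations as tuples: (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    q = [0] * len(p)
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

S3 = list(itertools.permutations(range(3)))

def mul(x, y):
    # product of group-algebra elements stored as {group element: coefficient}
    out = defaultdict(int)
    for g, cg in x.items():
        for h, ch in y.items():
            out[compose(g, h)] += cg * ch
    return {g: c for g, c in out.items() if c}

classes = {frozenset(compose(h, compose(g, inverse(h))) for h in S3)
           for g in S3}
class_sums = [{g: 1 for g in C} for C in classes]
assert len(class_sums) == 3

# every class sum commutes with every group element, hence is central
for z in class_sums:
    for g in S3:
        assert mul(z, {g: 1}) == mul({g: 1}, z)
```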

Theorem 4.24 Let $G$ be a finite group and let $K$ be a field such that the characteristic of $K$ doesn’t divide the order of $G$. Let $k$ be the number of irreducible representations up to isomorphism and let $c$ be the number of conjugacy classes of $G$. Then $k \leq c$, with equality if $K$ is algebraically closed. If $K=\mathbb{R}$, then we have $k \leq c \leq 2k$.

Proof By 4.23 $c$ is the dimension of the center of $K[G]$. Now apply 4.21 and 4.22.

Thus we have obtained a relation between the number of irreducible representations and the element structure of $G$ by heavily employing the structure theory for semisimple rings. This is a nice illustration of the power of ring-theoretic tools in representation theory.

Semisimplicity and Representations, Part 1

This post is the third one in a series on representation theory. The previous posts are this one and that one (in this order.) The nature of this post is mostly ring-theoretic, but we will give applications to representation theory throughout the development of the general theory.

Semisimple Modules

Under suitable assumptions on $G$ and $K$, Maschke’s theorem (1.22) tells us that any submodule of a $K[G]$-module is a direct summand, i.e. we can find a complement.
One can try to apply this repeatedly to decompose a $K[G]$-module into smaller submodules. If the dimension is finite, then at some point we have to end up with a direct sum of modules that don’t have a non-zero proper submodule. This is because if one direct summand had a non-zero proper submodule, we could just decompose it further by Maschke’s theorem. The assumption of finite dimension implies that this process has to terminate, as the dimension of the summands decreases every time we decompose something.
This motivates the following definition to give a name to the modules we obtained as summands in the end:

Definition 3.1 A non-zero module over a ring is called simple if it doesn’t have a proper non-zero submodule.

Example 3.2 If $K$ is a field, or more generally a division ring, then a vector space over $K$ is simple iff it is one-dimensional.

Example 3.3 If we consider modules over $\mathbb{Z}$, i.e. abelian groups, then simple modules are just simple abelian groups. It’s known that simple abelian groups are the groups that are cyclic of prime order: $\Bbb{Z}/p\Bbb{Z}$.

Example 3.4 If $K$ is a field, and we consider $K[X]$-modules, i.e. $K$-vector spaces equipped with a choice of endomorphism $A$ (cf. the first section of the last entry), then a module is simple iff it doesn’t have a non-zero proper $A$-invariant subspace. One can show that this is equivalent to being isomorphic to $K[X]/(f)$ for some irreducible $f \in K[X]$. In particular, if $K$ is algebraically closed, then simple $K[X]$-modules are precisely the one-dimensional ones. (Where the endomorphism necessarily acts by scalar multiplication.) This means that over an algebraically closed field, an endomorphism of a finite-dimensional vector space is diagonalizable if and only if the associated $K[X]$-module is a direct sum of simple modules. (And in general, the associated $K[X]$-module is a direct sum of simple modules if and only if the endomorphism is diagonalizable over an algebraic closure.) We will encounter the condition of being a direct sum of simple modules later in this post.
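
A minimal sympy illustration of the last point (the sample matrices are my own choices): a diagonal matrix gives a semisimple $\mathbb{C}[X]$-module, while a Jordan block does not.

```python
from sympy import Matrix

diagonalizable = Matrix([[2, 0], [0, 3]])  # C[X]/(X-2) + C[X]/(X-3): semisimple
jordan_block = Matrix([[2, 1], [0, 2]])    # C[X]/((X-2)^2): not semisimple

assert diagonalizable.is_diagonalizable()
assert not jordan_block.is_diagonalizable()
```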

Generalizing the last two examples, we have the following result:

Lemma 3.5  If $R$ is any ring and $\mathfrak{m}$ is a maximal left ideal, then $R/\mathfrak{m}$ is a simple module. Conversely, every simple module is of that form

Proof If $\mathfrak{m}$ is a maximal left ideal, then submodules of $R/\mathfrak{m}$ correspond to submodules of $R$ containing $\mathfrak{m}$, so $R/\mathfrak{m}$ is simple by definition of $\mathfrak{m}$ being maximal.
Conversely, if $M$ is a simple module and $m \in M$ is nonzero, then $Rm$ is a non-zero submodule of $M$, so $M=Rm$. This means the map $R \to M, r \mapsto rm$ is surjective, so we get an isomorphism $R/I \cong M$ for some proper left ideal $I$. If $I$ were not maximal, there would be a left ideal strictly between $I$ and $R$, which would correspond to a proper non-zero submodule of $M$, which is impossible.

Example/Definition 3.6 If $K$ is a field and $G$ is a group, then similar to example 3.4, simple $K[G]$-modules are representations with no non-zero proper $G$-invariant subspace. These are called irreducible representations.

Example 3.7 The representations of a cyclic group of order $n$ corresponding to irreducible factors of $X^n-1$ that we have constructed in 2.6 are irreducible. The reason is that they’re irreducible $K[X]$-modules, where the action of $X$ corresponds to the action of a generator of the group. (cf. 3.4 and the proof of 2.6)
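
As a quick sanity check with sympy (my own illustration; the choice $n=6$ is arbitrary), we can list the irreducible factors of $X^n-1$ over $\mathbb{Q}$, each of which corresponds to an irreducible rational representation of the cyclic group of order $6$:

```python
from sympy import factor_list, symbols

X = symbols('X')
# list of (irreducible factor, exponent) pairs over Q
factors = factor_list(X**6 - 1)[1]
assert len(factors) == 4               # X-1, X+1, X^2+X+1, X^2-X+1
assert all(e == 1 for _, e in factors) # X^6 - 1 is squarefree over Q
```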

We have seen in 3.5 that simple modules are generated by one element; let’s give this property a label (generalizing the notion of cyclic groups):

Definition 3.8 Modules that are generated by a single element are called cyclic modules.

By our considerations in the beginning of the section, we see that when Maschke’s theorem applies and we have a finite-dimensional representation, it decomposes as a direct sum of irreducible subrepresentations. The purpose of the following lemmas is to generalize this. (Because we work without any finiteness conditions, we will need some form of the axiom of choice. If one is only interested in modules that satisfy some finiteness condition (e.g. finite-dimensional modules for an algebra over a field), then the dependence on choice can be eliminated and the arguments are much easier.)

Definition 3.9 A module $M$ over a ring is called semisimple if every submodule $N \leq M$ has a complement, i.e. there exists a submodule $N' \leq M$ such that $M=N\oplus N'$

Lemma 3.10 Submodules and quotients of a semisimple module are semisimple.

Proof If $M$ is semisimple and $M/N$ is a quotient, then for a submodule $\overline{K} \leq M/N$, we can take the preimage under the projection $M \to M/N$ to get a submodule $K \leq M$ that projects onto $\overline{K}$. Then the image under the projection of a complement of $K$ will be a complement for $\overline{K}$. If $N \leq M$ is a submodule, then we can find a complement $N'$, but then $M/N' \cong N$, so that $N$ is a quotient of $M$ and the previous case applies.

We will need the following result for an important property of semisimple modules. Most readers will probably be familiar with this, at least in the commutative case:

Lemma 3.11 Let $R$ be a ring. Then every proper left ideal is contained in a maximal left ideal.

Proof Let $I$ be a proper left ideal and let $\mathcal{P}$ be the set of all proper left ideals containing $I$. If we have an ascending chain $(I_{i})_{i \in \Omega}$ in $\mathcal{P}$, then $\displaystyle \bigcup_{i \in \Omega} I_i$ is an upper bound. This is a proper ideal, because if it wasn’t, some $I_i$ would contain $1$, which is impossible. So Zorn’s lemma applies and we get a maximal element in $\mathcal{P}$.

Corollary 3.12 Every non-zero cyclic module contains a maximal submodule

Proof Any non-zero cyclic module is of the form $R/I$ where $I$ is a proper left ideal. Now apply 3.11 to $I$.

Lemma 3.13 Any non-zero semisimple module contains a simple submodule.

Proof Let $M$ be a semisimple module over a ring $R$. As by 3.10, submodules of semisimple modules are semisimple, it suffices to treat the case where $M$ is cyclic. In that case, $M$ contains a maximal submodule $N \leq M$ by 3.12.
As $M$ is semisimple, we can find a submodule $S \leq M$ such that $M = N \oplus S$.
If $S$ is not simple, then there is a non-zero proper submodule $S' \subsetneq S$, but then $N \subsetneq N \oplus S' \subsetneq N \oplus S = M$, which contradicts the maximality of $N$.

We now come to the main result on semisimple modules, the proof is a little technical.
The most important part of the statement for us is the implication (1)=>(2) (cf. 3.16), but we give the full result for completeness.

Proposition 3.14 For a module $M$, the following statements are equivalent:

1. $M$ is semisimple
2. $M$ is a sum of simple submodules
3. $M$ is a direct sum of simple submodules

Proof
(1.) implies (2.): Let $M$ be semisimple and let $\mathrm{soc}(M)$ be the sum of all simple submodules, then as $M$ is semisimple, we get that $M=\mathrm{soc}(M) \oplus N$ for some $N \leq M$.
If $N$ is non-zero, we get that $N$ contains a simple submodule by 3.10 and 3.13, but this contradicts the definition of $\mathrm{soc}(M)$ and the fact that $\mathrm{soc}(M) \cap N = 0$.

(2.) implies (3.): Suppose $M = \sum_{i \in I} M_i$ where all $M_i$ are simple.
Consider the set of subsets $J \subseteq I$ such that $\sum_{i \in J}M_i = \bigoplus_{i \in J}M_i$. This is partially ordered by inclusion and the usual Zorn’s lemma argument works (just take unions of chains as upper bounds), so that we get a maximal element $J_{\omega}$. Suppose that $\bigoplus_{i \in J_\omega} M_i = \sum_{i \in J_\omega} M_i \subsetneq \sum_{i \in I} M_i = M$. Then for some $i_0 \in I$, we get that $M_{i_0} \not \subset \bigoplus_{i \in J_\omega} M_i$, which implies that $M_{i_0} \cap \bigoplus_{i \in J_\omega} M_i = 0$, since $M_{i_0}$ is simple and that intersection is a proper submodule. But then we get
$M_{i_0} + \bigoplus_{i \in J_\omega} M_i = M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i$, which contradicts the maximality of $J_\omega$.

(3.) implies (1.): Let $M = \bigoplus_{i\in I} M_i$ with all $M_i$ simple. Let $N \leq M$ be a submodule. We may assume that $N$ is a proper submodule.
Consider the set of subsets $J \subseteq I$ such that $N \cap \bigoplus_{i \in J} M_i = 0$. This is non-empty: as $N$ is a proper submodule, it doesn’t contain some $M_i$, but then $N \cap M_i = 0$, as $M_i$ is simple.
Now we apply (surprisingly!) Zorn’s lemma to this set, partially ordered by inclusion by taking unions as upper bounds for chains. Let $J_\omega$ be a maximal element.
Then consider $N+\bigoplus_{i \in J_\omega} M_i = N \oplus \bigoplus_{i \in J_\omega} M_i$. If this is a proper submodule of $M$, then it doesn’t contain some $M_{i_0}$ with $i_0 \in I$, and hence has zero intersection with $M_{i_0}$, since $M_{i_0}$ is simple. It follows that $N \cap M_{i_0} = 0$, $i_0 \not \in J_\omega$ and $M_{i_0} \cap M_i = 0$ for all $i \in J_\omega$, thus $M_{i_0} + \bigoplus_{i \in J_\omega}M_i= M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i$, so that by maximality of $J_\omega$, we get $N \cap (M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i) \neq 0$, so we can choose a non-zero $n$ in that intersection. Write $n=m+m'$ for some $m \in M_{i_0}$ and $m' \in \bigoplus_{i \in J_\omega} M_i$.
Then $m=n-m'$ is contained in $M_{i_0} \cap (N \oplus \bigoplus_{i \in J_\omega} M_i)$ which is zero by the choice of $i_0$.
Thus $n=m'$ is a non-zero element of $N \cap \bigoplus_{i \in J_\omega} M_i = 0$ which is impossible, thus $M=N \oplus \bigoplus_{i \in J_\omega} M_i$.

Note that the proof for the implication from (2) to (3) actually shows that if a module is a sum of simple submodules, one can find a subset of the index set such that the sum is direct and still gives the whole module.

Corollary 3.15 Direct sums of semisimple modules are semisimple.

Proof Use the equivalence between (1) and (3) in 3.14

Corollary 3.16 If $G$ is a finite group and $K$ is a field such that the characteristic of $K$ doesn’t divide the order of $G$, then any representation of $G$ over $K$ is a direct sum of irreducible subrepresentations.

Proof Follows from 1.22 and 3.14

We have already seen an instance of this phenomenon in our study of cyclic groups in the semisimple case. (cf. 2.6 and 3.7)

After this not-quite-simple proposition about semisimple modules, we return to simple properties of simple modules.

Let’s first record an observation, so that the statements we’re about to prove make sense:

Lemma 3.17 Let $R$ be any ring and let $M$ and $N$ be modules over $R$, then $\mathrm{End}_R(M)$ and $\mathrm{End}_R(N)$ are rings and if $R$ is a $K$-algebra, they are also $K$-algebras.
$\mathrm{Hom}_R(M,N)$ is a right module over $\mathrm{End}_R(M)$ (acting by precomposition) and a left module over $\mathrm{End}_R(N)$ (acting by postcomposition) and these actions are compatible, i.e. $\mathrm{Hom}_R(M,N)$ is an $(\mathrm{End}_R(N),\mathrm{End}_R(M))$-bimodule.

Proof The statement might look complicated, but all we’re doing here is just composing maps: $\mathrm{End}_R(M)$ is a ring (or $K$-algebra) under composition of maps and the module structures on $\mathrm{Hom}_R(M,N)$ are given by composing with endomorphisms from the left or the right. All properties we need follow from properties of composing linear maps

Lemma 3.18 (Schur) Let $M$ and $N$ be modules over a ring $R$ and let $f:M \to N$ be a linear map, then:

1. If $M$ is simple, then $f$ is either zero or injective.
2. If $N$ is simple, then $f$ is either zero or surjective.
3. If both $M$ and $N$ are simple, then $f$ is either zero or an isomorphism.
4. If $M$ is simple, then $\mathrm{End}_R(M)$ is a division ring. (cf. 3.17)

Proof
(1): As $M$ is simple, $\mathrm{ker}(f)$ is either $M$ or $0$.
(2): As $N$ is simple, $\mathrm{im}(f)$ is either $N$ or $0$.
(3): Follows from (1) and (2).
(4): Follows from (3).
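
To see part (4) at work over a field that is not algebraically closed, consider the simple $\mathbb{R}[C_4]$-module $\mathbb{R}^2$ on which a generator acts as rotation by 90 degrees (it is simple because the rotation has no real eigenvector). The sketch below, a hypothetical computation of my own with sympy, finds that its endomorphism ring, i.e. the set of matrices commuting with the rotation, is 2-dimensional over $\mathbb{R}$; these matrices $\begin{pmatrix} a & -c \\ c & a \end{pmatrix}$ multiply like the complex numbers $a + ci$.

```python
from sympy import Matrix, solve, symbols

rot = Matrix([[0, -1], [1, 0]])   # rotation by 90 degrees
a, b, c, d = symbols('a b c d')
A = Matrix([[a, b], [c, d]])

# solve A * rot = rot * A for the entries of A
sol = solve(list(A * rot - rot * A), [a, b, c, d], dict=True)[0]
A_comm = A.subs(sol)
free = sorted(A_comm.free_symbols, key=str)
assert len(free) == 2             # the endomorphism ring is 2-dimensional over R
```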

Despite the easy proof, Schur’s lemma is quite useful and will be a constant companion while dealing with simple modules. We give a first application.

Lemma 3.19 Let $M$ be a semisimple module that is a finite direct sum of simple submodules, and write $M \cong \bigoplus_{i=1}^n M_i^{e_i}$ where the $M_i$ are pairwise non-isomorphic. For every $i$, set $D_i=\mathrm{End}_R(M_i)$. Then we have an equality $e_i = \mathrm{dim}_{D_i}(\mathrm{Hom}_R(M_i,M))= \mathrm{dim}_{D_i}(\mathrm{Hom}_R(M,M_i))$. In particular, the exponent $e_i$ is independent of the decomposition, so the decomposition is unique up to isomorphism and permutation of the factors.

Proof  $\mathrm{Hom}_R (M_i,M)$ $=\mathrm{Hom}_R(M_i,\bigoplus_{j=1}^n M_j^{e_j})$ $\cong \bigoplus_{j=1}^n \mathrm{Hom}_R(M_i, M_j)^{e_j}$
Note that this isomorphism is $D_i$-linear, because the action of $D_i$ is given by composition in the first argument.
Schur’s lemma implies $\mathrm{Hom}_R(M_i,M_j) = 0$ unless $i=j$, so we get $\bigoplus_{j=1}^n \mathrm{Hom}_R(M_i, M_j)^{e_j} \cong \mathrm{Hom}_R(M_i, M_i)^{e_i}= D_i^{e_i}$. The case with switched arguments works in the same way.

Corollary 3.20 Let $G$ be a finite group and let $K$ be a field such that the characteristic of $K$ does not divide the order of $G$, then every finite-dimensional representation can be written as a direct sum of irreducible subrepresentations which are uniquely determined up to isomorphism, including their multiplicity.

Proof Existence follows from 3.16 and uniqueness from 3.19

The last corollary justifies why one pays a lot of attention to irreducible representations, especially when Maschke’s theorem applies.

Semisimple Rings and Algebras

So far, we have just studied (semi)simple modules. A general philosophy in ring theory is to study relations between the internal structure of a ring and the structure of its modules. Whenever there’s a notion for modules, one possible definition for a ring-theoretic property is obtained by just considering a ring as a left, right or two-sided module over itself. (For technical reasons, we will work with right ideals and modules in this section. It will allow us to skip some passage from a ring to its opposite ring in a future post. One can dualize all statements by using that left $R$-modules are right $R^{op}$-modules, where $R^{op}$ is the opposite ring which has reversed order of multiplication. Note that the group algebra $K[G]$ is isomorphic to its own opposite ring, via the map given on the basis $G$ by $g \mapsto g^{-1}$.)

If we apply this to the properties we’ve been studying, we get that a ring that is simple as a right module over itself is just a division ring. The way to see this is that every non-zero element must generate the whole ring as a right ideal (and by group theory it’s enough to have all right inverses). We already have a name for that, so that’s nothing new. This doesn’t happen with the following definition:

Definition 3.21 A ring $R$ is called semisimple if it is semisimple as a right module over itself.

We’re deliberately not being careful with the chirality here: Theoretically, one should define left and right semisimple, but as we shall see, they are equivalent.

We can apply the theory we have developed for semisimple modules to show how this property is reflected in the structure of the modules over a ring:

Lemma 3.22 A ring $R$ is semisimple if and only if all right modules over $R$ are semisimple.

Proof One direction is obvious. For the other one, note that if $R$ is semisimple, 3.15 implies that all direct sums of copies of $R$, i.e. all free modules, are semisimple. By 3.10, this also shows that all quotients of free modules are semisimple. But every module is a quotient of a free module.

Remarkably, this tells us that it would have been sufficient to prove Maschke’s theorem for just one single representation, the one corresponding to the $K[G]$-module $K[G]$, to get decompositions into irreducible representations. (Even infinite-dimensional ones.)

Lemma 3.23 If a ring is a direct sum of non-zero right ideals, then the sum is finite.

Proof Suppose $R=\bigoplus_{i \in I} J_i$, then we have $1=(a_i)_{i \in I}$ where all but finitely many $a_i$ are zero. Let $I' \subseteq I$ be the subset of indices $i$ such that $a_i \neq 0$. Then for any $r \in R$, we have $r= 1 \cdot r = \sum_{i \in I'} a_ir$. Because the sum is direct, this expression is the unique way to write $r$ as a sum of elements in the $J_i$, where $i$ ranges over $I$. If some $j \in I \setminus I'$ had $J_j \neq 0$, a non-zero element of $J_j$ would admit two different such expressions, so $I=I'$ and $I$ is finite.

Corollary 3.24 A semisimple ring $R$ is a finite direct sum of simple right $R$-modules (also called minimal right ideals in this case).

Proof Apply 3.14 and then 3.23.

Corollary 3.25 Let $R$ be a semisimple ring, then every simple right $R$-module $M_i$ occurs as a direct summand of $R$ (as a right module over itself) and the multiplicity is equal to the dimension of $M_i$ over its endomorphism ring (which is a division ring). In particular, that dimension is finite.

Proof Note that 3.24 implies that 3.19 is applicable to $R$ (by which we always mean as a right module over itself in this proof).
Let $e_i$ be the multiplicity with which $M_i$ occurs in the decomposition of $R$ as a direct sum of simple submodules. By 3.19 $e_i$ is independent of the decomposition, but it might be zero. But 3.19 tells us that $e_i=\mathrm{dim}_{D_i}(\mathrm{Hom}_R(R,M_i))=\mathrm{dim}_{D_i}(M_i)$ which also tells us two things:
1) The RHS is finite
2) The LHS is non-zero, as $M_i \neq 0$.

We want to apply this to the case where $R$ is an algebra over a field $K$, but for this it would be nice to know that the $D_i$ are finite-dimensional over $K$. We need some easy results on finiteness conditions.

Lemma 3.26 Let $K$ be a field and let $M$ and $N$ be modules over a finite-dimensional $K$-algebra $A$. If $M$ and $N$ are finitely generated over $A$, then they are finite-dimensional over $K$, and so is $\mathrm{Hom}_A(M,N)$.

Proof $M$ being finitely generated means that we can find an $A$-linear surjection $A^n \to M$. As $A$ is a $K$-algebra, this surjection is also $K$-linear. $A^n$ is finite-dimensional over $K$ because $A$ is, and this implies that $M$ is finite-dimensional over $K$. Now let $S$ be a finite generating system of $M$, then the map $\mathrm{Hom}_A(M,N) \to N^S, f \mapsto (f(s))_{s \in S}$ is $K$-linear. It is also injective, because any map from $M$ is determined by where it sends the generating system $S$. $N^S$ is a finite-dimensional vector space by the previous part, thus $\mathrm{Hom}_A(M,N)$ is finite-dimensional.

Corollary 3.27 Let $A$ be a finite-dimensional algebra over a field $K$, then all simple modules and their endomorphism rings are finite-dimensional over $K$.

Lemma 3.28 Let $A$ be a finite-dimensional semisimple algebra over a field $K$ and let $M_1, \dots, M_n$ be a list of all simple modules, up to isomorphism. Let $D_i=\mathrm{End}_A(M_i)$ be their endomorphism rings. Then $\mathrm{dim}_K(A)=\sum_{i=1}^n \mathrm{dim}_K(M_i)^2/\mathrm{dim}_K(D_i)$ (where all dimensions are finite.)

Proof 3.25 implies that $A \cong \bigoplus_{i=1}^n M_i^{e_i}$ where $e_i=\mathrm{dim}_{D_i}(M_i)$; this implies that $\mathrm{dim}_K(A)= \sum_{i=1}^n \mathrm{dim}_K(M_i)\mathrm{dim}_{D_i}(M_i)$. (3.27 tells us that we don’t have to worry about infinite dimensions.)
So the only thing left to show is that $\mathrm{dim}_K(D_i) \mathrm{dim}_{D_i}(M_i)=\mathrm{dim}_K(M_i)$. But this is clear: we have $M_i \cong D_i^{e_i}$ as $D_i$-modules, so we just compare the $K$-dimension of both sides.
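
Numerically (a hedged arithmetic check with data I supply, not derived in the text): for $A=\mathbb{R}[C_3]$ the simple modules are the trivial one $\mathbb{R}$ and a $2$-dimensional one with endomorphism ring isomorphic to $\mathbb{C}$, and the formula reads $3 = 1^2/1 + 2^2/2$.

```python
from fractions import Fraction

dims_M = [1, 2]   # dim_R of the simple R[C_3]-modules
dims_D = [1, 2]   # dim_R of their endomorphism rings R and C

# dim_K(A) = sum over i of dim_K(M_i)^2 / dim_K(D_i)
total = sum(Fraction(m * m, d) for m, d in zip(dims_M, dims_D))
assert total == 3   # = dim_R R[C_3] = |C_3|
```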

The following lemma tells us that we can leave out the factors $\mathrm{dim}_K(D_i)$ if $K$ is algebraically closed.

Lemma 3.29 If $K$ is an algebraically closed field, then every finite-dimensional division algebra over $K$ is one-dimensional, i.e. $K$ itself.

Proof Let $D$ be a finite-dimensional division algebra over $K$.
Let $d \in D$ and consider the $K$-subalgebra $K[d]$ generated by $d$. Every element of $K[d]$ is a polynomial in $d$, so $K[d]$ is a quotient of the polynomial ring $K[x]$ via the evaluation map $\mathrm{ev}_d: x \mapsto d$. But then $\mathrm{ker}(\mathrm{ev}_d)$ is a non-zero prime ideal: it is non-zero because the image is finite-dimensional over $K$, and prime because the image doesn’t contain zero divisors. Since $K$ is algebraically closed, this implies $\mathrm{ker}(\mathrm{ev}_d)=(x-\lambda)$ for some $\lambda \in K$, so $d=\lambda \in K$.

We note the following corollary to 3.28 and 3.29

Corollary 3.30 Let $A$ be a finite-dimensional semisimple algebra over a field $K$ and let $M_1, \dots, M_n$ be a list of all simple modules, up to isomorphism. Then $\mathrm{dim}_K(A)\leq \sum_{i=1}^n \mathrm{dim}_K(M_i)^2$ and we have equality if $K$ is algebraically closed.

If we put in some knowledge about finite-dimensional division algebras over $\Bbb R$ (namely, the fact that the only ones are $\Bbb R, \Bbb C, \Bbb H$, so the dimension is at most 4), we also get the following:

Corollary 3.31 Let $A$ be a finite-dimensional semisimple algebra over $\Bbb R$ and let $M_1, \dots, M_n$ be a list of all simple modules, up to isomorphism. Then $\frac{1}{4} \sum_{i=1}^n \mathrm{dim}_\Bbb{R}(M_i)^2 \leq \mathrm{dim}_\Bbb{R}(A)\leq \sum_{i=1}^n \mathrm{dim}_\Bbb{R}(M_i)^2$.

These corollaries translate into statements about representations when we apply them to the group algebra $K[G]$.

Let’s close this post by recapitulating what we have shown about representations in the case where $G$ is finite and the characteristic of $K$ doesn’t divide the order of $G$:

• There are finitely many irreducible representations up to isomorphism (which are all finite-dimensional)
• Every irreducible representation occurs as a direct summand of the so called “regular representation”, which is the representation corresponding to $K[G]$ as a module over itself.
• Every representation, even an infinite-dimensional one, is a direct sum of irreducible subrepresentations.
• We know that for finite-dimensional representations the decomposition into a direct sum of irreducibles is unique up to isomorphism of the factors, including multiplicities. (I didn’t want to deal with cardinals for the infinite-dimensional case)
• We have a nice formula that relates the dimension of irreducibles, the dimension of their endomorphism rings and the order of $G$ (which is obviously the dimension of $K[G]$). If we don’t want to talk about the endomorphism rings, we still have an inequality, which is an equality in the algebraically closed case.

In the next post, we will continue our study of semisimple rings and give applications, e.g. by describing the number of irreducible representations in terms of the group $G$.

The Group Algebra: Another Perspective on Representations

In this post, we introduce the group algebra as another way to view representations and illustrate the usefulness of this approach by studying representations of cyclic groups by elementary ring theory. This is the second part of a series that started with this post. The numbering is consecutive, i.e. when I refer to some result 1.x, it can be found in that post.

A Review of Modules in Linear Algebra

We will begin by reviewing what modules over the polynomial ring $K[X]$ mean in terms of linear algebra, as this will be helpful for motivating the module-theoretic perspective on representations.

Let $K$ be a field and $V$ be a (say finite-dimensional) vector space over $K$ and let $A$ be a $K$-linear endomorphism of $V$ (so after choosing a basis, we can think of $A$ as a square matrix.) Suppose we wish to understand $A$, e.g. find a basis such that $A$ has a particularly nice matrix representation with respect to that basis.

From the pair $(V,A)$, we can define a $K[X]$-module structure on $V$ by defining $K[X]$-scalar multiplication via $(\sum_{i=0}^n \lambda_i X^i)v= \sum_{i=0}^n \lambda_i A^i(v)$, where $A^i(v)$ means that we apply $A$ to $v$ $i$ times.

Conversely, given a $K[X]$-module $M$, we can think of it as a $K$-vector space $V=M$ by restricting the scalar multiplication to $K \subset K[X]$. We also get a $K$-linear endomorphism of $V$ given by multiplication with $X$.

These constructions are inverse to each other: Going from a pair $(V,A)$ to the associated $K[X]$-module, multiplication with $X$ is precisely the endomorphism we started with.
For a $K[X]$-module, because every polynomial is a linear combination of powers of $X$, we only need to know $K$-scalar multiplication and how $X$ acts to reconstruct the $K[X]$-scalar multiplication.

Thus, we can think of pairs $(V,A)$ of vector spaces equipped with an endomorphism as $K[X]$-modules, and we can translate between notions for endomorphisms and notions for $K[X]$-modules.
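The $K[X]$-scalar multiplication coming from a pair $(V,A)$ can be sketched in code. This is an ad-hoc minimal implementation (the names `mat_vec` and `poly_act` are made up for the illustration), with matrices as nested lists and polynomials as coefficient lists:

```python
# A polynomial p = sum_i c_i X^i acts on v as sum_i c_i A^i(v).

def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def poly_act(coeffs, A, v):
    """Apply sum_i coeffs[i] * A^i to v (coeffs[i] is the coefficient of X^i)."""
    result = [0] * len(v)
    power = v[:]                                     # A^0(v) = v
    for c in coeffs:
        result = [r + c * p for r, p in zip(result, power)]
        power = mat_vec(A, power)                    # next: A^(i+1)(v)
    return result

A = [[0, -1], [1, 0]]    # rotation by 90 degrees, so A^2 = -I
v = [1, 0]
# (X^2 + 1) acts as A^2 + I = 0, so (X^2 + 1) * v = 0 in the K[X]-module:
assert poly_act([1, 0, 1], A, v) == [0, 0]
```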

Here’s an excerpt of a possible dictionary one might use for translation:

Pairs $(V,A)$ of a vector space and an endomorphism ↔ $K[X]$-modules $M$:

• Subspaces $W$ that are invariant under $A$, i.e. $A(W) \subset W$ ↔ $K[X]$-submodules
• For pairs $(V,A)$, $(W,B)$: $K$-linear maps $f:V \to W$ such that $f \circ A = B \circ f$ ↔ $K[X]$-linear maps
• For pairs $(V,A)$, $(W,B)$: $K$-linear isomorphisms $f:V \to W$ such that $A =f^{-1} \circ B\circ f$ ↔ $K[X]$-linear isomorphisms
• The eigenspace of $A$ associated to $\lambda$ ↔ the submodule of $M$ consisting of all elements annihilated by $X-\lambda$
• The minimal polynomial of $A$ ↔ the unique monic generator of the annihilator ideal associated to $M$, i.e. the unique monic polynomial $P \in K[X]$ of minimal possible degree such that $P(v)=0$ for all $v \in V$

One could add many more rows.

The important part for finding, e.g., a nice basis for $A$ is the third row (the one on isomorphisms): if we can find a module isomorphic to the $K[X]$-module associated to $(V,A)$, together with a basis in which multiplication with $X$ has a nice matrix representation, then that row tells us that this matrix is also a possible matrix representation for $A$!

Now as $K[X]$ is a PID, finitely generated modules over $K[X]$ (in particular those modules that are finite-dimensional over $K$) are very well-understood. There’s a structure theorem that tells us that they are finite direct sums of modules of the form $K[X]/(f)$, where $f \in K[X]$. (One can also put some conditions on $f$ to make the decomposition unique.) From there, one can easily deduce existence and uniqueness of canonical forms such as the Jordan normal form and the Frobenius normal form, as well as properties of the minimal polynomial such as the Cayley–Hamilton theorem.

The Group Algebra

Let $G$ be a group and $V$ be a representation of $G$ over a field $K$. By abuse of notation, we denote the action of an element $g \in G$ on a vector $v$ by $gv$. Let $\lambda \in K$, $g \in G$ and $v \in V$. Then there seems to be only one sensible way to define what we mean by $(\lambda+g)v$: clearly, this has to be $\lambda v + gv$ if any sensible rules are to hold.
But what is the expression $\lambda+g$ supposed to mean? After all, we can’t just add an element of a group and an element of a field.

Or can we?

Definition 2.1 Let $G$ be a group (we denote the neutral element by $1$) and $K$ be a field, then the group algebra $K[G]$ is defined as the vector space over $K$ freely generated by the elements of $G$. We denote the elements of the basis corresponding to elements in $G$ by the same symbols. Group multiplication $G \times G \to G$ defines a multiplication of basis elements that we extend linearly in each argument. This defines a multiplication $K[G] \times K[G] \to K[G]$ that extends the multiplication of $G$ and makes $K[G]$ into a $K$-algebra.

We leave the verification of the ring axioms to the reader. Intuitively, the group algebra $K[G]$ consists of finite formal linear combinations of elements in $G$, i.e. we can write them as $\sum_{g\in G} \lambda_g g$ where all but finitely many coefficients vanish. To compute a product of two such expressions, we expand using distributivity and then use the group multiplication to multiply the products of the basis vectors.
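The “expand using distributivity, then multiply basis vectors” recipe can be sketched as follows. This is an illustrative toy model of $K[G]$ (the names `ga_mult` and `group_op` are ad hoc), storing an element $\sum_g \lambda_g g$ as a dictionary from group elements to coefficients:

```python
from itertools import product

def ga_mult(a, b, group_op):
    """Multiply two group-algebra elements given as {group element: coefficient}
    dicts: expand by distributivity, multiply basis elements via group_op."""
    result = {}
    for (g, x), (h, y) in product(a.items(), b.items()):
        k = group_op(g, h)
        result[k] = result.get(k, 0) + x * y
    return {g: c for g, c in result.items() if c != 0}   # drop zero coefficients

# Example: G = Z/3 written additively, over Q.
op = lambda g, h: (g + h) % 3
a = {0: 1, 1: 2}   # 1*e + 2*g
b = {2: 3}         # 3*g^2
# (e + 2g) * 3g^2 = 3g^2 + 6g^3 = 6*e + 3*g^2:
assert ga_mult(a, b, op) == {2: 3, 0: 6}
```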

Lemma 2.2 For every representation $V$ of $G$ over $K$, there is a unique way to define a $K[G]$-module structure that extends the given group action and $K$-scalar multiplication. Conversely, every $K[G]$-module gives rise to a representation in a canonical way. Under this identification, a morphism of representations corresponds precisely to a $K[G]$-linear map.

Proof Given a representation $V$, $v \in V$ and an element $\sum_{g \in G}\lambda_g g \in K[G]$ (this means the sum is finite), the only possible way to define $(\sum_{g \in G}\lambda_g g)v$ such that the module axioms hold and $K \subset K[G]$ acts via the given scalar multiplication and $G \subset K[G]$ acts via the group action is to set $(\sum_{g \in G}\lambda_g g)v=\sum_{g \in G} \lambda_g(g(v))$.
Note that the RHS is just defined in terms of the group action and the vector space structure. One checks that this defines a module structure, using the linearity of the group action.
Conversely, if we have a $K[G]$-module $V$, we can turn $V$ into a $K$-vector space by restricting scalars to the subring $K \subset K[G]$.
We can also restrict the scalar multiplication to the subset $G \subset K[G]$. Associativity and unitality in the axioms for a module imply that this defines a group action.
Finally, the group action and the vector space operations are compatible due to the equality $g\lambda = \lambda g$ in $K[G]$ together with associativity and distributivity of the scalar multiplication.
The correspondence between morphisms of representations and $K[G]$-linear maps follows by similar arguments: the idea is that every element of $K[G]$ is a $K$-linear combination of elements of $G$, so it’s enough that a map commutes with $K$-scalar multiplication and the $G$-action to see that it preserves $K[G]$-scalar multiplication.

The relation between the description of linear-algebraic objects as $K[X]$-modules and representations as $K[G]$-modules is as follows:
In the former, we looked at an arbitrary endomorphism without any condition, but only at one endomorphism at a time; that’s why the $K$-algebra of choice to describe such an object is the polynomial algebra $K[X]$, which is freely generated by one element $X$, i.e. we don’t impose any relations.
For group representations, we consider many endomorphisms (actually automorphisms) at once, subject to all the relations that hold in the group $G$. That’s why $K[G]$ isn’t necessarily generated by one element and by inheriting the multiplication from $G$, $K[G]$ also inherits all the relations between elements in $G$.

With lemma 2.2, we have added another characterization of representations to our collection (cf. lemma 1.3).
If one is really careful with the constructions in the lemma, one sees that they define an isomorphism of categories, which is just a formalization of the intuition that representations and $K[G]$-modules are exactly the same, just with a different point of view.

Let’s also mention the universal property of the group algebra, which can be quite useful even if you’re not a category-aficionado and implies lemma 2.2 as a special case.

Lemma 2.3 Let $K$ be a field and $G$ be a group, then for any $K$-algebra $A$ and every group homomorphism $\varphi:G \to A^\times$ to the group of units, there is a unique $K$-algebra homomorphism $K[\varphi]:K[G] \to A$ that extends $\varphi$.

Proof The same argument as in lemma 2.2 applies: the only way we can define an extension that is $K$-linear is by sending $\sum_{g \in G} \lambda_g g$ to $\sum_{g \in G} \lambda_g \varphi(g)$; this just follows from the fact that $G$ is a basis for $K[G]$. One checks that, because $\varphi$ is a group homomorphism and the multiplication in $K[G]$ is inherited from $G$, this map also respects the unit element and multiplication, so it is a $K$-algebra homomorphism.

To see how this implies one direction in lemma 2.2, note that for a vector space $V$, the endomorphism ring $\mathrm{End}_K(V)$ is a $K$-algebra (here multiplication is composition) and for the group of units, we get $\mathrm{End}_K(V)^\times=\mathrm{GL}(V)$.
Thus if we have a group homomorphism $\varphi:G \to \mathrm{GL}(V)$, the universal property tells us that there is a unique $K$-algebra homomorphism $K[\varphi]: K[G] \to \mathrm{End}_K(V)$. Now we can define a module structure by uncurrying:
Define, for $r \in K[G]$ and $v \in V$: $r \cdot v := K[\varphi](r)(v)$.
The fact that $K[\varphi]$ is a ring homomorphism translates neatly into the module axioms, and $K$-linearity gives us that the $K$-scalar multiplication on $V$ remains the same.

Having this perspective is quite useful, because there are a lot of constructions for modules that now carry over directly to representations: we can form direct sums and products of representations, quotients etc. and all the properties of those constructions that we know to hold for modules also hold in this case. For example, subrepresentations as defined in 1.18 are the same as $K[G]$-submodules.

Representations of Cyclic Groups

We will use the accessible example of cyclic groups to show how the structure of the group algebra contains information about representations.

Lemma 2.4 If $G=\langle g \rangle$ is cyclic of order $n$, then $K[X]/(X^n-1) \cong K[G]$, where the isomorphism sends $X$ to $g$.

Proof One can take the map $K[X] \to K[G]$ that sends $X$ to $g$ and compute the kernel. Since $g$ generates $G$, every element of $K[G]$ is a polynomial in $g$, which implies the surjectivity of that map.
Let’s instead show that both satisfy the same universal property:

• If $A$ is any $K$-algebra, then a $K$-algebra homomorphism $K[G] \to A$ corresponds to a group homomorphism $G \to A^\times$ by lemma 2.3.
Since $G=\langle g\rangle$ is generated by $g$, a group homomorphism is uniquely determined by where it sends $g$. As $g$ has order $n$, we can send it precisely to those elements $a \in A$ such that $a^n=1$ (this condition automatically gives us that $a\in A^\times$). Thus $K[G]$ has the following universal property for this choice of $G$:
For any $K$-algebra $A$ and every element $a \in A$ such that $a^n=1$, there’s a unique $K$-algebra homomorphism $K[G] \to A$ that sends $g$ to $a$.
• If $A$ is still any $K$-algebra, then by the homomorphism theorem, a $K$-algebra homomorphism $K[X]/(X^n-1) \to A$ is the same as a $K$-algebra homomorphism $K[X] \to A$ that sends $X^n-1$ to $0$. A $K$-algebra homomorphism $K[X] \to A$ is uniquely determined by where it sends $X$, and we can send $X$ to any element $a \in A$; but due to the condition that $X^n-1$ must be sent to $0$, for homomorphisms from $K[X]/(X^n-1)$ we can send $X$ to precisely those elements $a \in A$ such that $a^n=1$.
Thus we have proved:
For any $K$-algebra $A$ and every element $a \in A$ such that $a^n=1$, there’s a unique $K$-algebra homomorphism $K[X]/(X^n-1) \to A$ that sends $X$ to $a$.

At this point, to finish the proof, one can either mumble something about the Yoneda lemma with a smug expression or one can make the usual argument why two objects with the same universal property are isomorphic. (This should be familiar to anyone who has seen e.g. why the tensor product is unique)
Let’s do the latter: because of the universal property of $K[X]/(X^n-1)$, we can find a unique $K$-algebra homomorphism $\varphi:K[X]/(X^n-1) \to K[G]$ that sends $X$ to $g$. This also works in the other direction: we get a unique $K$-algebra homomorphism $\theta: K[G] \to K[X]/(X^n-1)$ that sends $g$ to $X$. Then $\varphi \circ \theta$ is a $K$-algebra homomorphism $K[G] \to K[G]$ that sends $g$ to itself.
By the universal property, there can only be one such homomorphism, but we know that the identity is an example. Therefore $\varphi \circ \theta = \mathrm{id}$.
By the same argument, $\theta \circ \varphi = \mathrm{id}$.
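As a numerical illustration of Lemma 2.4 (not a proof): multiplication in $K[G]$ for cyclic $G$ of order $n$ is cyclic convolution of coefficient vectors, which is exactly polynomial multiplication with $X^n$ reduced to $1$. An ad-hoc sketch with coefficient lists of length $n$:

```python
def cyclic_mult(a, b):
    """Multiply in K[X]/(X^n - 1), equivalently in K[Z/n]: coefficient lists
    of length n, exponents taken mod n (i.e. X^n = 1, i.e. g^n = e)."""
    n = len(a)
    out = [0] * n
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[(i + j) % n] += x * y
    return out

# n = 3: (1 + X)(1 + X + X^2) = 1 + 2X + 2X^2 + X^3 = 2 + 2X + 2X^2 mod X^3 - 1.
assert cyclic_mult([1, 1, 0], [1, 1, 1]) == [2, 2, 2]
```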

Exercise Use a similar argument to describe the group algebra $K[G]$, where $G$ is a product of two cyclic groups, as a quotient of the polynomial ring in two variables. Why can this approach not work in this form for nonabelian groups?

We can use this to describe the representations of cyclic groups by decomposing $K[X]/(X^n-1)$ with the Chinese remainder theorem. If we do that, we will end up with a product of rings, which is one of the reasons why it’s useful to think about modules over products of rings. If $R$ and $S$ are rings, then for every pair $(M, N)$, where $M$ is an $R$-module and $N$ is an $S$-module, we can make $M \oplus N$ into an $R \times S$-module by having $R$ act on the left factor and $S$ on the right factor. The following lemma tells us that every $R \times S$-module arises in such a way:

Lemma 2.5 If $R$ and $S$ are rings, then every $R\times S$-module is isomorphic to a direct sum $M \oplus N$, where $M$ is an $R$-module and $N$ is an $S$-module, such that $R\times \{0\} \subset R \times S$ acts only on the first factor and $\{0\} \times S \subset R \times S$ only on the second one.
$M$ and $N$ are canonically determined.

Proof Let $X$ be a $T= R \times S$-module. Consider the central idempotents $e:=(1,0), f:=(0,1)$. Then $Te=eT=R \times \{0\}, Tf=fT=\{0\} \times S$ and $Te \cap Tf= \{0\}, Te+Tf=T$. Setting $M=eX$ and $N=fX$, we get that $eX \cap fX = 0, eX+fX=X$, so $X = eX \oplus fX = M \oplus N$. It’s clear that $S$ acts trivially on $eX$ and $R$ acts trivially on $fX$, which shows the statement. We can use the same $e$ and $f$ for all modules, which makes this decomposition canonical.

(Note: The above construction can be enhanced into a category equivalence of $(R\times S)\textrm{-}\mathbf{Mod}$ and $R\textrm{-}\mathbf{Mod} \times S\textrm{-}\mathbf{Mod}$)

Now let’s finally describe the representations of cyclic groups over $\mathbb{C}$!

If $G$ is cyclic of order $n$, generated by $g$, then by Lemma 2.4, $\mathbb{C}[G] \cong \mathbb{C}[X]/(X^n-1)$, where the isomorphism sends $g$ to $X$. Let $\zeta_n = \exp(2\pi i/n)$, then we have the factorization $X^n-1=\prod_{k=0}^{n-1}(X-\zeta_n^k)$, so by the Chinese remainder theorem, we get
$\mathbb{C}[X]/(X^n-1) \cong \prod_{k=0}^{n-1} \Bbb{C}[X]/(X - \zeta_n^k)$
Note that this is an isomorphism both of rings and of $\mathbb{C}[X]$-modules, which means that $X$ is sent to $X$ in each component.

Therefore, by lemma 2.5, every $\mathbb C[G]$-module is a direct sum of $\Bbb{C}[X]/(X - \zeta_n^k)$-modules where $k$ varies. But for each $k$, we have $\Bbb{C}[X]/(X-\zeta_n^k) \cong \Bbb{C}$ via sending $X$ to $\zeta_n^k$. Modules over a field $F$ are easy to understand: they are just (possibly infinite) direct sums of copies of $F$. Thus we get that every $\Bbb{C}[X]/(X-\zeta_n^k)$-module is a direct sum of copies of $\Bbb{C}[X]/(X-\zeta_n^k) \cong \Bbb{C}$.

Through all the isomorphisms, we have kept track of where the generator $g$ is sent: first to $X$, then $X$ to $X$ (reduced modulo a different factor in each component of the CRT isomorphism), then $X$ to $\zeta_n^k$. This means that $g$ acts on the (one-dimensional) $\mathbb{C}[G]$-module corresponding to $\Bbb{C}[X]/(X-\zeta_n^k)$ by multiplication with $\zeta_n^k$. Since in general all modules are direct sums of such modules (for different $k$), this means that $g$ acts, in a suitable basis, as a diagonal matrix whose diagonal entries are $n$-th roots of unity. (Even for infinite-dimensional representations.)
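This diagonal action can be checked numerically: in the regular representation, $g$ acts on $\Bbb{C}^n$ as the cyclic shift, and the vectors $v_k = (1, z^k, z^{2k}, \dots)$ with $z = e^{2\pi i/n}$ are eigenvectors with $n$-th roots of unity as eigenvalues. A small sketch (all names ad hoc):

```python
import cmath

n = 5
z = cmath.exp(2j * cmath.pi / n)   # primitive n-th root of unity

def shift(v):
    """Action of the generator g on C^n in the regular representation:
    cyclic shift of coordinates."""
    return v[1:] + v[:1]

# v_k = (z^0, z^k, z^2k, ...) is an eigenvector of the shift with eigenvalue z^k:
for k in range(n):
    v = [z ** (j * k) for j in range(n)]
    expected = [z ** k * x for x in v]
    assert all(abs(a - b) < 1e-9 for a, b in zip(shift(v), expected))
```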

We can also say something about a more general setting. Suppose that the characteristic of $K$ does not divide $n$. Then $X^n-1$ has distinct roots, so we can factorize $X^n-1=f_1 \cdots f_k$ where the $f_i$ are irreducible and pairwise distinct. Doing the same Chinese remainder theorem argument, we get that:

$K[G] \cong \prod_{i=1}^k K[X]/(f_i)$. Now as the $f_i$ are irreducible, $K[X]/(f_i)$ will be a field and the dimension over $K$ will be equal to the degree of $f_i$, so we can again appeal to linear algebra and get the following result:

Lemma 2.6 Let $G$ be a cyclic group of order $n$ and let $K$ be a field such that the characteristic of $K$ does not divide $n$, and let $X^n-1=f_1 \cdot \ldots \cdot f_k$ be the factorization of $X^n-1$ into irreducibles. Then for every $i$, there is a $K[G]$-module corresponding to $f_i$ such that the dimension of the module is equal to the degree of $f_i$ and, in general, every $K[G]$-module is a direct sum of such modules.

Example 2.7 For $K=\mathbb{R}$, the only factors that can occur for $X^n-1$ are $(X-1),(X+1)$ and quadratic factors of the form $X^2-2\cos(2\pi k/n)X+1$. By choosing a clever basis for the modules corresponding to the quadratic factors, one obtains rotation representations as in example 1.6 (though the angle of rotation will be $2 \pi k/n$ instead of $2 \pi/n$ in the example.)

Example 2.8 For $K=\mathbb{Q}$, the factorization of $X^n-1$ is well-known, it factors as $X^n-1=\prod_{d \mid n} \Phi_d(X)$, where $\Phi_d(X)$ is the $d$-th cyclotomic polynomial. Thus the number of irreducible representations of $G$ over $\Bbb{Q}$ is equal to the number of divisors of $n$ and for each divisor $d$, there’s an irreducible representation of degree $\varphi(d)$.
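Both the factorization $X^n-1=\prod_{d \mid n} \Phi_d(X)$ and the degrees $\deg \Phi_d = \varphi(d)$ can be verified computationally. A self-contained sketch using exact integer polynomial arithmetic (coefficient lists, lowest degree first; all helper names are ad hoc), computing $\Phi_d$ recursively as $(X^d-1)$ divided by the $\Phi_e$ for proper divisors $e \mid d$:

```python
from functools import lru_cache, reduce
from math import gcd

def poly_mul(a, b):
    """Multiply integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_divexact(a, b):
    """Exact long division of integer polynomials (b monic, b divides a)."""
    a = a[:]
    q = [0] * (len(a) - len(b) + 1)
    for i in range(len(q) - 1, -1, -1):
        q[i] = a[i + len(b) - 1] // b[-1]
        for j, y in enumerate(b):
            a[i + j] -= q[i] * y
    assert all(c == 0 for c in a)    # division really was exact
    return q

@lru_cache(maxsize=None)
def cyclotomic(d):
    """d-th cyclotomic polynomial: (X^d - 1) / prod of Phi_e, e | d, e < d."""
    num = [-1] + [0] * (d - 1) + [1]          # X^d - 1
    for e in range(1, d):
        if d % e == 0:
            num = poly_divexact(num, cyclotomic(e))
    return num

n = 12
divisors = [d for d in range(1, n + 1) if n % d == 0]
prod = reduce(poly_mul, (cyclotomic(d) for d in divisors))
assert prod == [-1] + [0] * (n - 1) + [1]     # prod over d | 12 equals X^12 - 1

phi = lambda d: sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)
assert all(len(cyclotomic(d)) - 1 == phi(d) for d in divisors)
```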

We can also use this approach to say something about representations of cyclic groups in the case where the characteristic divides the group order. For simplicity, we just treat the case that $K$ has characteristic $2$ and $G$ is cyclic of order $2$. The factorization of $X^2-1$ is just $(X-1)^2$ and we get
$K[G] \cong K[X]/(X-1)^2$. Here the Chinese remainder theorem doesn’t help.
But one can apply the structure theorem for finitely generated modules over a PID mentioned in the first section (noting that every $K[X]/(X-1)^2$-module is also a $K[X]$-module) to get that every finitely generated $K[X]/(X-1)^2$-module is a direct sum of copies of $K[X]/(X-1)$ and $K[X]/(X-1)^2$. If we look at the action of $X$ (which corresponds to the generator of $G$) on these modules, we see that it acts by multiplication with $1$ on $K[X]/(X-1)$, i.e. via the identity (we say “trivially”).
The action on $K[X]/(X-1)^2$ is more interesting: Using the basis given by (the residue classes of) $1$ and $X-1$, we see that the action of $X$ corresponds to a transvection action $g\begin{pmatrix}a\\b\end{pmatrix}=\begin{pmatrix}a+b\\b\end{pmatrix}$ where $g$ is a generator of $G$ (cf. example 1.7)
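A small computation mod 2 confirms this picture: the transvection matrix has order $2$, but it differs from the identity by a non-zero nilpotent matrix, so it is not diagonalizable (an ad-hoc sketch with $2\times 2$ matrices over $\mathbb{F}_2$):

```python
def mat_mul_mod2(A, B):
    """Multiply 2x2 matrices with entries taken mod 2."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % 2 for j in range(2)]
            for i in range(2)]

g = [[1, 1], [0, 1]]    # transvection: sends (a, b) to (a + b, b)
I = [[1, 0], [0, 1]]

assert mat_mul_mod2(g, g) == I                         # g has order 2 in char 2
N = [[(g[i][j] - I[i][j]) % 2 for j in range(2)] for i in range(2)]
assert N != [[0, 0], [0, 0]]                           # g is not the identity...
assert mat_mul_mod2(N, N) == [[0, 0], [0, 0]]          # ...but g - id is nilpotent
```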

We have seen how the algebraic structure of the group algebra can help to understand representations and in our example of cyclic groups, it turned out that when the conditions for Maschke’s theorem are satisfied, the group algebra is a product of fields.
We will investigate the structure of the group algebra in more detail in future posts and see that this was not a coincidence.

A First Impression of Group Representations

This blog post provides mostly some motivation, basic definitions and examples for group representations, up to Maschke’s theorem. Only familiarity with linear algebra and elementary group theory is required for understanding the main part of this post. However, there are some examples for readers with more background which can be safely ignored. The same is true for all categorical remarks.

This will be the first part of a series.

Introduction

The reason to care about groups is because they act on objects. Group actions arise in many different contexts and can provide insight (into the objects as well as into the groups which act on them).

The basic definition of a group action is an action on a set. A set can be thought of as an object without any structure except size (i.e. cardinality). For finite sets, this is more structure than it might seem: we can use counting and divisibility arguments, which lead to strong results on $p$-groups such as the Sylow theorems, to the easy-to-prove but ubiquitous orbit-stabilizer theorem, and to Burnside’s lemma, which has nontrivial combinatorial implications.

Group actions are everywhere in pure mathematics and frequently the object which is acted upon has more structure than just being a set, e.g. a topological space. In these situations, the natural thing to investigate (or to require, if you’re writing a definition) is some compatibility between the group action and the structure on the object. In the example of a topological space, one would require the action to be continuous.

A common structure is that of a vector space. The usefulness of vector spaces seems to stem from the fact that they are both very well understood and still give one a lot of tools to work with: we have duals, tensor products, traces, determinants, eigenvalues etc. Thus, a technique that is used sometimes is “linearization”, in which one tries to reduce a problem to linear algebra or at least gain insight by using linear-algebraic methods. Examples are the tangent space of a smooth manifold or variety (and for smooth maps the derivative) or the linearization of a nonlinear ODE.

Representations of groups can be thought of as a linearization of group theory, or more precisely as a linearization of group actions: one can define them as actions of groups on vector spaces that respect the linear structure. They arise with a ubiquity comparable to (nonlinear, merely “set-theoretic”) group actions. When they do, they are arguably even more useful, since vector spaces have a lot more structure than sets, as described in the previous paragraph.

Group Representations

Definition 1.1 Let $K$ be a field. Then for a group $G$, a ($K$-linear) representation on a $K$-vector space $V$ is a map $\varphi: G \times V \to V$ that satisfies:

• $\forall v \in V: \varphi(e,v)=v$ where $e\in G$ is the neutral element.
• $\forall g,h \in G, v \in V: \varphi(gh,v)=\varphi(g,\varphi(h,v))$
• $\forall g \in G, v,w \in V : \varphi(g,v+w)=\varphi(g,v)+\varphi(g,w)$
• $\forall g \in G, \lambda \in K, v \in V: \varphi(g,\lambda v)=\lambda \varphi(g,v)$

Definition 1.2 In the above situation, the degree or dimension of the representation is the dimension of $V$ over $K$.

Note that the first two axioms state that a representation is a group action and the other two axioms state that for any fixed $g \in G,$ the map $V \to V, v \mapsto \varphi(g,v)$ is $K$-linear. This map is also invertible, since the axioms imply that $v \mapsto \varphi(g^{-1},v)$ is an inverse. By the second axiom, we also get that the map $G \to \mathrm{GL}(V), g \mapsto (v \mapsto \varphi(g,v))$ is a group homomorphism. So by using a currying argument, we have seen that every group representation gives rise to a group homomorphism $G \to \mathrm{GL}(V)$.

Conversely, given a group homomorphism $\rho:G \to \mathrm{GL}(V)$, we can uncurry to get a representation on $V$ by setting $\varphi(g,v) := \rho(g)(v)$.
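The currying back and forth can be illustrated with a toy example (everything here is ad hoc): $G = \mathbb{Z}/2$ acting on $K^2$ by swapping coordinates, stored as a homomorphism $\rho$ and uncurried to a two-argument map $\varphi$:

```python
# rho: Z/2 -> GL(K^2), where 1 acts by swapping the two coordinates.
rho = {0: lambda v: v, 1: lambda v: (v[1], v[0])}

def phi(g, v):
    """Uncurried representation: phi(g, v) = rho(g)(v)."""
    return rho[g](v)

v = (3, 5)
assert phi(0, v) == v                               # identity axiom
assert phi(1, phi(1, v)) == phi((1 + 1) % 2, v)     # phi(gh, v) = phi(g, phi(h, v))
```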

Thus we have another characterization, which allows us to apply concepts defined for group homomorphisms like kernel/image etc, whereas the first characterization allows us to apply notions defined for group actions.

While we’re at it, we may as well add a rephrasing in categorical language of the last characterization and get the following

Lemma 1.3  Let $K$ be a field, then for a group $G$ and a vector space $V$ over $K$, the following data are equivalent:

• A representation of $G$ on $V$ as in definition 1.1
• A group homomorphism $G \to \mathrm{GL}(V)$
• A covariant functor from $G$ considered as a one-object category to the category $K\textrm{-}\mathbf{Mod}$ of $K$-vector spaces that sends the single object in $G$ to $V$

This is the direct analog of equivalent characterizations of group actions: we can also view them as homomorphisms to the group of permutations of a set or as functors from a group to the category of sets.

(Viewing a group as a category works like this: Let $G$ be a group, then define a category $\mathcal{C}$ with a single object $*$ and set $\mathrm{Hom}_{\mathcal{C}}(*,*)=G$. Composition $\mathrm{Hom}_{\mathcal{C}}(*,*) \times \mathrm{Hom}_{\mathcal{C}}(*,*) \to \mathrm{Hom}_{\mathcal{C}}(*,*)$, i.e. $G \times G \to G$ is just group multiplication. Associativity and having an identity follows from the group axioms.)

The last characterization allows one to apply constructions from category theory, such as composing representations with other functors, but we will not use it in this post except as an alternative description.

Remark 1.4 One can also consider representations of monoids or the case where $K$ is any ring and $V$ is a module.

Remark 1.5 We have defined only left representations. Reversing the chirality in the definitions is straightforward and gives rise to the notion of right representations.

Now for some examples.

As a zeroth example, note that representations of the trivial group are just vector spaces.

Example 1.6 Let $G=\langle g \rangle$ be a finite cyclic group of order $n$ with a fixed generator $g$, then a homomorphism from $G$ to any group $H$ is determined by where it sends $g$ and we can send $g$ to precisely those elements $h \in H$ such that $h^n=1$, so a representation of $G$ on a vector space $V$ is just a linear automorphism of $V$ that satisfies this equation.
For $K=\mathbb{R}$, one can take for example a rotation matrix $\begin{pmatrix} \cos(2\pi/n) & -\sin(2\pi/n) \\ \sin(2\pi/n) & \cos(2\pi/n) \end{pmatrix}$ to define a two-dimensional representation of $G$.
For $K=\mathbb{C}$, one can take $\zeta_n=\exp(2\pi i/n) \in \mathbb{C}^\times = \mathrm{GL}_1(\mathbb{C})$ to get a one-dimensional representation of $G$. Both representations correspond to having $g^k$ act by a counterclockwise rotation by an angle of $2\pi k/n$. The former can be obtained from the latter by restricting scalars from $\mathbb{C}$ to $\mathbb{R}$.
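A quick numeric check that the rotation matrix really has order $n$, so that example 1.6 applies (the helper names are ad hoc):

```python
import math

def rot(theta):
    """2x2 rotation matrix by angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

n = 6
R = rot(2 * math.pi / n)
P = [[1, 0], [0, 1]]
for _ in range(n):
    P = mul(P, R)

# R^n should be the identity, up to floating-point error:
assert all(abs(P[i][j] - (1 if i == j else 0)) < 1e-9
           for i in range(2) for j in range(2))
```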

Example 1.7 Let $G=K$, the additive group of $K$, then $G$ acts on $V= K^2$ via transvections: $\phi\left(\lambda,\begin{pmatrix}a\\b\end{pmatrix}\right)=\begin{pmatrix}a+b\lambda\\b\end{pmatrix}$

Example 1.8 If $X$ is a $G$-set (a set equipped with an action from $G$), then we can consider a vector space $V$ such that the basis elements $e_x$ are indexed by $X$. We can then define the action on the basis elements by setting $\varphi(g,e_x)=e_{gx}$. This permutation of the basis elements extends uniquely to a linear automorphism of $V$ and we get a representation, called the permutation representation associated to the group action.
From a categorical standpoint, if we view $G$-sets as functors $G \to \mathbf{Set}$ and representations as functors $G \to K\textrm{-} \mathbf{Mod}$, then this construction is just composing a group action with the free module functor $\mathbf{Set} \to K\textrm{-} \mathbf{Mod}$. (Thus this construction is also functorial with the notion of morphisms of representations to be defined later.)
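The construction of example 1.8 can be sketched for $G = \mathbb{Z}/3$ acting on itself by addition (an ad-hoc illustration; `perm_matrix` is a made-up name): each $g$ becomes the matrix permuting the basis vectors $e_x \mapsto e_{gx}$, and the final check confirms that this really is a group homomorphism into the permutation matrices:

```python
X = [0, 1, 2]   # the G-set: Z/3 acting on itself by addition mod 3

def perm_matrix(g):
    """Matrix of e_x -> e_{g.x}: column x has a 1 in row (g + x) % 3."""
    return [[1 if row == (g + col) % 3 else 0 for col in X] for row in X]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# The construction is a homomorphism: matrix(g) * matrix(h) = matrix(g + h).
for g in X:
    for h in X:
        assert mul(perm_matrix(g), perm_matrix(h)) == perm_matrix((g + h) % 3)
```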

Example 1.9 Let $V$ be any $K$ vector space. Let $G=\mathrm{GL}(V)$. Then the identity $G \to \mathrm{GL}(V)$ defines a representation. This corresponds to the natural action of $G$ on $V$ that comes from the definition of $G$ as $K$-linear automorphisms.

Example 1.10 Generalizing the last example, all classical matrix groups such as $\mathrm{SL}_n$, $\mathrm{O}_n$, $\mathrm{U}_n$, $\mathrm{Sp}_{2n}$ etc. are defined as subgroups of some general linear group, so that the subgroup inclusion defines a representation.

Example 1.11 Let $G$ be a finite group and let $p$ be a prime number. Suppose $H$ is a normal abelian subgroup of exponent $p$. Then $H$ is a vector space over $K=\mathbb{F}_p$ and the conjugation action of $G$ on $H$ is $\mathbb{F}_p$-linear (this is automatic: any group homomorphism between vector spaces over $\mathbb{F}_p$ is $\mathbb{F}_p$-linear), and thus we obtain an $\mathbb{F}_p$-linear representation.

Example 1.12 Let $L/K$ be a Galois extension. Then $G= \mathrm{Gal}(L/K)$ acts by definition on $L$ by $K$-linear field automorphisms. We can just forget the “field automorphism” part and consider $V=L$ just as a $K$-vector space, then we get a representation $\mathrm{Gal}(L/K) \to \mathrm{GL}_K(L)$.
If $L$ and $K$ are number fields with rings of integers $\mathcal{O}_K$ and $\mathcal{O}_L$ and $\mathfrak{p}$ is a non-zero prime ideal in $\mathcal{O}_K$, then $G$ acts on $\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L$, giving a $\mathcal{O}_K/\mathfrak{p}$-linear representation.

Example 1.13 Let $G=S_n$ and let $W$ be any vector space over $K$. Then $G$ acts on the $n$-fold tensor power $V = W \otimes_K \dots \otimes_K W$ by permuting the tensor factors. Explicitly, if $w_1 \otimes w_2 \otimes \dots \otimes w_n$ is an elementary tensor, then we can define the result of the action of $\sigma \in S_n$ on it to be $w_{\sigma^{-1}(1)} \otimes w_{\sigma^{-1}(2)} \otimes \dots \otimes w_{\sigma^{-1}(n)}$ (the inverse ensures that the axioms for a left action hold: $\sigma$ moves the $i$-th factor to position $\sigma(i)$).

Example 1.14 Let $A$ be a finite-dimensional $K$-algebra, then the group of units $G=A^\times$ acts on $A$ by conjugation and this action is $K$-linear, and thus we obtain a representation of the unit group on the underlying vector space of the algebra. (If $K=\mathbb{R}$ and $A=\mathbb{H}$, then a suitable restriction of the domain and codomain of this representation gives a description of the Hopf fibration.)

Example 1.15 In the same spirit as the last example, let $G$ be an algebraic group over $K$. Let $V= \mathfrak{g}=T_e(G)$ be the Lie algebra of $G$. For any $g \in G$, the conjugation map $c_g:G \to G, a \mapsto gag^{-1}$ is a smooth automorphism of $G$, so we can take the derivative at the identity and get a linear automorphism $\mathrm{ad}(g)=D_e(c_g): V \to V$. The map $g \mapsto \mathrm{ad}(g) \in \mathrm{GL}(V)$ is a representation, called the adjoint representation of $G$. (The same construction works verbatim for Lie groups.)

Example 1.16 Let $M$ be a smooth connected manifold and let $\pi:E \to M$ be a vector bundle with a flat connection. Let $x \in M$ be a base point and set $G=\pi_1(M,x)$ and $K=\mathbb{R}$.  If we take a smooth loop $\gamma: S^1 \to M$ based at $x$, parallel transport along that loop defines an automorphism of the fibre $V=E_x$.
The flatness condition implies that this automorphism depends only on the homotopy class of $\gamma$, and by smooth approximation, every homotopy class of continuous loops may be represented by a smooth loop; thus we obtain the holonomy representation $\pi_1(M,x) \to \mathrm{GL}(E_x)$.  It turns out that this representation determines the flat bundle up to isomorphism.

As is common practice with group actions, if $\rho:G \to \mathrm{GL}(V)$ is a representation, we also write just $gv$ for $\rho(g)v$ or $g$ for the map $\rho(g)$. By further abuse of notation, we will also just call $V$ a representation of $G$ where the action is clear from the context.

Definition 1.17 If $V$ and $W$ are $K$-linear representations of a group $G$ for some field $K$, then a morphism of representations (also called intertwining operator) from $V$ to $W$ is a $K$-linear map $f:V \to W$ such that $\forall g \in G, v \in V: f(gv)=gf(v)$. (i.e. $f$ is $G$-equivariant.)

Note that if we consider representations as functors, then a morphism of representations is just a natural transformation. Indeed, for any $g \in G$, naturality with respect to $g$ as a morphism is precisely the requirement that $f(gv) = gf(v)$ for all $v$.

Example 1.17 In the situation of example 1.13, let $f \in \mathrm{GL}(W)$, then we can define $f^{\otimes n}$ by acting on each factor: $f^{\otimes n}(v_1 \otimes \dots \otimes v_n)=f(v_1) \otimes \dots \otimes f(v_n)$ for an elementary tensor. Since we act in the same way in each component, this commutes with permutation of the factors, thus $f^{\otimes n}: V \to V$ defines a morphism of the representation of $S_n$  given by permuting the factors in the tensor product.
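This equivariance can be sanity-checked numerically. The following sketch (all names are my own illustrative choices) takes $n=2$ and $W=K^2$, stores tensors in $W \otimes W$ as dictionaries indexed by pairs, and verifies both the elementary-tensor formula for $f^{\otimes 2}$ and the fact that it commutes with the swap of the two factors.

```python
from itertools import product

# W = K^2; tensors in W (x) W are dicts mapping index pairs to scalars.
f = ((1, 2), (3, 5))                 # a linear map on W, as a 2x2 matrix
w1, w2 = (1, 4), (2, 7)
T = {(i, j): w1[i] * w2[j] for i, j in product(range(2), repeat=2)}

def tensor_square(f, T):
    """(f (x) f)(T)[i, j] = sum_{a, b} f[i][a] * f[j][b] * T[a, b]."""
    return {(i, j): sum(f[i][a] * f[j][b] * T[a, b]
                        for a, b in product(range(2), repeat=2))
            for i, j in product(range(2), repeat=2)}

def swap(T):
    """The nontrivial element of S_2 acts by permuting the two factors."""
    return {(i, j): T[j, i] for i, j in product(range(2), repeat=2)}

# f (x) f sends the elementary tensor w1 (x) w2 to f(w1) (x) f(w2) ...
fw1 = tuple(sum(f[i][a] * w1[a] for a in range(2)) for i in range(2))
fw2 = tuple(sum(f[j][b] * w2[b] for b in range(2)) for j in range(2))
# ... and it commutes with the permutation action:
assert tensor_square(f, swap(T)) == swap(tensor_square(f, T))
```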

Definition 1.18 If $V$ is a representation of $G$, then a subspace $W$ that is $G$-invariant (i.e. $gW \subset W$ for all $g \in G$) defines again a representation of $G$. These subspaces are called subrepresentations of $V$.

If $W$ is a subrepresentation of $V$, then the inclusion is a morphism of representations, which gives a (quite general) family of examples for morphisms.

Example 1.19 In the situation of example 1.7, consider the subspace of $K^2$ spanned by $\begin{pmatrix}1\\0\end{pmatrix}$. This is a subrepresentation, because $\varphi\left(\lambda,\begin{pmatrix}a\\0\end{pmatrix}\right)=\begin{pmatrix}a\\0\end{pmatrix}$.

Example 1.20 Given a morphism of representations, the kernel and the image are subrepresentations of the domain and codomain, respectively.

Example 1.21 In the situation of example 1.8, suppose that $Y \subset X$ is a sub $G$-set, i.e. we have $gY \subset Y$ for all $g \in G$, then $Y$ is itself a $G$-set and if we apply the same construction to $Y$, the resulting vector space is a subspace of $V$ in a canonical way, and so also a subrepresentation. (This is a special case of the mentioned functoriality of this construction.)
If $X$ is finite, another subrepresentation is given by the span of $\sum_{x \in X} e_x$.

We now come to the first substantial theorem about representations.

Theorem 1.22 (Maschke) Let $G$ be a finite group and suppose that the order $|G|$ is invertible in $K$. Then for every finite-dimensional representation $V$ and every subrepresentation $W \leq V$, there exists another subrepresentation $C$ such that $V=W\oplus C$.

Proof By linear algebra, we can find a $K$-linear projection $\pi: V \to V$ onto $W$, i.e. a map with $\mathrm{im}(\pi)\subset W$ that is the identity on $W$. We have $V= W \oplus \mathrm{ker}(\pi)$, but of course, $\pi$ will not be a morphism of representations in general. The idea is to “average” $\pi$ to get another projection onto $W$ that is a morphism of representations.
Set $\pi'(v)=\frac{1}{|G|}\sum_{g \in G}g\pi(g^{-1}v)$ (here we use that $|G|$ is invertible in $K$). This is again $K$-linear. It is also a morphism of representations: for $h \in G$, reindexing the sum via $g \mapsto hg$ gives $\pi'(hv)=\frac{1}{|G|} \sum_{g \in G}g\pi(g^{-1}hv)=\frac{1}{|G|}\sum_{g \in G} hg\pi(g^{-1}v)=h\left(\frac{1}{|G|} \sum_{g \in G} g\pi(g^{-1}v)\right)=h\pi'(v)$. Since $W$ is a subrepresentation and $\pi$ is the identity on $W$, $\pi'$ is also the identity on $W$: for $w \in W$ we have $g^{-1}w \in W$, so every summand $g\pi(g^{-1}w)$ equals $w$, and averaging the $|G|$ summands gives back $w$ (it is crucial for this step that we divided by $|G|$). The image of $\pi'$ is also contained in $W$, so $\pi'$ is still a projection onto $W$.
Therefore, the kernel is a complement of $W$ and as $\pi'$ is a morphism of representations, the kernel is a subrepresentation.
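Here is a minimal numerical sketch of the averaging trick, for $G=S_2$ acting on $\mathbb{Q}^2$ by swapping coordinates, with $W$ the invariant line spanned by $(1,1)$; all names are illustrative choices, not notation from the post.

```python
from fractions import Fraction

# G = S_2 acting on Q^2 by swapping coordinates; W = span{(1, 1)} is
# an invariant subspace.
identity = lambda v: v
swap = lambda v: (v[1], v[0])
group = [identity, swap]          # every element of S_2 is its own inverse

def pi(v):
    """A K-linear projection onto W (along span{(0, 1)}) that is NOT
    a morphism of representations."""
    return (v[0], v[0])

def pi_avg(v):
    """The averaged projection pi'(v) = (1/|G|) sum_g g pi(g^{-1} v)."""
    s = [Fraction(0), Fraction(0)]
    for g in group:               # here g^{-1} = g for both elements
        w = g(pi(g(v)))
        s[0] += w[0]
        s[1] += w[1]
    return (s[0] / len(group), s[1] / len(group))

v = (Fraction(3), Fraction(5))
```

Starting from the non-equivariant projection $\pi$, the average lands on the equivariant projection $v \mapsto \left(\tfrac{v_1+v_2}{2}, \tfrac{v_1+v_2}{2}\right)$, which is the identity on $W$ and commutes with the swap.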

Example 1.23 To show that the assumptions in Maschke’s theorem are necessary, consider the transvection representation of the additive group of $K$ on $K^2$ described in example 1.7 and 1.19. Here $K$ acts via $\varphi\left(\lambda,\begin{pmatrix}a\\b\end{pmatrix}\right)=\begin{pmatrix}a+\lambda b\\b\end{pmatrix}$. As described in example 1.19, the subspace $W$ of vectors in $K^2$ with second component $0$ is a subrepresentation.
But this subrepresentation doesn’t have a complement that is also a subrepresentation: Indeed, if $\begin{pmatrix}a\\b \end{pmatrix}$ is any vector in $K^2$ such that $b \neq 0$, then $\begin{pmatrix}a\\b \end{pmatrix}$ and $\varphi\left(1,\begin{pmatrix}a\\b \end{pmatrix}\right)=\begin{pmatrix}a+b\\b \end{pmatrix}$ are linearly independent, as they are clearly not multiples of each other. Thus any subrepresentation that is not contained in $W$ is the whole of $V$, so $W$ doesn’t have a complement.
This serves as a counterexample in two different ways: if we take $K$ to be a finite field, it shows that the assumption that the order is invertible is necessary. If we take $K$ to be an infinite field (say of characteristic $0$), then it shows that even in characteristic $0$, the conclusion doesn’t need to hold when the group is infinite.
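The case $K=\mathbb{F}_2$ can be checked exhaustively. The following sketch (with names of my own choosing) enumerates the three one-dimensional subspaces of $\mathbb{F}_2^2$ and finds that only $W$ is invariant under the transvection action, so $W$ has no invariant complement.

```python
# The transvection action of G = (F_2, +) on F_2^2:
# phi(lam, (a, b)) = (a + lam*b, b).  We check by brute force which
# one-dimensional subspaces are subrepresentations.
P = 2

def act(lam, v):
    a, b = v
    return ((a + lam * b) % P, b)

invariant = {}
for span in [(1, 0), (0, 1), (1, 1)]:
    line = {(0, 0), span}         # the subspace spanned by `span` over F_2
    invariant[span] = all(act(lam, v) in line
                          for lam in range(P) for v in line)
```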

Tensor Products for Group Actions, Part 2

In this previous post, tensor products of $G$-sets were introduced and some basic properties were proved. This post is a continuation, so I’ll assume that you’re familiar with the contents of that post.

In this post, unless specified otherwise, $G,H,K,I$ will denote groups. Groups will sometimes be freely identified with the corresponding one-object category. Left $G$-sets will sometimes just be referred to as $G$-sets.

After some preparatory lemmas, we will derive some easy consequences of the Hom-Tensor adjunction, which was the main result of the previous post.

Lemma The category $G\textrm{-}\mathbf{Set}$ is complete and cocomplete, and limits and colimits are computed as in the category of sets (i.e. the forgetful functor $G\textrm{-}\mathbf{Set} \to \mathbf{Set}$ is continuous and cocontinuous), with the “obvious” actions from $G$. Similarly for $\mathbf{Set}\textrm{-}G$.

Proof This is not too difficult to prove directly (you can reduce the existence of (co)limits to (co)products and (co)equalizers by general nonsense), but it also follows directly from the fact that $G\textrm{-}\mathbf{Set}$ is the functor category $[G, \mathbf{Set}]$. The reason is that if $\mathcal{C}$ and $\mathcal{D}$ are categories and $\mathcal{D}$ is (co)complete (and $\mathcal{C}$ is small to avoid any set-theoretic trouble), then the functor category $[\mathcal{C},\mathcal{D}]$ is also (co)complete and the (co)limits may be computed “pointwise”. In the case of $[G, \mathbf{Set}]$, $G$ has only one object, so the (co)limits look like they do in $\mathbf{Set}$.

Lemma If $X$ is a $(G,H)$-set, $Y$ is a $(H,K)$-set and $Z$ is a $(K,I)$-set, then we have a natural isomorphism of $(G,I)$-sets
$(X \otimes_H Y) \otimes_K Z \cong X \otimes_H (Y \otimes_K Z)$

Proof The proof is the same as the proof for modules, mutatis mutandis. Use the universal property of tensor products a lot to get well-defined maps $(X \otimes_H Y) \otimes_K Z \to X \otimes_H (Y \otimes_K Z) , (x \otimes y) \otimes z \mapsto x \otimes (y \otimes z)$ and  $X \otimes_H (Y \otimes_K Z) \to (X \otimes_H Y) \otimes_K Z, x\otimes (y\otimes z) \mapsto (x \otimes y) \otimes z$.

Lemma If $H \leq G$ is a subgroup, and we regard $G$ as a $(G,H)$-set via left and right multiplication, then $\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G,-): G\textrm{-}\mathbf{Set} \to H\textrm{-}\mathbf{Set}$ is naturally isomorphic to the restriction functor $\mathrm{Res}_H^G: G\textrm{-}\mathbf{Set} \to H\textrm{-}\mathbf{Set}$ (this functor takes any $G$-set, which we may think of as a group homomorphism or a functor, and restricts it to the subgroup/subcategory given by $H$).

Proof Define a (natural) map $\varphi: \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G,X) \to \mathrm{Res}_H^G(X)$ via $\varphi(f)=f(1)$. This is $H$-equivariant, because $\varphi(hf)=(hf)(1)=f(1\cdot h)=f(h)=hf(1)=h\varphi(f)$. On the other hand, given $x \in \mathrm{Res}_H^G(X)$ (which is just $X$ as a set), we can define $f \in \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G,X)$ via $f(g)=gx$. This defines an inverse for $\varphi$.

Via the Hom-Tensor adjunction, this implies:

Corollary The restriction functor $\mathrm{Res}_H^G$ has a left adjoint $\mathrm{Ind}_H^G := G \otimes_H -$.

The notation $\mathrm{Ind}$ is chosen because we can think of this functor as an analog of the induced representation from linear representation theory, where we think of group actions as non-linear representations. (Similar to the induced representation, one can give an explicit description of $\mathrm{Ind}_H^G$ after choosing coset representatives for $G/H$ etc.)
In linear representation theory, the adjunction between restriction and induction is called Frobenius reciprocity, so if we wish to give our results fancy names (as mathematicians like to do), we can call this corollary “non-linear Frobenius reciprocity”.

If we take $H$ to be the trivial subgroup, we obtain a corollary of the corollary:

Corollary The forgetful functor $G\textrm{-}\mathbf{Set}\to \mathbf{Set}$ has a left adjoint, the “free $G$-set functor”.

Proof If $H$ is the trivial group, then $H$-sets are the same as sets and the restriction functor $G\textrm{-}\mathbf{Set}\to H\textrm{-}\mathbf{Set}$ is the same as the forgetful functor. Since $G \otimes_H -$ commutes with coproducts and every set is a coproduct of copies of the one-point set $H$, we can also describe this more explicitly: for a set $X$, we have $G \otimes_H X \cong G \otimes_H \coprod_{x \in X} H \cong \coprod_{x \in X} G \otimes_H H \cong \coprod_{x \in X} G =: G^{(X)}$.
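As a toy check of this adjunction, the following sketch (illustrative names; $G=\mathbb{Z}/3$, $X$ a two-element set) builds the free $G$-set $G^{(X)} = X \times G$ and counts equivariant maps into a target $G$-set $Y$ by brute force; the count comes out as $|Y|^{|X|}$, matching $\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G^{(X)},Y) \cong \mathrm{Hom}_{\mathbf{Set}}(X,Y)$.

```python
from itertools import product

# The free G-set on X for G = Z/3: G^(X) = X x G with h.(x, g) = (x, h+g).
G = [0, 1, 2]
X = ['a', 'b']
free = [(x, g) for x in X for g in G]

def act_free(h, p):
    x, g = p
    return (x, (h + g) % 3)

# A target G-set: G acting on itself by translation.
Y = [0, 1, 2]
def act_Y(h, y):
    return (h + y) % 3

# Brute-force count of equivariant maps G^(X) -> Y; the adjunction
# predicts |Hom_Set(X, Y)| = |Y|^|X| of them.
count = 0
for values in product(Y, repeat=len(free)):
    f = dict(zip(free, values))
    if all(f[act_free(h, p)] == act_Y(h, f[p]) for h in G for p in free):
        count += 1
```

Indeed an equivariant map out of a free $G$-set is freely determined by its values $f(x,1)$, one for each $x \in X$.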

We can also use the Hom-Tensor adjunction to get a description of some tensor products. Let $1$ denote a one-point set (simultaneously the trivial group), considered as a $(1,G)$-set with (necessarily) trivial actions.

Lemma For a $G$-set $X$, $1 \otimes_G X$ is naturally isomorphic to the set of orbits $X/G$ and both are left adjoint to the functor $\mathbf{Set} \to G\textrm{-}\mathbf{Set}$ which endows every set with a trivial $G$-action.

Proof Let $Y$ be a set and $X$ be a $G$-set. Denote by $Y^{triv}$ the $G$-set with $Y$ as its underlying set and the trivial action. If we have any $f \in \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(X,Y^{triv})$, then $f$ must be constant on the orbits, since $f(gx)=gf(x)=f(x)$, so $f$ descends to a map of sets $X/G \to Y$. Conversely, if we have any map $h: X/G \to Y$, then we can define a $G$-equivariant map $f:X \to Y^{triv}$ by setting $f(x)=h([x])$, where $[x]$ denotes the orbit of $x$. These maps are mutually inverse natural bijections, which shows that the “set of orbits” functor is left adjoint to $Y \mapsto Y^{triv}$. On the other hand, we can identify $Y^{triv}$ with $\mathrm{Hom}_{\mathbf{Set}}(1,Y)$ (where the $G$-action is induced from the trivial right $G$-action on $1$), so the left adjoint must be given by $X \mapsto 1 \otimes_G X$. Since adjoints are unique (by a Yoneda argument), we have a natural bijection $1 \otimes_G X \cong X/G$.
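Computing $X/G$ is straightforward in small cases. As an illustration (the example action and all names are my own), here are the orbits of the cyclic group generated by the permutation $(0\,1\,2)(3\,4)$ acting on a five-element set; the resulting orbit set is what the lemma identifies with $1 \otimes_G X$.

```python
# sigma = (0 1 2)(3 4) as a dict; G is the cyclic group it generates.
sigma = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}

def orbit(x):
    """The orbit of x under the cyclic group <sigma>: since G is
    generated by sigma, it suffices to iterate sigma."""
    seen = {x}
    y = sigma[x]
    while y not in seen:
        seen.add(y)
        y = sigma[y]
    return frozenset(seen)

orbits = {orbit(x) for x in sigma}
```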

The set of orbits $X/G$ carries some information about the $G$-set, but we can do a more careful construction which also includes $X/G$ in a natural way as part of the information.

Definition If $X$ is a $G$-set, then the action groupoid $X//G$ is the category with $\mathrm{Obj}(X//G) := X$ and $\mathrm{Hom}_{X//G}(x,y):= \{g \in G \mid gx=y\}$. Composition is given by $\mathrm{Hom}_{X//G}(y,z) \times\mathrm{Hom}_{X//G}(x,y) \to \mathrm{Hom}_{X//G}(x,z), (h,g) \mapsto hg$.

The fact that this is called a groupoid is not important here; one can think of that as just a name (it means that every morphism in $X//G$ is an isomorphism).
The set of isomorphism classes of $X//G$ corresponds to the set of orbits $X/G$. For $x \in X//G$, the endomorphism set $\mathrm{End}_{X//G}(x)$ is the stabilizer group $G_x$. The following lemma shows how to reconstruct a $G$-set $X$ from $X//G$, assuming that we know how all the Hom-sets lie inside $G$.
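For a concrete feel for these Hom-sets, here is a brute-force sketch (illustrative names) of the action groupoid of $S_3$ acting on $\{0,1,2\}$; since the action is transitive, every $\mathrm{Hom}(x,y)$ is a coset of a stabilizer and has the same size.

```python
from itertools import permutations

# The action groupoid of S_3 acting on X = {0, 1, 2}: objects are the
# points of X, and Hom(x, y) = {g in G : g.x = y}.
X = range(3)
G = list(permutations(X))            # g acts by x -> g[x]

def hom(x, y):
    return [g for g in G if g[x] == y]

# End(x) = Hom(x, x) is the stabilizer of x; here each has order 2.
stabilizers = {x: hom(x, x) for x in X}
```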

Lemma (“reconstruction lemma”) If $X$ is a $G$-set, then we define the functor $G: (X//G)^{op} \to G\textrm{-}\mathbf{Set}$ with $G(x)=G$ for all $x \in X//G$ and for $g \in \mathrm{Hom}_{(X//G)}(x,y)$, we define the map $G(g): G(y) \to G(x)$ via $a \mapsto ag$. Then we have $\varinjlim\limits_{x \in (X//G)^{op}}G(x) \cong X$

Proof For $x \in X//G$, define a map $G(x)=G \to X$ via $g \mapsto gx$. This defines a cocone over $G(.)$, so we get an induced map $\varphi: \varinjlim\limits_{x \in (X//G)^{op}}G(x) \to X$. The colimit $\varinjlim\limits_{x \in (X//G)^{op}}G(x)$ can be described explicitly as $\coprod_{x \in (X//G)^{op}} G(x)/\sim$, where the equivalence relation $\sim$ is generated by $ag \in G(x) \sim a \in G(gx)$. To see that $\varphi$ is surjective, note that $x \in X$ is the image of $1 \in G(x)$. To see that $\varphi$ is injective, suppose $g \in G(x)$ and $h \in G(y)$ are sent to the same element, i.e. $gx=hy$; then we have $(h^{-1}g)x=y$, so that we may assume $h=1$. Then $gx=y$ implies that $1 \in G(y)=G(gx) \sim g \in G(x)$, so the two elements which map to the same element are already equal in $\varinjlim\limits_{x \in (X//G)^{op}}G(x)$.

The previous lemma can be thought of as a generalization of the orbit-stabilizer theorem.  (The proof has strong similarities as well.) For illustration, let us derive the usual orbit-stabilizer theorem from it.

Lemma Let $X$ be a $G$-set, then we have an isomorphism of $G$-sets $G/G_x \cong Gx$, where $Gx$ is the orbit of $x$ (with the restricted action) and $G/G_x$ is the coset space of the stabilizer subgroup with left multiplication as the action.

Proof We may replace $X$ with $Gx$ so that we have a transitive action. Then the previous lemma gives us an isomorphism $X \cong \varinjlim\limits_{x \in (X//G)^{op}}G(x)$.
Consider the one-object category $(G_x)$. This can be identified with a full subcategory of $X//G$ corresponding to the object $x$. Because we have a transitive action, all objects in $X//G$ are isomorphic (isomorphism classes correspond to orbits), so that the inclusion functor $G_x \to X//G$ is also essentially surjective, so it is a category equivalence.
We may thus replace the colimit by the colimit $\varinjlim\limits_{x \in (G_x)^{op}}G(x)$. As $(G_x)^{op}$ has just one object, this colimit is a colimit over a bunch of parallel morphisms $G(x) \to G(x)$, so it is the simultaneous coequalizer of these morphisms. We know how to compute coequalizers in $G\textrm{-}\mathbf{Set}$: the same way that we compute coequalizers in $\mathbf{Set}$. So we have the family of maps $\cdot g: G(x)=G \to G, a \mapsto ag$, where $g$ varies over $G_x$. The coequalizer is the quotient $G/\sim$, where $\sim$ is generated by $a \sim ag$ for each $a \in G$ and $g \in G_x$. But this is exactly the equivalence relation that defines $G/G_x$.
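As a numerical sanity check (illustrative names), take $G=S_3$ acting on the $2$-element subsets of $\{0,1,2\}$: the cosets of the stabilizer match up with the orbit, so $|G| = |Gx|\cdot|G_x|$.

```python
from itertools import permutations

# Orbit-stabilizer for G = S_3 acting on 2-element subsets of {0, 1, 2};
# the map g G_x -> g.x identifies the coset space G/G_x with the orbit Gx.
G = list(permutations(range(3)))

def act(g, s):                       # g . {i, j} = {g(i), g(j)}
    return frozenset(g[i] for i in s)

x = frozenset({0, 1})
orbit = {act(g, x) for g in G}
stab = [g for g in G if act(g, x) == x]

def compose(g, h):                   # (g o h)(i) = g(h(i))
    return tuple(g[h[i]] for i in range(3))

# The cosets g G_x, each represented as a frozenset of group elements.
cosets = {frozenset(compose(g, h) for h in stab) for g in G}
```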

There is another case where the colimit takes a simple form after replacing $X//G$ with an equivalent category.

Lemma A $G$-set $X$ is free in the sense that it lies in the essential image of the “free $G$-set functor” $Y \mapsto G \otimes_{1} Y$ (equivalently, it is a coproduct of copies of $G$ with the standard action) iff the action of $G$ on $X$ is free in the sense that $\forall x \in X \forall g \in G: (gx=x \Rightarrow g=1)$.

Proof It’s clear that if we have a disjoint union $X= \coprod_{i \in I} G$, then no element of $G$ other than $1$ can fix an element of $X$. For the other direction, suppose that the condition $\forall x \in X \forall g \in G: (gx=x \Rightarrow g=1)$ holds. This implies that the morphism sets in the action groupoid are really small: suppose $g,h \in \mathrm{Hom}(x,y)$, so that $gx=y=hx$; this implies $h^{-1}gx=x$, so $h^{-1}g=1$ by assumption, thus $h=g$. This means that for any pair of objects in $X//G$, there is at most one morphism between them. Now consider the set $X/G$ as a discrete category (i.e. the only morphisms are the identities); choosing a representative for each orbit defines an inclusion of categories $X/G \to X//G$. As elements of $X/G$ represent isomorphism classes in $X//G$, this inclusion is always essentially surjective. By our computation of the Hom-sets, it is also fully faithful when the action of $G$ on $X$ is free. So if we apply the “reconstruction lemma”, we get $X \cong \varinjlim\limits_{x \in (X//G)^{op}}G(x) \cong \varinjlim\limits_{x \in X/G} G$. But a colimit over a discrete category is just a coproduct, so this is isomorphic to $\coprod_{x \in X/G} G$, which shows that $X$ is free.
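The two notions of freeness can be compared directly on a small example (illustrative names): $G=\mathbb{Z}/3$ translating the first coordinate of $X = \mathbb{Z}/3 \times \{a,b\}$ acts freely, and $X$ visibly decomposes into one copy of $G$ per orbit.

```python
# G = Z/3 acting on X = Z/3 x {'a', 'b'} by translation in the first
# coordinate.
G = [0, 1, 2]
X = [(i, t) for i in G for t in 'ab']

def act(g, p):
    i, t = p
    return ((g + i) % 3, t)

# gx = x forces g = 0 (the identity, written additively), so the action
# is free in the pointwise sense ...
is_free = all(g == 0 for g in G for p in X if act(g, p) == p)
# ... and X is a disjoint union of one copy of G per orbit.
orbits = {frozenset(act(g, p) for g in G) for p in X}
```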

After some further lemmas, we will come to the main result of this post, which is also an application of the reconstruction lemma.

In the previous post, I described $G$-sets in different ways, among them as functors $G \to \mathbf{Set}$, but I didn’t do the same for $(G,H)$-sets. The following lemma remedies this deficiency.

Lemma $(G,H)$-sets may be identified with left $G \times H^{op}$-sets or with functors $G \to \mathbf{Set}\textrm{-}H$ or with functors $H^{op} \to G\textrm{-}\mathbf{Set}$. In other words, we have equivalences of categories $G\textrm{-}\mathbf{Set}\textrm{-}H \cong G\times H^{op}\textrm{-}\mathbf{Set} \cong [G,\mathbf{Set}\textrm{-}H] \cong [H^{op},G\textrm{-}\mathbf{Set}]$.

The proof of this lemma is a lot of rewriting of definitions, not more difficult than proving the corresponding statements for one-sided $G$-sets.

This lemma has a useful consequence, which one could also verify by hand:

Observation If $F: G\textrm{-}\mathbf{Set} \to H\textrm{-}\mathbf{Set}$ is a functor and $X$ is a $(G,K)$-set, then $F(X)$ is an $(H,K)$-set in a “natural” way.

Proof Think of $X$ as a functor $X:K^{op} \to G\textrm{-}\mathbf{Set}$; composing with $F$ gives us a functor $F(X): K^{op} \to H\textrm{-}\mathbf{Set}$, which we may also think of as an $(H,K)$-set.
More explicitly, the action of $K$ on $F(X)$ can be described as follows: for $k \in K$, the right-multiplication map $X \to X, x \mapsto xk$ is left $G$-equivariant, so it induces a left $H$-equivariant map $F(X) \to F(X)$, and we define the action of $k$ on $F(X)$ via this map.

The following lemma is an analog of the classical Eilenberg-Watts theorem from homological algebra, which describes colimit-preserving functors $R\textrm{-}\mathbf{Mod} \to S\textrm{-}\mathbf{Mod}$ as tensor products with an $(S,R)$-bimodule.

Theorem (Eilenberg-Watts theorem for group actions) Every colimit-preserving functor $F: G\textrm{-}\mathbf{Set} \to H\textrm{-}\mathbf{Set}$ is naturally isomorphic to $X \otimes_G -$ for an $(H,G)$-set $X$. One can explicitly choose $X = F(G)$ (with the $(H,G)$-set structure from the previous observation, as $G$ is a $(G,G)$-set).

Proof Let $X$ be a $G$-set and $Y$ be a $H$-set, then we have a natural bijection $\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G) \otimes_G X, Y) \cong \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(X,\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y))$
Using the reconstruction lemma, we get $\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(X,\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)) \cong \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(\varinjlim\limits_{x \in (X//G)^{op}}G(x),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)) \cong \varprojlim\limits_{x \in (X//G)^{op}}\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G(x),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y))$
For every $x \in (X//G)^{op}$, $G(x)=G$, so $\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G(x),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)) \cong \mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)$ via the map $f \mapsto f(1)$. We need to consider how this identification behaves under the morphisms involved in the colimit. For $g \in G$, we have the map $G(gx) \to G(x), a \mapsto ag$; this induces a map $\varphi_g: \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G(x),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)) \to \mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G(gx),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y))$ given by $\varphi_g(f)(h)=f(hg)$. If we make the identification described above by evaluating both sides at $1$, we get $\varphi_g(f)(1)=f(1g)=gf(1)$. Using the definition of the $G$-action on the Hom-set $\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)$, this left multiplication translates to right multiplication on $F(G)$. Because of the construction of the right $G$-action on $F(G)$, this right multiplication is the map that is induced from right multiplication $G \to G$. We may summarize this computation by stating that $\varprojlim\limits_{x \in (X//G)^{op}}\mathrm{Hom}_{G\textrm{-}\mathbf{Set}}(G(x),\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G),Y)) \cong \varprojlim\limits_{x \in (X//G)^{op}}\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G(x)),Y)$
Using the assumption that $F$ preserves colimits, we get $\varprojlim\limits_{x \in (X//G)^{op}}\mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(G(x)),Y) \cong \mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(\varinjlim\limits_{x \in (X//G)^{op}}F(G(x)),Y) \cong \mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(\varinjlim\limits_{x \in (X//G)^{op}} G(x)),Y) \cong \mathrm{Hom}_{H\textrm{-}\mathbf{Set}}(F(X),Y)$ where we used the reconstruction lemma again in the last step.
We conclude $F(X) \cong F(G) \otimes_G X$ by the Yoneda lemma.

This theorem (like the classical Eilenberg-Watts theorem) is remarkable not only because it gives a concrete description of every colimit-preserving functor between certain categories, but also because it shows that such a functor is completely determined by the image of the single object $G$ and by how the functor acts on the endomorphisms of that object (which are precisely the right multiplications).

It’s natural to ask at this point when two functors of the form $X \otimes_G -$ and $Y \otimes_G -$ for $(H,G)$-sets $X$ and $Y$ are naturally isomorphic. It’s not difficult to see that it is sufficient that $X$ and $Y$ are isomorphic as $(H,G)$-sets. The following lemma shows that this is also necessary, among other things.

Lemma For $(H,G)$-sets $X$ and $Y$, every natural transformation $\eta:X \otimes_G - \to Y \otimes_G -$ is induced by a unique $(H,G)$-equivariant map $f: X \to Y$.

Proof Assume we have a natural transformation $\eta_A: X \otimes_G A \to Y \otimes_G A$; then we have in particular a left $H$-equivariant map $\eta_G: X \otimes_G G \to Y \otimes_G G$. We have $X \otimes_G G \cong X$ and $Y \otimes_G G \cong Y$, so this gives us an $H$-equivariant map $X \to Y$ which I call $f$. Clearly $f$ is uniquely determined by this construction. For a fixed $g \in G$, right multiplication by $g$ defines a left $G$-equivariant map $G \to G$. Under the isomorphism $X \otimes_G G \cong X$, these maps describe the right $G$-action on $X$. Naturality with respect to these maps implies that $f$ is right $G$-equivariant.

This lemma allows a reformulation of the previous theorem.

Theorem (Eilenberg-Watts theorem for group actions, alternative version)
The following bicategories are equivalent:
– The bicategory where the objects are groups, 1-morphisms between two groups $G, H$ are $(G,H)$-sets $X$, where the composition of 1-morphisms is given by taking tensor products and 2-morphisms between two $(G,H)$-sets are given by $(G,H)$-equivariant maps.
– The 2-subcategory of the 2-category of categories $\mathbf{Cat}$ where the objects are all the categories $G\textrm{-}\mathbf{Set}$ for groups $G$, 1-morphisms are colimit-preserving functors $G\textrm{-}\mathbf{Set} \to H\textrm{-}\mathbf{Set}$ and 2-morphisms are natural transformations between such functors.

This concludes my second blog post. If you want, please share or leave comments below.