Semisimplicity and Representations, Part 2

This entry is a direct continuation of this post and is part of a larger series on representation theory that starts with that post. In this blog post, we continue our study of semisimple rings and bring it to a conclusion by proving a classification. Then we apply this insight into the structure of semisimple algebras to the group algebra to get more information on irreducible representations.

Endomorphism Rings of Semisimple Modules

The goal behind the following lemmas and corollaries is to compute the endomorphism ring of a semisimple module that is a finite direct sum of simple modules, the motivation being that this applies to finite-dimensional representations in the semisimple case (cf. 3.20) and to a semisimple ring as a module over itself (cf. 3.24).
The idea behind the proof is simply to “multiply out” the hom set that gives us endomorphisms by pulling out (finite) direct sums from both arguments and then use Schur’s lemma to understand the hom sets between simple modules.

Definition 4.1 For a ring R and a natural number n, M_n(R) denotes the matrix ring with entries in R with the usual formulas for addition and multiplication.

The following lemma is a generalization of a well-known fact from linear algebra:

Lemma 4.2 Let S be a right module over a ring R, then \mathrm{End}_R(S^n) \cong M_n(\mathrm{End}_R(S)).

Proof Let's index the direct summands in S^n = \bigoplus_{i=1}^n S_i (so S_i=S for all i, but we're keeping track of which component it is).
Note that as abelian groups, the isomorphism is clear, since we have \mathrm{End}_R(S^n) = \mathrm{Hom}_R( \bigoplus_{i=1}^n S_i,  \bigoplus_{j=1}^n S_j) = \bigoplus_{i=1}^n\bigoplus_{j=1}^n \mathrm{Hom}_R(S_i,S_j)
Now each summand satisfies \mathrm{Hom}_R(S_i,S_j)=\mathrm{End}_R(S), at least as an abelian group, which we can use to identify elements in \bigoplus_{i=1}^n\bigoplus_{j=1}^n \mathrm{Hom}_R(S_i,S_j) with elements in M_n(\mathrm{End}_R(S)) by sending the component living in \mathrm{Hom}_R(S_i,S_j) to the entry at the position (i,j). The verification that this respects multiplication amounts to the same calculation that shows that composition of linear maps corresponds to matrix multiplication.
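Since the proof is mostly bookkeeping, a numerical sanity check may help. Here is a minimal sketch (in Python with numpy, for the toy case where R = K is a field, S = K^2 and n = 3, so that \mathrm{End}_R(S) = M_2(K); the helper name block is made up for this illustration, and the indexing follows the usual column-vector convention, which mirrors the one in the lemma): an endomorphism of S^3 is a 6 \times 6 matrix, and composition of endomorphisms is exactly matrix multiplication of the corresponding 3 \times 3 block matrices, where "multiplying entries" means composing maps.

```python
# Sanity check of Lemma 4.2 in the toy case S = K^2, n = 3: an endomorphism
# of S^3 = K^6 is a 3x3 matrix of 2x2 blocks, and composing endomorphisms
# multiplies the block matrices (product of entries = composition of maps).
import numpy as np

n, d = 3, 2  # n direct summands, S = K^d
rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(n * d, n * d))  # an endomorphism of S^n
B = rng.integers(-3, 4, size=(n * d, n * d))  # another one

def block(M, i, j):
    """The (i, j) block of M, i.e. its component between the summands i and j."""
    return M[i * d:(i + 1) * d, j * d:(j + 1) * d]

for i in range(n):
    for j in range(n):
        expected = sum(block(A, i, k) @ block(B, k, j) for k in range(n))
        assert np.array_equal(block(A @ B, i, j), expected)
print("composition of endomorphisms = matrix multiplication of blocks")
```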

This gives a description of the endomorphism rings of finite direct sums of simple modules, where each simple submodule is in the same isomorphism class. To properly “multiply out” our endomorphism rings, we just need another application of Schur’s lemma:

Lemma 4.3 Let M and N be modules over a ring R and suppose that \mathrm{Hom}_R(M,N) = 0 = \mathrm{Hom}_R(N,M), then we have as rings \mathrm{End}_R(M \oplus N) \cong \mathrm{End}_R(M) \times \mathrm{End}_R(N)
If R is a K-algebra for some field K, then this is an isomorphism of K-algebras.

Proof We always have an injection \mathrm{End}_R(M) \times \mathrm{End}_R(N) \to \mathrm{End}_R(M \oplus N) given by (\varphi,\psi) \mapsto \varphi \oplus \psi. Here \varphi \oplus \psi means that we apply \varphi on the first and \psi on the second summand. Since addition and composition of such endomorphisms can be carried out component-wise, this is a ring homomorphism, and if everything is a K-vector space, it is also K-linear. The image of that map is the set of endomorphisms of M \oplus N that map M and N to themselves. Given our conditions, if we have any endomorphism of M \oplus N, restricting it to M and composing with the projection M \oplus N \to N gives a homomorphism M \to N, which is zero by assumption. This shows that the endomorphism maps M to itself, and by the same argument N, thus the map is surjective.

Corollary 4.4 Let M be a right module over a ring R that is a finite direct sum of simple submodules M=\bigoplus_{i=1}^k S_i^{n_i} such that the S_i are pairwise non-isomorphic. Let D_i=\mathrm{End}_R(S_i) be the endomorphism rings, then \mathrm{End}_R(M) \cong \prod_{i=1}^k M_{n_i}(D_i).

Proof By Schur's lemma, there are no nonzero homomorphisms between the summands S_i^{n_i} and S_j^{n_j} for i \neq j, since the S_i are pairwise non-isomorphic, so we can apply 4.3 inductively to get that \mathrm{End}_R(M) \cong \prod_{i=1}^k \mathrm{End}_R(S_i^{n_i}). Now apply 4.2 to each factor.

Using this, we get a partial converse to Schur’s lemma:

Corollary 4.5 Let M be a module that is a finite sum of simple submodules, then M is simple if and only if the endomorphism ring of M is a division ring.

Proof One direction is just Schur’s lemma. The other direction follows from 4.4 by noting that the only way that \prod_{i=1}^k M_{n_i}(D_i) is a division ring is if k=1=n_1.

Enter the Matrix Ring

Corollary 4.4 already gives us a description of endomorphism rings of modules which are finite sums of simple submodules. In this description, matrix rings over division algebras appear. This means that to deepen our understanding of those endomorphism rings, we should study matrix rings.

We begin by studying their ideals:

Lemma 4.6 Let R be a ring and n \in \Bbb N, then the map from two-sided ideals of R to two-sided ideals of M_n(R), sending I to M_n(I) is a bijection. (Here M_n(I) is the subset of all elements in M_n(R) where each entry is contained in I)

Proof It's clear that M_n(I) is a two-sided ideal in M_n(R) when I is a two-sided ideal in R. It's also clear that the map I \mapsto M_n(I) is injective (we can recover I from M_n(I) just by looking at the set of elements of R that appear in the entries of the matrices in M_n(I)).
For surjectivity, let E_{ij} be the matrix that is zero everywhere except at (i,j), where it is 1. Then if J \subset M_n(R) is a two-sided ideal and A \in J, we compute that E_{11}AE_{11} is the matrix that agrees with A at the place (1,1) and is zero everywhere else. A calculation shows that the set of elements in R that appear as the entry at the place (1,1) of some matrix in J is a two-sided ideal I in R. Then J\supset M_n(I), because we can permute the entries of a matrix that has only one non-zero entry by multiplying from the left and right by permutation matrices (and then by taking sums, we get that every element in M_n(I) is contained in J).
On the other hand, J \subset M_n(I), because we can first multiply a matrix from the left and right by permutation matrices and then multiply by E_{11} from the left and right to see that we could have also defined I as the set of elements which appear as any matrix entry for elements in J. Thus J \subset M_n(I).
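Here is a quick sympy illustration (assuming sympy is available) of the two manipulations used in this proof, for n = 2: multiplying by E_{11} from both sides cuts out the (1,1) entry, and conjugating by a permutation matrix moves a single nonzero entry to another position.

```python
from sympy import Matrix, symbols

a, b, c, d = symbols('a b c d')
A = Matrix([[a, b], [c, d]])

E11 = Matrix([[1, 0], [0, 0]])
print(E11 * A * E11)            # Matrix([[a, 0], [0, 0]])

P = Matrix([[0, 1], [1, 0]])    # permutation matrix for the transposition (1 2)
print(P * (E11 * A * E11) * P)  # Matrix([[0, 0], [0, a]]): the entry moved to (2,2)
```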

This lemma implies in particular that if D is a division ring, then M_n(D) has no proper non-zero two-sided ideals. We can use this together with another property that holds in general for finite-dimensional modules:

Lemma 4.7 Let M be a module over a ring R and suppose that R contains a division ring D such that M is a finite-dimensional D-vector space. (We don't require that R is a D-algebra, i.e. that D is commutative and in the center of R.) Then M contains a simple submodule.

Proof Start with any non-zero submodule of M, e.g. M itself. If M is not simple, choose a proper non-zero submodule of M; if that submodule is not simple, choose a proper non-zero submodule of it, etc.
Since the dimension over D has to decrease at every step, this process has to terminate, which gives us a simple submodule.

Lemma 4.8 Let R be a ring that has no proper nonzero two-sided ideals and that has a minimal right ideal I. Then R is semisimple and every simple module over R is isomorphic to I.

Proof Let R be such a ring and let I be a minimal right ideal, then we can make a two-sided ideal out of I by taking the product RI=\sum_{r \in R} rI.
Since R doesn't contain proper nonzero two-sided ideals and RI \neq 0, we get that R=RI=\sum_{r \in R} rI. For each fixed r, rI is the image of the simple module I under the right R-linear map x \mapsto rx, so it is either isomorphic to I (and hence simple) or zero by Schur's lemma.
The implication (2.) implies (3.) in 3.14 (compare the proof, or the remark after the proof) gives us that there is a subset T \subset R such that R=\bigoplus_{r \in T} rI. Thus R is semisimple. 3.25 implies that every simple module is isomorphic to a direct summand in the sum \bigoplus_{r \in T} rI, but every such summand rI is isomorphic to I, which implies that every simple module is isomorphic to I.

A natural question is how to recover R from M_n(R). The following lemma provides this, if we also have the module R^n:

Lemma 4.9 Let R be a ring and consider R^n as the space of row vectors, which is a right M_n(R)-module, then we get that \mathrm{End}_{M_n(R)}(R^n)=R. If we think of \mathrm{End}_{M_n(R)}(R^n) as a subring of \mathrm{End}_{R}(R^n)=M_n(R), then \mathrm{End}_{M_n(R)}(R^n) is given by all scalar multiples of the identity.

Proof Let E_{i,j} be the matrix that has zeroes everywhere except a 1 at (i,j), and let e_i be the vector in R^n that has zeroes everywhere except a one in the i-th component. For any \sigma \in S_n let P_\sigma be the permutation matrix associated to \sigma, such that applying that matrix corresponds to permuting the entries with \sigma.
Then for any \varphi \in \mathrm{End}_{M_n(R)}(R^n), we get that \varphi(e_1)=\varphi(e_1 E_{1,1})=\varphi(e_1)E_{1,1}. Now multiplying with E_{1,1} corresponds to making all entries zero except the first entry, which is left untouched. Thus \varphi(e_1) is a multiple of e_1, so we can find a unique \lambda \in R such that \varphi(e_1)=\lambda e_1.
For every i let \tau_{1,i} \in S_n be the transposition that switches 1 and i, then we have \varphi(e_i)=\varphi(e_1 P_{\tau_{1,i}})=\varphi(e_1) P_{\tau_{1,i}} = \lambda e_1 P_{\tau_{1,i}} = \lambda e_i.
Since the e_i form a basis, we have seen that \varphi is just scalar multiplication (from the left) with \lambda, or as a matrix \lambda \cdot \mathrm{Id}, where \mathrm{Id} denotes the identity matrix.
This proves \mathrm{End}_{M_n(R)}(R^n)=R.
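When R = K is a field, an endomorphism of the row space K^n is in particular K-linear, so it is given by a matrix T, and M_n(K)-linearity means exactly that T commutes with all of M_n(K). The lemma then predicts that the only such T are the scalar multiples of the identity, which we can confirm by brute force with sympy (a sketch for n = 3; it suffices to impose commutation with the matrix units E_{ij}, since they span M_n(K)):

```python
from sympy import Matrix, symbols, solve, zeros

n = 3
ts = symbols('t0:9')            # the nine unknown entries of T
T = Matrix(n, n, ts)

equations = []
for i in range(n):
    for j in range(n):
        E = zeros(n, n)
        E[i, j] = 1             # the matrix unit E_ij
        equations.extend(list(T * E - E * T))

# solution: all off-diagonal entries are 0 and all diagonal entries are
# equal, i.e. T is a scalar multiple of the identity
print(solve(equations, ts))
```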

Putting the last few lemmas together, we obtain the main result on modules over matrix rings over a division algebra:

Lemma 4.10 Let D be a division algebra and n \in \mathbb{N}, then M_n(D) is semisimple, the unique simple right module over M_n(D) is D^n, and \mathrm{End}_{M_n(D)}(D^n) = D. (Here we're regarding D^n as row vectors so that the right M_n(D)-action makes sense.)

Proof 4.6 implies that M_n(D) has no nonzero proper two-sided ideals and 4.7 implies that it has a simple submodule, so that 4.8 applies and M_n(D) is semisimple with only one isomorphism class of simple modules.
The statement that D^n is a simple M_n(D)-module is just linear algebra:
Given any non-zero vector v in D^n and any w \in D^n, there's a linear transformation (i.e. a matrix) that sends v to w. Thus D^n is a simple module over M_n(D), as every non-zero submodule is the whole module. The part about the endomorphism ring follows from 4.9.

Now we come to the main theorem about semisimple rings that classifies them and also relates their ring-theoretic properties to their modules.

Theorem 4.11 (Artin-Wedderburn) A ring R is semisimple iff there exist natural numbers k, n_1, \dots, n_k and division rings D_1, \dots, D_k such that \displaystyle R \cong \prod_{i=1}^k M_{n_i}(D_i).
Here k is the number of simple modules M_1, \dots, M_k up to isomorphism, D_i is the endomorphism ring of M_i and \mathrm{dim}_{D_i}(M_i)=n_i. In particular, k is uniquely determined and the n_i and D_i are unique up to isomorphism and permutation of the factors.

Proof Note that for every ring R, considered as a right module over itself, we have an isomorphism of rings R \cong \mathrm{End}_R(R) by applying 4.9 with n=1. Let R=\bigoplus_{i=1}^k M_i^{n_i} be a decomposition of R into simple right submodules such that the M_i are pairwise non-isomorphic. Let D_i=\mathrm{End}_R(M_i).
By 4.4, we have an isomorphism \mathrm{End}_R(R) \cong \prod_{i=1}^k M_{n_i}(D_i). Here k is uniquely determined as the number of simple modules over R up to isomorphism. The D_i together with the n_i are uniquely determined as the endomorphism rings of the simple modules and the dimensions of the simple modules over their endomorphism rings (cf. lemma 4.10 for modules over matrix rings over a division algebra and 2.5 for modules over a product).
For the reverse direction, note that matrix rings over division rings are semisimple by 4.10 and 3.15 implies that finite products of semisimple rings are semisimple.
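To see the theorem in action: for the group algebra \Bbb{C}[S_3], the Artin-Wedderburn decomposition is \Bbb{C} \times \Bbb{C} \times M_2(\Bbb{C}) (coming from the trivial, sign and standard two-dimensional representations). A plain-Python sanity check (no libraries; the helper functions are ad hoc for this illustration) computes the conjugacy classes by brute force and verifies the dimension count from 3.30:

```python
# C[S_3] = C x C x M_2(C): three simple modules of dimensions 1, 1, 2,
# and (cf. 4.24 below) the number of factors equals the number of
# conjugacy classes of S_3.
from itertools import permutations

G = list(permutations(range(3)))            # S_3 as permutation tuples

def compose(p, q):                          # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    return tuple(sorted(range(3), key=lambda i: p[i]))

classes = {frozenset(compose(compose(h, g), inverse(h)) for h in G) for g in G}
print(len(classes))                         # 3 conjugacy classes

dims = [1, 1, 2]                            # dimensions of the irreducibles
assert sum(d * d for d in dims) == len(G)   # 1 + 1 + 4 = 6 = |S_3|
```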

After having finally proved the main theorem for semisimple rings, we can give lots of applications:

Corollary 4.12 Semisimplicity for rings is left-right symmetric, i.e. if R is semisimple, then so is the opposite ring R^{op}.

Proof Using 4.11, it's enough to show that M_n(D)^{op}\cong M_n(D^{op}), since the opposite ring of a division ring is a division ring as well.
An isomorphism M_n(D)^{op} \to M_n(D^{op}) is given by transposition A \mapsto A^{T}.

Corollary 4.13 Let R be a semisimple ring, then the following are equivalent:

  1. R is commutative
  2. For all simple R-modules M, the endomorphism ring D=\mathrm{End}_R(M) is a field and \mathrm{dim}_D(M)=1.

Proof We apply 4.11: \prod_{i=1}^k M_{n_i}(D_i) is commutative iff all n_i are 1 and all D_i are commutative, i.e. fields.

Corollary 4.14 Let G be a finite abelian group, then every irreducible real representation of G is at most 2-dimensional.

Proof Since \mathbb{C} is the algebraic closure of \mathbb{R} and has dimension 2 over \mathbb{R}, all finite field extensions of \mathbb{R} have dimension at most 2. As G is abelian, \Bbb{R}[G] is commutative, so by 4.13 all simple \Bbb{R}[G]-modules are one-dimensional over their endomorphism rings, which are finite field extensions of \mathbb{R} by 3.27 and 4.13. Thus the endomorphism rings are at most two-dimensional, so all simple \Bbb{R}[G]-modules have dimension at most 2 over \mathbb{R}.

Corollary 4.15 Let G be a finite group and K be an algebraically closed field such that the characteristic of K doesn’t divide the order of G, then the following are equivalent:

  1. G is abelian
  2. All irreducible representations of G are one-dimensional
  3. The number of irreducible representations is equal to the order of G

Proof The equivalence of (1) and (2) follows from 4.13: by 3.27 and 3.29, the endomorphism ring of any simple K[G]-module is K, so (2) holds iff every simple K[G]-module is one-dimensional over its endomorphism ring, which by 4.13 is equivalent to K[G] being commutative, i.e. to G being abelian.
The equivalence of (2) and (3) follows from 3.30.

Reminder For a group G, the abelianization G^{ab} is the largest commutative quotient of G. Explicitly, it is given by the quotient by the subgroup generated by all commutators. It has the universal property that every morphism from G to an abelian group factors uniquely through the map G \to G^{ab}.

Corollary 4.16 Let G be a finite group and K be an algebraically closed field such that the characteristic of K doesn't divide the order of G, then the number of one-dimensional representations of G up to isomorphism is equal to the order of the abelianization |G^{ab}|.

Proof One-dimensional representations are homomorphisms G \to \mathrm{GL}_1(K) = K^\times.
Since K^\times is commutative, these correspond to homomorphisms G^{ab} \to \mathrm{GL}_1(K), i.e. one-dimensional representations of G^{ab}. One-dimensional representations are automatically irreducible and 4.15 tells us that since G^{ab} is abelian, there are exactly |G^{ab}| up to isomorphism.
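For G = S_3, we can confirm this by brute force (plain Python again, with the same ad hoc helpers as in the earlier snippet): the subgroup generated by all commutators turns out to be A_3, so |G^{ab}| = 2, matching the two one-dimensional representations, the trivial one and the sign.

```python
from itertools import permutations

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    return tuple(sorted(range(3), key=lambda i: p[i]))

# the subgroup generated by all commutators g h g^-1 h^-1
commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g in G for h in G}
subgroup = set(commutators)
while True:                                 # close under multiplication
    new = {compose(a, b) for a in subgroup for b in subgroup} - subgroup
    if not new:
        break
    subgroup |= new

print(len(G) // len(subgroup))              # |G^ab| = 6 / 3 = 2
```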

The Center of the Group Algebra

The last corollary gives a partial answer to the question of the number of irreducible representations of a group, by characterizing the number of one-dimensional representations in terms of the group theory of G. In this section, we give a group-theoretic characterization of the number of all irreducible representations (given suitable assumptions on the field).

Definition 4.17 If R is a ring, then the center Z(R) is defined as Z(R)=\{z \in R \mid \forall r \in R: zr=rz \}. It is a commutative subring of R.

Lemma 4.18 If R is a ring and n \in \mathbb{N}, then Z(M_n(R)) consists of the diagonal matrices whose diagonal entries are all equal to a common value that lies in Z(R). So we have an isomorphism of rings Z(R) \cong Z(M_n(R)).

Proof We regard R^n as row vectors, which is naturally a right M_n(R)-module. Then for z \in Z(M_n(R)), the map R^n \to R^n, v \mapsto vz is M_n(R)-linear, because for M \in M_n(R), we have vMz=vzM.
By 4.9 z is a diagonal matrix where all diagonal entries are the same. Clearly the diagonal entry must lie in Z(R), as z commutes with other such diagonal matrices.

Lemma 4.19 If R and S are rings, then Z(R \times S)=Z(R) \times Z(S).

Proof This is an easy computation. Multiplication in R \times S is defined component-wise.

Corollary 4.20 If R is a semisimple ring with the Artin-Wedderburn decomposition R \cong \prod_{i=1}^k M_{n_i}(D_i), then Z(R) \cong \prod_{i=1}^k Z(D_i).

Corollary 4.21 If A is a semisimple algebra over a field K, then \mathrm{dim}_K(Z(A)) \geq k where k is the number of simple A-modules up to isomorphism. We have equality if K is algebraically closed.

Proof By 4.11, the number of simple A-modules up to isomorphism is k if A= \prod_{i=1}^k M_{n_i}(D_i). By 4.20, we have Z(A) \cong \prod_{i=1}^k Z(D_i). Each factor has dimension at least 1, which shows the inequality.
If K is algebraically closed, then D_i=K for all i by 3.29, which shows that equality holds.

Corollary 4.22 If A is a semisimple real algebra and k is the number of simple A-modules up to isomorphism, then 2k \geq \mathrm{dim}_{\mathbb{R}}(Z(A)) \geq k.

Proof Argue as in the proof of 4.21 and use the fact that Z(D), where D is a finite-dimensional division algebra over \mathbb{R}, has to be isomorphic to \mathbb{R} or \mathbb{C}, since it is a finite field extension of \mathbb{R}, so the dimension is at most two.

These corollaries relate the number of simple modules of a semisimple algebra to the dimension of its center. Thus the next thing to do is to study the center of group algebras.

Reminder If G is a group, then for g \in G, the conjugacy class of g consists of all elements in G that are conjugate to g. In other words, if we consider the action G \times G \to G (h,g) \mapsto hgh^{-1}, then this is the orbit of g under this action. The different conjugacy classes are the minimal subsets of G that are closed under conjugation and form a partition of G.

Lemma 4.23 Let K be a field and let G be a group (we don’t need it to be finite). Then Z(K[G]) has a K-basis given by the set of elements of the form \sum_{g \in C} g where C varies over all conjugacy classes of G with finitely many elements.

Proof Since K[G] is a K-algebra with a K-basis given by G, we can check if an element is in the center just by checking if it commutes with every element of G. Let x=\sum_{g \in G} \lambda_g g be an element of K[G] (so all but finitely many \lambda_g are zero). Then x is in the center of K[G] if and only if for all h \in G, we have xh=hx \Leftrightarrow x=h^{-1}xh.
Writing out x, this equation becomes \sum_{g \in G} \lambda_g g = h^{-1}(\sum_{g \in G} \lambda_g g)h =\sum_{g \in G} \lambda_g h^{-1}gh=\sum_{g \in G} \lambda_{hgh^{-1}} g. Here we used that g \mapsto hgh^{-1} is a bijection.
Comparing coefficients, this is equivalent to \lambda_g = \lambda_{hgh^{-1}} for all g.
If this holds for all g, then the (finite) set X of elements g such that \lambda_g \neq 0 is closed under conjugation, and if two elements inside X are conjugate, their coefficients are equal. Take a system of representatives for X/G, where G acts on X by conjugation. Then by the above considerations, we get that x=\sum_{g \in G} \lambda_g g=\sum_{g \in X} \lambda_g g= \sum_{g \in X/G} \lambda_g \sum_{h \in C(g)} h, where C(g) is the conjugacy class of g. This shows that the set of elements of the form \sum_{g \in C} g, where C is a finite conjugacy class, is a generating system for Z(K[G]). It is linearly independent because distinct conjugacy classes are disjoint.
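The lemma is easy to check by machine in small cases. In the following plain-Python sketch (for K = \Bbb{Q} and G = S_3, with the same ad hoc helpers as before), group algebra elements are dictionaries mapping group elements to coefficients, the product is the convolution from definition 2.1, and every conjugacy-class sum indeed commutes with every group element:

```python
from itertools import permutations
from collections import defaultdict

G = list(permutations(range(3)))

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    return tuple(sorted(range(3), key=lambda i: p[i]))

def multiply(x, y):                  # the product in K[G]
    result = defaultdict(int)
    for g, a in x.items():
        for h, b in y.items():
            result[compose(g, h)] += a * b
    return dict(result)

classes = {frozenset(compose(compose(h, g), inverse(h)) for h in G) for g in G}
for C in classes:
    class_sum = {g: 1 for g in C}
    for h in G:
        assert multiply(class_sum, {h: 1}) == multiply({h: 1}, class_sum)
print("every class sum is central")
```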

Theorem 4.24 Let G be a finite group and K a field such that the characteristic of K doesn't divide the order of G. Let k be the number of irreducible representations, up to isomorphism, and let c be the number of conjugacy classes of G. Then k \leq c, with equality if K is algebraically closed. If K=\mathbb{R}, then we have k \leq c \leq 2k.

Proof By 4.23 c is the dimension of the center of K[G]. Now apply 4.21 and 4.22.

Thus we have obtained a relation between the number of irreducible representations and the element structure of G by heavily employing the structure theory for semisimple rings. This is a nice illustration of the power of ring-theoretic tools in representation theory.

Semisimplicity and Representations, Part 1

This post is the third one in a series on representation theory. The previous posts are this one and that one (in this order). The nature of this post is mostly ring-theoretic, but we will give applications to representation theory throughout the development of the general theory.

Semisimple Modules

Under suitable assumptions on G and K, Maschke's theorem (1.22) tells us that any submodule of a K[G]-module is a direct summand, i.e. we can find a complement.
One can try to apply this repeatedly to decompose a K[G]-module into smaller submodules. If the dimension is finite, then at some point we have to end up with a direct sum of modules that don’t have a non-zero proper submodule. This is because if one direct summand had a non-zero proper submodule, we could just decompose it further by Maschke’s theorem. The assumption of finite dimension implies that this process has to terminate, as the dimension of the summands decreases every time we decompose something.
This motivates the following definition to give a name to the modules we obtained as summands in the end:

Definition 3.1 A non-zero module over a ring is called simple if it doesn’t have a proper non-zero submodule.

Example 3.2 If K is a field, or more generally a division ring, then a vector space over K is simple iff it is one-dimensional.

Example 3.3 If we consider modules over \mathbb{Z}, i.e. abelian groups, then simple modules are just simple abelian groups. It’s known that simple abelian groups are the groups that are cyclic of prime order: \Bbb{Z}/p\Bbb{Z}.

Example 3.4 If K is a field, and we consider K[X]-modules, i.e. K-vector spaces equipped with a choice of endomorphism A (cf. the first section of the last entry), then a module is simple iff it doesn't have a non-zero proper A-invariant subspace. One can show that this is equivalent to being isomorphic to K[X]/(f) for some irreducible f \in K[X]. In particular, if K is algebraically closed, then simple K[X]-modules are precisely the one-dimensional ones. (Where the endomorphism necessarily acts by scalar multiplication.) This means that over an algebraically closed field, an endomorphism of a finite-dimensional vector space is diagonalizable if and only if the associated K[X]-module is a direct sum of simple modules. (And in general, the associated K[X]-module is a direct sum of simple modules if and only if the endomorphism is diagonalizable over an algebraic closure.) We will encounter the condition of being a direct sum of simple modules later in this post.
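This characterization is easy to experiment with in sympy (assuming it is available; is_diagonalizable checks diagonalizability over the complex numbers by default):

```python
from sympy import Matrix

A = Matrix([[1, 1], [0, 1]])    # Jordan block: the module C[X]/((X-1)^2)
B = Matrix([[0, -1], [1, 0]])   # rotation by 90 degrees, eigenvalues i and -i

print(A.is_diagonalizable())    # False: the C[X]-module is not a sum of simples
print(B.is_diagonalizable())    # True over C, so the associated module is semisimple
```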

Generalizing the last two examples, we have the following result:

Lemma 3.5 If R is any ring and \mathfrak{m} is a maximal left ideal, then R/\mathfrak{m} is a simple module. Conversely, every simple module is of that form.

Proof If \mathfrak{m} is a maximal left ideal, then submodules of R/\mathfrak{m} correspond to submodules of R containing \mathfrak{m}, so R/\mathfrak{m} is simple by definition of \mathfrak{m} being maximal.
Conversely, if M is a simple module and m \in M is nonzero, then Rm is a non-zero submodule of M, so M=Rm. This means the map R \to M, r \mapsto rm is surjective, so we get an isomorphism R/I \cong M for some proper left ideal I. If I is not maximal, then there's a proper left ideal strictly containing I, which corresponds to a proper non-zero submodule of M, which is impossible.

Example/Definition 3.6 If K is a field and G is a group, then similar to example 3.4, simple K[G]-modules are representations with no non-zero proper G-invariant subspace. These are called irreducible representations.

Example 3.7 The representations of a cyclic group of order n corresponding to irreducible factors of X^n-1 that we have constructed in 2.6 are irreducible. The reason is that they're irreducible as K[X]-modules, where the action of X corresponds to the action of a generator of the group. (cf. 3.4 and the proof of 2.6)

We have seen in 3.5 that simple modules are generated by one element; let's give this property a label (generalizing the notion of cyclic groups):

Definition 3.8 Modules that are generated by a single element are called cyclic modules.

By our considerations in the beginning of the section, we see that when Maschke’s theorem applies and we have a finite-dimensional representation, it decomposes as a direct sum of irreducible subrepresentations. The purpose of the following lemmas is to generalize this. (Because we work without any finiteness conditions, we will need some form of the axiom of choice. If one is only interested in modules that satisfy some finiteness condition (e.g. finite-dimensional modules for an algebra over a field), then the dependence on choice can be eliminated and the arguments are much easier.)

Definition 3.9 A module M over a ring is called semisimple if every submodule N \leq M has a complement, i.e. there exists a submodule N' \leq M such that M=N\oplus N'.

Lemma 3.10 Submodules and quotients of a semisimple module are semisimple.

Proof If M is semisimple and M/N is a quotient, then for a submodule \overline{K} \leq M/N, we can take the preimage under the projection M \to M/N to get a submodule K \leq M that projects onto \overline{K}. Then the image under the projection of a complement of K will be a complement for \overline{K}. If N \leq M is a submodule, then we can find a complement N', but then M/N' \cong N, so that N is a quotient of M and the previous case applies.

We will need the following result for an important property of semisimple modules. Most readers will probably be familiar with this, at least in the commutative case:

Lemma 3.11 Let R be a ring. Then every proper left ideal is contained in a maximal left ideal.

Proof Let I be a proper left ideal and let \mathcal{P} be the set of all proper left ideals containing I. If we have an ascending chain (I_{i})_{i \in \Omega}, where I_i \in \mathcal{P}, then \displaystyle \cup_{i \in \Omega} I_i is an upper bound. This is a proper ideal, because if it weren't, some I_i would contain 1, which is impossible. So Zorn's lemma applies and we get a maximal element in \mathcal{P}.

Corollary 3.12 Every non-zero cyclic module contains a maximal submodule.

Proof Any non-zero cyclic module is of the form R/I where I is a proper left ideal. Now apply 3.11 to I. We get a maximal left ideal \mathfrak{m} containing I. Then \mathfrak{m}/I is a maximal submodule of R/I.

Lemma 3.13 Any non-zero semisimple module contains a simple submodule.

Proof Let M be a semisimple module over a ring R. As, by 3.10, submodules of semisimple modules are semisimple, it suffices to treat the case where M is cyclic (replace M by Rm for some non-zero m \in M). In that case, M contains a maximal submodule N \leq M by 3.12.
As M is semisimple, we can find a submodule S \leq M such that M = N \oplus S.
If S is not simple, then there is a non-zero proper submodule S' \subsetneq S, but then N \subsetneq N \oplus S' \subsetneq N \oplus S = M, which contradicts the maximality of N.

We now come to the main result on semisimple modules, the proof is a little technical.
The most important part of the statement for us is the implication (1) implies (2) (cf. 3.16), but we give the full result for completeness.

Proposition 3.14 For a module M, the following statements are equivalent:

  1. M is semisimple
  2. M is a sum of simple submodules
  3. M is a direct sum of simple submodules

Proof
(1.) implies (2.): Let M be semisimple and let \mathrm{soc}(M) be the sum of all simple submodules, then as M is semisimple, we get that M=\mathrm{soc}(M) \oplus N for some N \leq M.
If N is non-zero, we get that N contains a simple submodule by 3.10 and 3.13, but this contradicts the definition of \mathrm{soc}(M) and the fact that \mathrm{soc}(M) \cap N = 0.

(2.) implies (3.): Suppose M = \sum_{i \in I} M_i where all M_i are simple.
Consider the set of subsets J \subset I such that \sum_{i \in J}M_i = \bigoplus_{i \in J}M_i. This is partially ordered by inclusion and the usual Zorn's lemma argument works (just take unions of chains as upper bounds), so we get a maximal element J_{\omega}. Suppose that \bigoplus_{i \in J_\omega} M_i = \sum_{i \in J_\omega} M_i \subsetneq \sum_{i \in I} M_i = M, then for some i_0 \in I, we get that M_{i_0} \not \subset \bigoplus_{i \in J_\omega} M_i, which implies that M_{i_0} \cap \bigoplus_{i \in J_\omega} M_i = 0, since M_{i_0} is simple and that intersection is a proper submodule. But then we get
M_{i_0} + \bigoplus_{i \in J_\omega} M_i = M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i, which contradicts the maximality of J_\omega.

(3.) implies (1.): Let M = \bigoplus_{i\in I} M_i with all M_i simple. Let N \leq M be a submodule. We may assume that N is a proper submodule.
Consider the set of subsets J \subset I such that N \cap \bigoplus_{i \in J} M_i = 0. This set is non-empty: as N is a proper submodule, there is some M_i not contained in N, and then N \cap M_i = 0, as M_i is simple.
Now we apply (surprisingly!) Zorn’s lemma to this set, partially ordered by inclusion by taking unions as upper bounds for chains. Let J_\omega be a maximal element.
Then consider N+\bigoplus_{i \in J_\omega} M_i = N \oplus \bigoplus_{i \in J_\omega} M_i. If this is a proper submodule of M, then some M_{i_0} with i_0 \in I is not contained in it, and since M_{i_0} is simple, M_{i_0} \cap (N \oplus \bigoplus_{i \in J_\omega} M_i) = 0. It follows that N \cap M_{i_0} = 0, i_0 \not \in J_\omega and M_{i_0} \cap M_i = 0 for all i \in J_\omega, thus M_{i_0} + \bigoplus_{i \in J_\omega}M_i= M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i, so that by maximality of J_\omega, we get N \cap (M_{i_0} \oplus \bigoplus_{i \in J_\omega} M_i) \neq 0, so we can choose n non-zero in that intersection. Write n=m+m' for some m \in M_{i_0} and m' \in \bigoplus_{i \in J_\omega} M_i.
Then m=n-m' is contained in M_{i_0} \cap (N \oplus \bigoplus_{i \in J_\omega} M_i), which is zero by the choice of i_0.
Thus n=m' is a non-zero element of N \cap \bigoplus_{i \in J_\omega} M_i = 0, which is impossible, thus M=N \oplus \bigoplus_{i \in J_\omega} M_i.

Note that the proof for the implication from (2) to (3) actually shows that if a module is a sum of simple submodules, one can find a subset of the index set such that the sum is direct and still gives the whole module.

Corollary 3.15 Direct sums of semisimple modules are semisimple.

Proof Use the equivalence between (1) and (3) in 3.14.

Corollary 3.16 If G is a finite group and K is a field such that the characteristic of K doesn’t divide the order of G, then any representation of G over K is a direct sum of irreducible subrepresentations.

Proof Follows from 1.22 and 3.14.

We have already seen an instance of this phenomenon in our study of cyclic groups in the semisimple case. (cf. 2.6 and 3.7)

After this not-quite-simple proposition about semisimple modules, we return to simple properties of simple modules.

Let’s first record an observation, so that the statements we’re about to prove make sense:

Lemma 3.17 Let R be any ring and let M and N be modules over R, then \mathrm{End}_R(M) and \mathrm{End}_R(N) are rings and if R is a K-algebra, they are also K-algebras.
\mathrm{Hom}_R(M,N) is a right module over \mathrm{End}_R(M) (via precomposition) and a left module over \mathrm{End}_R(N) (via postcomposition) and these actions are compatible, i.e. \mathrm{Hom}_R(M,N) is an (\mathrm{End}_R(N),\mathrm{End}_R(M))-bimodule.

Proof The statement might look complicated, but all we're doing here is just composing maps: \mathrm{End}_R(M) is a ring (or K-algebra) under composition of maps, and the module structures on \mathrm{Hom}_R(M,N) are given by composing with endomorphisms from the left or the right. All properties we need follow from properties of composing linear maps.

Lemma 3.18 (Schur) Let M and N be modules over a ring R and let f:M \to N be a linear map, then:

  1. If M is simple, then f is either zero or injective.
  2. If N is simple, then f is either zero or surjective.
  3. If both M and N are simple, then f is either zero or an isomorphism.
  4. If M is simple, then \mathrm{End}_R(M) is a division ring. (cf. 3.17)

Proof
(1): As M is simple, \mathrm{ker}(f) is either M or 0.
(2): As N is simple, \mathrm{im}(f) is either N or 0.
(3): Follows from (1) and (2).
(4): Follows from (3).

Despite the easy proof, Schur’s lemma is quite useful and will be a constant companion while dealing with simple modules. We give a first application.

Lemma 3.19 Let M be a semisimple module that is a finite direct sum of simple submodules, and write M \cong \bigoplus_{i=1}^n M_i^{e_i} where the M_i are pairwise non-isomorphic. For every i, set D_i=\mathrm{End}_R(M_i). Then we have an equality e_i = \mathrm{dim}_{D_i}(\mathrm{Hom}_R(M_i,M))= \mathrm{dim}_{D_i}(\mathrm{Hom}_R(M,M_i)). In particular, the exponent e_i is independent of the decomposition, so the decomposition is unique up to isomorphism and permutation of the factors.

Proof  \mathrm{Hom}_R (M_i,M) =\mathrm{Hom}_R(M_i,\bigoplus_{j=1}^n M_j^{e_j}) \cong \bigoplus_{j=1}^n \mathrm{Hom}_R(M_i, M_j)^{e_j}
Note that this isomorphism is D_i-linear, because the action of D_i is given by composition in the first argument.
Schur's lemma implies \mathrm{Hom}_R(M_i,M_j) = 0 unless i=j, so we get \bigoplus_{j=1}^n \mathrm{Hom}_R(M_i, M_j)^{e_j} \cong \mathrm{Hom}_R(M_i, M_i)^{e_i}= D_i^{e_i}. The case with switched arguments works in the same way.

Corollary 3.20 Let G be a finite group and let K be a field such that the characteristic of K does not divide the order of G, then every finite-dimensional representation can be written as a direct sum of irreducible subrepresentations which are uniquely determined up to isomorphism, including their multiplicity.

Proof Existence follows from 3.16 and uniqueness from 3.19.

The last corollary justifies why one pays a lot of attention to irreducible representations, especially when Maschke’s theorem applies.

Semisimple Rings and Algebras

So far, we have just studied (semi)simple modules. A general philosophy in ring theory is to study relations between the internal structure of a ring and the structure of its modules. Whenever there’s a notion for modules, one possible definition for a ring-theoretic property is obtained by just considering a ring as a left, right or two-sided module over itself. (For technical reasons, we will work with right ideals and modules in this section. It will allow us to skip some passage from a ring to its opposite ring in a future post. One can dualize all statements by using that left R-modules are right R^{op}-modules, where R^{op} is the opposite ring which has reversed order of multiplication. Note that the group algebra K[G] is isomorphic to its own opposite ring, via the map given on the basis G by g \mapsto g^{-1}.)

If we apply this to the properties we've been studying, we get that a ring that is simple as a right module over itself is just a division ring. The way to see this is that every non-zero element must generate the whole ring as a right ideal (and by elementary group theory, it's enough that every non-zero element has a right inverse). We already have a name for that, so that's nothing new. This doesn't happen with the following definition:

Definition 3.21 A ring R is called semisimple if it is semisimple as a right module over itself.

We're deliberately not being careful with the chirality here: Theoretically, one should define left and right semisimple, but as we shall see, they are equivalent.

We can apply the theory we have developed for semisimple modules to show how this property is reflected in the structure of the modules over a ring:

Lemma 3.22 A ring R is semisimple if and only if all right modules over R are semisimple.

Proof One direction is obvious. For the other one, note that if R is semisimple, 3.15 implies that all direct sums of copies of R, i.e. all free modules, are semisimple. By 3.10, this also shows that all quotients of free modules are semisimple. But every module is a quotient of a free module.

Remarkably, this tells us that it would have been sufficient to prove Maschke's theorem for just one single representation, the one corresponding to the K[G]-module K[G], to get decompositions into irreducible representations. (Even infinite-dimensional ones.)

Lemma 3.23 If a ring is a direct sum of non-zero right ideals, then the sum is finite.

Proof Suppose R=\bigoplus_{i \in I} J_i, then we have 1=(a_i)_{i \in I} where all but finitely many a_i are zero. Let I' \subset I be the subset of I consisting of the indices i such that a_i \neq 0. Then for any r \in R, we have r= 1 \cdot r = \sum_{i \in I'} a_ir. Because the sum is direct, this expression is the unique way to write r as a sum of elements in the J_i where i ranges over I. In particular, every element of R has zero component in J_i for i \notin I'; since we assumed that all J_i are non-zero, this implies I=I', so that I is finite.

Corollary 3.24 A semisimple ring is a finite direct sum of simple right R-modules (also called minimal right ideals in this case.)

Proof Apply 3.14 and then 3.23.

Corollary 3.25 Let R be a semisimple ring, then every simple right R-module M_i occurs as a direct summand of R (as a right R-module over itself) and the multiplicity is equal to the dimension of M_i over its endomorphism ring (which is a division ring). In particular, that dimension is finite.

Proof Note that 3.24 implies that 3.19 is applicable to R (by which we always mean as a right module over itself in this proof).
Let e_i be the multiplicity with which M_i occurs in the decomposition of R as a direct sum of simple submodules. By 3.19, e_i is independent of the decomposition, but it might a priori be zero. However, 3.19 tells us that e_i=\mathrm{dim}_{D_i}(\mathrm{Hom}_R(R,M_i))=\mathrm{dim}_{D_i}(M_i), which also tells us two things:
1) The RHS is finite
2) The LHS is non-zero, as M_i \neq 0.

We want to apply this to the case where R is an algebra over a field K, but for this it would be nice to know that the D_i are finite-dimensional over K. We need some easy results on finiteness conditions.

Lemma 3.26 Let K be a field and let M and N be modules over a finite-dimensional K-algebra A. If M and N are finitely generated over A, then they are finite-dimensional over K and so is \mathrm{Hom}_A(M,N).

Proof M being finitely generated means that we can find an A-linear surjection A^n \to M. As A is a K-algebra, this surjection is also K-linear. A^n is finite-dimensional over K because A is, and this implies that M is finite-dimensional over K. If M is finitely generated, let S be a finite generating system, then the map \mathrm{Hom}_A(M,N) \to N^S, f \mapsto (f(s))_{s \in S} is K-linear. It is also injective, because any map from M is determined by where it sends the generating system S. N^S is a finite-dimensional vector space by the previous part, thus \mathrm{Hom}_A(M,N) is finite-dimensional.

Corollary 3.27 Let A be a finite-dimensional algebra over a field K, then all simple modules and their endomorphism rings are finite-dimensional over K.

Lemma 3.28 Let A be a semisimple algebra over a field K and let M_1, \dots, M_n be a list of all simple modules, up to isomorphism. Let D_i=\mathrm{End}_A(M_i) be their endomorphism rings. Then \mathrm{dim}_K(A)=\sum_{i=1}^n \mathrm{dim}_K(M_i)^2/\mathrm{dim}_K(D_i) (where all dimensions are finite.)

Proof 3.25 implies that A \cong \bigoplus_{i=1}^n M_i^{e_i} where e_i=\mathrm{dim}_{D_i}(M_i); this implies that \mathrm{dim}_K(A)= \sum_{i=1}^n \mathrm{dim}_K(M_i)\mathrm{dim}_{D_i}(M_i). (3.27 tells us that we don't have to worry about infinite dimensions.)
So the only thing left to show is that \mathrm{dim}_K(D_i) \mathrm{dim}_{D_i}(M_i)=\mathrm{dim}_K(M_i). But this is clear: we have M_i \cong D_i^{e_i} as D_i-vector spaces, where e_i=\mathrm{dim}_{D_i}(M_i), so we just compare the K-dimension of both sides.
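As a quick sanity check of the formula (using the description of group algebras of cyclic groups, cf. 2.4 and 2.6): for A=\Bbb{R}[G] with G cyclic of order 3, we have \Bbb{R}[G] \cong \Bbb{R}[X]/(X^3-1) \cong \Bbb{R} \times \Bbb{C}, so there are two simple modules, namely \Bbb{R} with endomorphism ring \Bbb{R} and \Bbb{C} (two-dimensional over \Bbb{R}) with endomorphism ring \Bbb{C}, and indeed \mathrm{dim}_K(A) = 1^2/1 + 2^2/2 = 3 = |G|.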

The following lemma tells us that we can leave out the factors \mathrm{dim}_K(D_i) if K is algebraically closed.

Lemma 3.29 If K is an algebraically closed field, then every finite-dimensional division algebra over K is one-dimensional, i.e. K itself.

Proof Let D be a finite-dimensional division algebra over K.
Let d \in D, then consider the K-subalgebra K[d] generated by d. Every element in K[d] is a polynomial in d, so K[d] is a quotient of the polynomial ring K[X] via the evaluation map ev_d: X \mapsto d. But then \mathrm{ker}(ev_d) is a non-zero prime ideal, as the image is finite-dimensional over K and doesn't contain zero divisors. Since K is algebraically closed, every irreducible polynomial is linear, so this implies \mathrm{ker}(ev_d)=(X-\lambda) for some \lambda \in K, so d=\lambda \in K.

We note the following corollary to 3.28 and 3.29:

Corollary 3.30 Let A be a semisimple algebra over a field K and let M_1, \dots, M_n be a list of all simple modules, up to isomorphism. Then \mathrm{dim}_K(A)\leq \sum_{i=1}^n \mathrm{dim}_K(M_i)^2, and we have equality if K is algebraically closed.

If we put in some knowledge about finite-dimensional division algebras over \Bbb R (namely, the fact that the only ones are \Bbb R, \Bbb C, \Bbb H, so the dimension is at most 4), we also get the following:

Corollary 3.31 Let A be a semisimple algebra over \Bbb R and let M_1, \dots, M_n be a list of all simple modules, up to isomorphism. Then \frac{1}{4} \sum_{i=1}^n \mathrm{dim}_\Bbb{R}(M_i)^2 \leq \mathrm{dim}_\Bbb{R}(A)\leq \sum_{i=1}^n \mathrm{dim}_\Bbb{R}(M_i)^2.

These corollaries translate into statements about representations when we apply them to the group algebra K[G].

Let’s close this post by recapitulating what we have shown about representations in the case where G is finite and the characteristic of K doesn’t divide the order of G:

  • There are finitely many irreducible representations up to isomorphism (which are all finite-dimensional)
  • Every irreducible representation occurs as a direct summand of the so-called “regular representation”, which is the representation corresponding to K[G] as a module over itself.
  • Every representation, even an infinite-dimensional one, is a direct sum of irreducible subrepresentations.
  • We know that for finite-dimensional representations the decomposition into a direct sum of irreducibles is unique up to isomorphism of the factors, including multiplicities. (I didn’t want to deal with cardinals for the infinite-dimensional case)
  • We have a nice formula that relates the dimension of irreducibles, the dimension of their endomorphism rings and the order of G (which is obviously the dimension of K[G]). If we don’t want to talk about the endomorphism rings, we still have an inequality, which is an equality in the algebraically closed case.

In the next post, we will continue our study of semisimple rings and give applications, e.g. by describing the number of irreducible representations in terms of the group G.

The Group Algebra: Another Perspective on Representations

In this post, we introduce the group algebra as another way to view representations and illustrate the usefulness of this approach by studying representations of cyclic groups by elementary ring theory. This is the second part of a series that started with this post. The numbering is consecutive, i.e. when I refer to some result 1.x, then this is written in that post.

A Review of Modules in Linear Algebra

We will begin by reviewing what modules over the polynomial ring K[X] mean in terms of linear algebra, as this will be helpful for motivating the module-theoretic perspective on representations.

Let K be a field and V be a (say finite-dimensional) vector space over K and let A be a K-linear endomorphism of V (so after choosing a basis, we can think of A as a square matrix.) Suppose we wish to understand A, e.g. find a basis such that A has a particularly nice matrix representation with respect to that basis.

From the pair (V,A), we can define a K[X]-module structure on V by defining K[X]-scalar multiplication via (\sum_{i=0}^n \lambda_i X^i)v= \sum_{i=0}^n \lambda_i A^i(v), where A^i(v) means that we apply A to v i times.
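In code, this scalar multiplication is a short loop (a numpy sketch; the function name act is made up for this illustration):

```python
import numpy as np

def act(coeffs, A, v):
    """Action of the polynomial sum(coeffs[i] * X**i) on the vector v."""
    result = np.zeros_like(v, dtype=float)
    power = v.astype(float)          # A^0 v = v
    for c in coeffs:
        result += c * power          # add lambda_i * A^i v
        power = A @ power
    return result

A = np.array([[0, -1], [1, 0]])      # rotation by 90 degrees
v = np.array([1, 0])
print(act([1, 0, 1], A, v))          # (1 + X^2) v = v + A^2 v = v - v = 0
```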

Conversely, given a K[X]-module M, we can think of it as a K-vector space V=M by restricting the scalar multiplication to K \subset K[X]. We also get a K-linear endomorphism of V given by multiplication with X.

These constructions are inverse to each other: Going from a pair (V,A) to the associated K[X]-module, multiplication with X is precisely the endomorphism we started with.
For a K[X]-module, because every polynomial is a linear combination of powers of X, we only need to know K-scalar multiplication and how X acts to reconstruct the K[X]-scalar multiplication.

Thus, we can think of pairs (V,A) of vector spaces equipped with an endomorphism as K[X]-modules, and we can translate between notions for endomorphisms and notions for K[X]-modules.

Here’s an excerpt of a possible dictionary one might use for translation:

Pairs (V,A) of vector spaces and endomorphisms ↔ K[X]-modules M:
  • Subspaces W that are invariant under A, i.e. A(W) \subset W ↔ K[X]-submodules
  • For pairs (V,A), (W,B), a K-linear map f:V \to W such that f \circ A = B \circ f ↔ K[X]-linear maps
  • For pairs (V,A), (W,B), a K-linear isomorphism f:V \to W such that A =f^{-1} \circ B\circ f ↔ K[X]-linear isomorphisms
  • The eigenspace of A associated to \lambda ↔ the submodule of M consisting of all elements annihilated by X-\lambda
  • The minimal polynomial of A ↔ the unique monic generator of the annihilator ideal associated to M, i.e. the unique monic polynomial P \in K[X] of minimal possible degree such that P(v)=0 for all v \in V

One could add many more rows.

The important part for finding e.g. a nice basis for A is the third row: if we can find a module that is isomorphic to the K[X]-module associated to (V,A), together with a basis in which multiplication with X has a nice matrix representation, then the third row tells us that this is also a possible matrix representation for A!

Now as K[X] is a PID, finitely generated modules over K[X] (in particular those modules that are finite-dimensional over K) are very well-understood. There's a structure theorem that tells us that they are finite direct sums of modules of the form K[X]/(f), where f \in K[X]. (One can also put some conditions on f to make them unique.) From there, one can easily deduce existence and uniqueness of canonical forms such as the Jordan normal form and the Frobenius normal form, and also properties of the minimal polynomial such as Cayley-Hamilton.
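For instance, sympy computes the Jordan normal form directly, which in module language exhibits the decomposition into summands of the form K[X]/((X-\lambda)^k) (a sketch, assuming sympy is available):

```python
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])
P, J = A.jordan_form()   # A = P * J * P**-1
print(J)                 # a 2x2 block for eigenvalue 2 and a 1x1 block for 3:
                         # the module is C[X]/((X-2)^2) + C[X]/(X-3)
```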

The Group Algebra

Let G be a group and V be a representation of G over a field K. By abuse of notation, we denote the action of an element g \in G on a vector v by gv. Let \lambda \in K, g \in G and v \in V. There seems to be only one sensible way to define what we mean by (\lambda+g)v: clearly, this has to be \lambda v + gv if any sensible rules hold.
But what is the expression \lambda+g supposed to mean? After all, we can’t just add an element of a group and an element of a field.

Or can we?

Definition 2.1 Let G be a group (we denote the neutral element by 1) and K be a field, then the group algebra K[G] is defined as the vector space over K freely generated by the elements of G. We denote the elements of the basis corresponding to elements in G by the same symbols. Group multiplication G \times G \to G defines a multiplication of basis elements that we extend linearly in each argument. This defines a multiplication K[G] \times K[G] \to K[G] that extends the multiplication of G and makes K[G] into a K-algebra.

We leave the verification of the ring axioms to the reader. Intuitively, the group algebra K[G] consists of finite formal linear combinations of elements in G, i.e. we can write them as \sum_{g\in G} \lambda_g g where all but finitely many coefficients vanish. To compute a product of two such expressions, we expand using distributivity and then use the group multiplication to multiply the products of the basis vectors.
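To make the multiplication rule concrete, here is a tiny plain-Python sketch for G = \Bbb{Z}/2 (written additively) over \Bbb{Q}, with elements of K[G] represented as coefficient dictionaries; it exhibits the computation (1+g)(1-g) = 1 - g^2 = 0, so the group algebra has zero divisors:

```python
from collections import defaultdict

def multiply(x, y, op):
    result = defaultdict(int)
    for g, a in x.items():
        for h, b in y.items():
            result[op(g, h)] += a * b    # expand by distributivity
    return {g: c for g, c in result.items() if c != 0}

op = lambda g, h: (g + h) % 2    # the group Z/2
x = {0: 1, 1: 1}                 # 1 + g
y = {0: 1, 1: -1}                # 1 - g
print(multiply(x, y, op))        # {} : the zero element of K[G]
```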

Lemma 2.2 For every representation V of G over K, there is a unique way to define a K[G]-module structure that extends the given group action and K-scalar multiplication. Conversely, every K[G]-module gives rise to a representation in a canonical way. Using this identification, a morphism of representations corresponds precisely to a K[G]-linear map.

Proof Given a representation V, v \in V and an element \sum_{g \in G}\lambda_g g \in K[G] (this means the sum is finite), the only possible way to define (\sum_{g \in G}\lambda_g g)v such that the module axioms hold and K \subset K[G] acts via the given scalar multiplication and G \subset K[G] acts via the group action is to set (\sum_{g \in G}\lambda_g g)v=\sum_{g \in G} \lambda_g(g(v)).
Note that the RHS is just defined in terms of the group action and the vector space structure. One checks that this defines a module structure, using the linearity of the group action.
Conversely, if we have a K[G]-module V, we can turn V into a K-vector space by restricting scalars to the subring K \subset K[G].
We can also restrict the scalar multiplication to the subset G \subset K[G]. Associativity and unitality in the axioms for a module imply that this defines a group action.
Finally, the group action and the vector space operations are compatible due to the equality g\lambda = \lambda g in K[G] and associativity and distributivity of the scalar multiplication.
The correspondence of morphisms of representations and K[G]-linear maps follows by similar arguments: the idea is that every element in K[G] is a K-linear combination of elements in G, so it's enough that a map commutes with K-scalar multiplication and the G-action to see that it preserves K[G]-scalar multiplication.

The relation between the description of linear-algebraic objects as K[X]-modules and representations as K[G]-modules is as follows:
In the former, we looked at any endomorphism without any condition, but only at one endomorphism at a time, that’s why the K-algebra of choice to describe such an object is the polynomial algebra K[X] which is generated freely by one element X, i.e. we don’t impose any relation.
For group representations, we consider many endomorphisms (actually automorphisms) at once, subject to all the relations that hold in the group G. That’s why K[G] isn’t necessarily generated by one element and by inheriting the multiplication from G, K[G] also inherits all the relations between elements in G.

With lemma 2.2, we have added another characterization of representations to our collection (cf. lemma 1.3).
If one is really careful with the constructions in the lemma, one sees that it defines an isomorphism of categories, which is just a formalization of the intuition that representations and K[G]-modules are exactly the same, just with a different point of view.

Let's also mention the universal property of the group algebra, which can be quite useful even if you're not a category aficionado, and which implies lemma 2.2 as a special case.

Lemma 2.3 Let K be a field and G be a group, then for any K-algebra A and every group homomorphism \varphi:G \to A^\times to the group of units, there is a unique K-algebra homomorphism K[\varphi]:K[G] \to A that extends \varphi.

Proof The same argument as in lemma 2.2 applies: The only way we can define an extension that is K-linear is by sending \sum_{g \in G} \lambda_g g to \sum_{g \in G} \lambda_g \varphi(g), this just follows from the fact that G is a basis for K[G]. One checks that because \varphi is a group homomorphism and the multiplication in K[G] is inherited from G, this also respects the unit element and multiplication, so it is a K-algebra homomorphism.

To see how this implies one direction in lemma 2.2, note that for a vector space V, the endomorphism ring \mathrm{End}_K(V) is a K-algebra (here multiplication is composition) and for the group of units, we get \mathrm{End}_K(V)^\times=\mathrm{GL}(V).
Thus if we have a group homomorphism \varphi:G \to \mathrm{GL}(V), the universal property tells us that there is a unique K-algebra homomorphism K[\varphi]: K[G] \to \mathrm{End}_K(V). Now we can define a module structure by uncurrying:
Define for r \in K[G] and v \in V: r \cdot v:=K[\varphi](r)(v).
The fact that K[\varphi] is a ring homomorphism translates neatly into the module axioms, and K-linearity gives us that the K-scalar multiplication on V remains the same.

Having this perspective is quite useful, because there are a lot of constructions for modules that now carry over directly to representations: we can form direct sums and products of representations, quotients etc. and all the properties of those constructions that we know to hold for modules also hold in this case. For example, subrepresentations as defined in 1.18 are the same as K[G]-submodules.

Representations of Cyclic Groups

We will use the accessible example of cyclic groups to show how the structure of the group algebra contains information about representations.

Lemma 2.4 If G=\langle g \rangle is cyclic of order n, then K[X]/(X^n-1) \cong K[G], where the isomorphism sends X to g.

Proof One can take the map K[X] \to K[G] that sends X to g and compute the kernel. As g generates G, every element in K[G] is a polynomial in g, which implies the surjectivity of that map.
Let’s instead show that both satisfy the same universal property:

  • If A is any K-algebra, then a K-algebra homomorphism K[G] \to A corresponds to a group homomorphism G \to A^\times by lemma 2.3.
    Since G=\langle g\rangle is generated by g, a group homomorphism is uniquely determined by where it sends g. As g has order n, we can send it precisely to those elements a \in A such that a^n=1 (this condition automatically gives us that a\in A^\times). Thus K[G] has the following universal property for this choice of G:
    For any K-algebra A and every element a \in A such that a^n=1, there’s a unique K-algebra homomorphism K[G] \to A that sends g to a.
  • If A is still any K-algebra, then by the homomorphism theorem, a K-algebra homomorphism K[X]/(X^n-1) \to A is the same as a K-algebra homomorphism K[X] \to A that sends X^n-1 to 0. A K-algebra homomorphism K[X] \to A is uniquely determined by where it sends X, and we can send X to any element a \in A; but due to the condition that X^n-1 must be sent to 0, for homomorphisms from K[X]/(X^n-1), we can send X to precisely those elements a \in A such that a^n=1.
    Thus we have proved:
    For any K-algebra A and every element a \in A such that a^n=1, there’s a unique K-algebra homomorphism K[X]/(X^n-1) \to A that sends X to a.

At this point, to finish the proof, one can either mumble something about the Yoneda lemma with a smug expression or one can make the usual argument why two objects with the same universal property are isomorphic. (This should be familiar to anyone who has seen e.g. why the tensor product is unique)
Let's do the latter: Because of the universal property of K[X]/(X^n-1), we can find a unique K-algebra homomorphism \varphi:K[X]/(X^n-1) \to K[G] that sends X to g. This also works in the other direction: we get a unique K-algebra homomorphism \theta: K[G] \to K[X]/(X^n-1) that sends g to X. Then \varphi \circ \theta is a K-algebra homomorphism K[G] \to K[G] that sends g to itself.
By the universal property, there can only be one such homomorphism, but we know that the identity is an example. Therefore \varphi \circ \theta = \mathrm{id}.
By the same argument, \theta \circ \varphi = \mathrm{id}.

Exercise Do a similar argument to determine the group algebra K[G] where G is a product of two cyclic groups as a quotient of the polynomial ring in two variables. Why can this approach not work in this form for nonabelian groups?

We can use this to describe the representations of cyclic groups by decomposing K[X]/(X^n-1) with the Chinese remainder theorem. If we do that, we will end up with a product of rings, which is one of the reasons why it's useful to think about modules over products of rings. If R and S are rings, then for every pair (M, N) where M is an R-module and N is an S-module, we can make M \oplus N into an R \times S-module by having R act on the first factor and S on the second factor. The following lemma tells us that every R \times S-module arises in such a way:

Lemma 2.5 If R and S are rings, then every R\times S-module is isomorphic to a direct sum M \oplus N where M is an R-module and N is an S-module, such that R\times \{0\} \subset R \times S just acts on the first factor and \{0\} \times S \subset R \times S just on the second one.
M and N are canonically determined.

Proof Let X be a module over T = R \times S. Consider the central idempotents e:=(1,0), f:=(0,1). Then Te=eT=R \times \{0\}, Tf=fT=\{0\} \times S and Te \cap Tf= \{0\}, Te+Tf=T. Set M=eX and N=fX; we get that eX \cap fX = 0 and eX+fX=X, so X = eX \oplus fX = M \oplus N. It's clear that S acts trivially on eX and R acts trivially on fX, which shows the statement. We can use the same e and f for all modules, which makes this decomposition canonical.

(Note: The above construction can be enhanced into a category equivalence of (R\times S)\textrm{-}\mathbf{Mod} and R\textrm{-}\mathbf{Mod} \times S\textrm{-}\mathbf{Mod})
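To see lemma 2.5 in the smallest interesting case, here is a sketch in Python, assuming the ring isomorphism \mathbb{Z}/6 \cong \mathbb{Z}/2 \times \mathbb{Z}/3 from the Chinese remainder theorem: the central idempotents e=(1,0) and f=(0,1) correspond to 3 and 4 in \mathbb{Z}/6, and they split any \mathbb{Z}/6-module.

e, f = 3, 4                                   # the idempotents (1,0), (0,1) in Z/6
assert e * e % 6 == e and f * f % 6 == f      # idempotent
assert (e + f) % 6 == 1 and e * f % 6 == 0    # orthogonal and summing to 1

X = list(range(6))                            # the module X = Z/6 over itself
eX = sorted({e * x % 6 for x in X})           # [0, 3]     -- a copy of Z/2
fX = sorted({f * x % 6 for x in X})           # [0, 2, 4]  -- a copy of Z/3
# every x decomposes uniquely as ex + fx, so X = eX ⊕ fX
assert all((e * x % 6 + f * x % 6) % 6 == x for x in X)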

Now let’s finally describe the representations of cyclic groups over \mathbb{C}!

If G is cyclic of order n, generated by g, then by Lemma 2.4, \mathbb{C}[G] \cong \mathbb{C}[X]/(X^n-1), where the isomorphism sends g to X. Let \zeta_n = \exp(2\pi i/n), then we have the factorization X^n-1=\prod_{k=0}^{n-1}(X-\zeta_n^k), so by the Chinese remainder theorem, we get
\mathbb{C}[X]/(X^n-1) \cong \prod_{k=0}^{n-1} \Bbb{C}[X]/(X - \zeta_n^k)
Note that this is an isomorphism not only of rings, but also of \mathbb{C}[X]-modules, which means that X is sent to X in each component.

Therefore, by lemma 2.5, every \mathbb C[G]-module is a direct sum of \Bbb{C}[X]/(X - \zeta_n^k)-modules where k varies. But for each k, we have \Bbb{C}[X]/(X-\zeta_n^k) \cong \Bbb{C} via sending X to \zeta_n^k. Modules over a field F are easy to understand: they are just (possibly infinite) direct sums of copies of F. Thus we get that every \Bbb{C}[X]/(X-\zeta_n^k)-module is a direct sum of copies of \Bbb{C}[X]/(X-\zeta_n^k) \cong \Bbb{C}.

Through all the isomorphisms, we have kept track of where the generator g is sent: first to X, then X to X in each component of the CRT decomposition (now taken modulo X-\zeta_n^k), then X to \zeta_n^k. This means that g acts on the (one-dimensional) \mathbb{C}[G]-module corresponding to \Bbb{C}[X]/(X-\zeta_n^k) by multiplication with \zeta_n^k. Since in general all modules are direct sums of such modules (for varying k), this means that in a suitable basis, g really acts as a diagonal matrix where all the diagonal entries are n-th roots of unity. (Even for infinite-dimensional representations.)
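This diagonalizability can be checked numerically; here is a numpy sketch, in which the shift matrix is the action of g on the regular representation \mathbb{C}[G] in the basis e, g, \dots, g^{n-1}:

import numpy as np

n = 6
P = np.roll(np.eye(n), 1, axis=0)    # g acts by the cyclic shift e_i -> e_{i+1 mod n}
zeta = np.exp(2j * np.pi / n)

for k in range(n):
    v = np.array([zeta ** (k * j) for j in range(n)])
    # v is an eigenvector of P with eigenvalue zeta^(-k), an n-th root of unity
    assert np.allclose(P @ v, zeta ** (-k) * v)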

We can also say something in a more general setting. Suppose that the characteristic of K does not divide n. Then X^n-1 has no repeated roots, so we can factorize X^n-1=f_1 \cdots f_k where all f_i are irreducible and pairwise distinct. Doing the same Chinese remainder theorem argument, we get:

K[G] \cong \prod_{i=1}^k K[X]/(f_i). Now as the f_i are irreducible, K[X]/(f_i) will be a field and the dimension over K will be equal to the degree of f_i, so we can again appeal to linear algebra and get the following result:

Lemma 2.6 Let G be a cyclic group of order n, let K be a field such that the characteristic of K does not divide n, and let X^n-1=f_1 \cdot \ldots \cdot f_k be the factorization of X^n-1 into irreducibles. Then for every i, there is a K[G]-module corresponding to f_i such that the dimension of the module is equal to the degree of f_i, and every K[G]-module is a direct sum of such modules.
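Here is what lemma 2.6 looks like computationally, as a sympy sketch: over K=\mathbb{F}_2 with n=5 (the characteristic 2 does not divide 5), X^5-1 factors into irreducibles of degrees 1 and 4, so G has irreducible representations of degrees 1 and 4, and \mathbb{F}_2[G] \cong \mathbb{F}_2 \times \mathbb{F}_{16}.

from sympy import symbols, factor_list

X = symbols('X')
# factor X^5 - 1 over F_2; the characteristic 2 does not divide n = 5
print(factor_list(X**5 - 1, modulus=2))
# expected: (1, [(X + 1, 1), (X**4 + X**3 + X**2 + X + 1, 1)]) -- degrees 1 and 4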

Example 2.7 For K=\mathbb{R}, the only factors that can occur for X^n-1 are (X-1), (X+1) and quadratic factors of the form X^2-2\cos(2\pi k/n)X+1. By choosing a clever basis for the modules corresponding to the quadratic factors, one obtains rotation representations as in example 1.6 (though the angle of rotation will be 2 \pi k/n instead of the 2 \pi/n in the example).

Example 2.8 For K=\mathbb{Q}, the factorization of X^n-1 is well-known, it factors as X^n-1=\prod_{d \mid n} \Phi_d(X), where \Phi_d(X) is the d-th cyclotomic polynomial. Thus the number of irreducible representations of G over \Bbb{Q} is equal to the number of divisors of n and for each divisor d, there’s an irreducible representation of degree \varphi(d).
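This is easy to confirm with a computer algebra system; a sympy sketch: the degrees of the irreducible factors of X^n-1 over \mathbb{Q} are exactly the values \varphi(d) for the divisors d of n.

from sympy import symbols, factor_list, totient, divisors

X = symbols('X')
n = 12
_, factors = factor_list(X**n - 1)            # factorization over Q
degrees = sorted(f.as_poly(X).degree() for f, _ in factors)
assert degrees == sorted(totient(d) for d in divisors(n))
# for n = 12 this gives [1, 1, 2, 2, 2, 4], one factor per divisor of 12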

We can also use this approach to say something about representations of cyclic groups in the case where the characteristic divides the group order. For simplicity, we just treat the case that K has characteristic 2 and G is cyclic of order 2. The factorization of X^2-1 is just (X-1)^2 and we get
K[G] \cong K[X]/(X-1)^2. Here the Chinese remainder theorem doesn’t help.
But one can apply the structure theorem for finitely generated modules over a PID mentioned in the first section: noting that every K[X]/(X-1)^2-module is also a K[X]-module, we get that every finitely generated K[X]/(X-1)^2-module is a direct sum of copies of K[X]/(X-1) and K[X]/(X-1)^2. If we look at the action of X (which corresponds to the generator of G) on these modules, we see that it acts by multiplication with 1 on K[X]/(X-1), i.e. via the identity (we say “trivially”).
The action on K[X]/(X-1)^2 is more interesting: Using the basis given by (the residue classes of) X-1 and 1, we see that the action of X corresponds to a transvection action g\begin{pmatrix}a\\b\end{pmatrix}=\begin{pmatrix}a+b\\b\end{pmatrix}, where g is a generator of G (cf. example 1.7).
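The key features of this example can be verified numerically; here is a sketch over \mathbb{F}_2 (numpy with entries reduced mod 2): the transvection matrix squares to the identity, so it really defines an action of the cyclic group of order 2, but its only fixed vectors span the trivial subrepresentation, so there is no invariant complement.

import numpy as np

g = np.array([[1, 1], [0, 1]])             # generator in the basis X-1, 1
assert (g @ g % 2 == np.eye(2)).all()      # g has order 2 in characteristic 2

nonzero = [(0, 1), (1, 0), (1, 1)]         # all nonzero vectors of F_2^2
fixed = [v for v in nonzero if (g @ np.array(v) % 2 == np.array(v)).all()]
assert fixed == [(1, 0)]                   # only the span of X-1 is invariant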

 

We have seen how the algebraic structure of the group algebra can help us understand representations, and in our example of cyclic groups, it turned out that when the conditions for Maschke’s theorem are satisfied, the group algebra is a product of fields.
We will investigate the structure of the group algebra in more detail in future posts and see that this was not a coincidence.

A First Impression of Group Representations

This blog post provides mostly some motivation, basic definitions and examples for group representations, up to Maschke’s theorem. Only familiarity with linear algebra and elementary group theory is required for understanding the main part of this post. However, there are some examples for readers with more background which can be safely ignored. The same is true for all categorical remarks.

This will be the first part of a series.

Introduction

The reason to care about groups is that they act on objects. Group actions arise in many different contexts and can provide insight (into the objects as well as into the groups which act on them).

The basic definition of a group action is an action on a set. A set can be thought of as an object without any structure except size (i.e. cardinality). For finite sets, this is more structure than it might seem: we can use counting and divisibility arguments, which lead to strong results on p-groups such as the Sylow theorems, to the easy-to-prove but ubiquitous orbit-stabilizer theorem, and to Burnside’s lemma, which has nontrivial combinatorial implications.

Group actions are everywhere in pure mathematics and frequently the object which is acted upon has more structure than just being a set, e.g. a topological space. In these situations, the natural thing to investigate (or to require, if you’re writing a definition) is some compatibility between the group action and the structure on the object. In the example of a topological space, one would require the action to be continuous.

A common structure is that of a vector space. The usefulness of vector spaces seems to stem from the fact that they are both very well understood and still give one a lot of tools to work with: we have duals, tensor products, traces, determinants, eigenvalues etc. Thus, a technique that is sometimes used is “linearization”, in which one tries to reduce a problem to linear algebra or at least gain insight by using linear-algebraic methods. Examples are the tangent space of a smooth manifold or variety (and, for smooth maps, the derivative) or the linearization of a nonlinear ODE.

Representations of groups can be thought of as a linearization of group theory, or more precisely as a linearization of group actions: one can define them as actions of groups on vector spaces that respect the linear structure. They arise with a ubiquity comparable to (nonlinear, merely “set-theoretic”) group actions. When they do, they are arguably even more useful, since vector spaces have a lot more structure than sets, as described in the previous paragraph.

Group Representations

Definition 1.1 Let K be a field. Then for a group G, a (K-linear) representation on a K-vector space V is a map \varphi: G \times V \to V that satisfies:

  • \forall v \in V: \varphi(e,v)=v where e\in G is the neutral element.
  • \forall g,h \in G, v \in V: \varphi(gh,v)=\varphi(g,\varphi(h,v))
  • \forall g \in G, v,w \in V : \varphi(g,v+w)=\varphi(g,v)+\varphi(g,w)
  • \forall g \in G, \lambda \in K, v \in V: \varphi(g,\lambda v)=\lambda \varphi(g,v)

Definition 1.2 In the above situation, the degree or dimension of the representation is the dimension of V over K.

Note that the first two axioms state that a representation is a group action and the other two axioms state that for any fixed g \in G, the map V \to V, v \mapsto \varphi(g,v) is K-linear. This map is also invertible, since the axioms imply that v \mapsto \varphi(g^{-1},v) is an inverse. By the second axiom, we also get that the map G \to \mathrm{GL}(V), g \mapsto (v \mapsto \varphi(g,v)) is a group homomorphism. So by using a currying argument, we have seen that every group representation gives rise to a group homomorphism G \to \mathrm{GL}(V).

Conversely, given a group homomorphism \rho:G \to \mathrm{GL}(V), we can uncurry to get a representation on V by setting \varphi(g,v) := \rho(g)(v).

Thus we have another characterization, which allows us to apply concepts defined for group homomorphisms like kernel/image etc, whereas the first characterization allows us to apply notions defined for group actions.

While we’re at it, we may as well add a rephrasing in categorical language of the last characterization and get the following

Lemma 1.3  Let K be a field, then for a group G and a vector space V over K, the following data are equivalent:

  • A representation of G on V as in definition 1.1
  • A group homomorphism G \to \mathrm{GL}(V)
  • A covariant functor from G considered as a one-object category to the category K\textrm{-}\mathbf{Mod} of K-vector spaces that sends the single object in G to V

This is the direct analog of equivalent characterizations of group actions: we can also view them as homomorphisms to the group of permutations of a set or as functors from a group to the category of sets.

(Viewing a group as a category works like this: Let G be a group, then define a category \mathcal{C} with a single object * and set \mathrm{Hom}_{\mathcal{C}}(*,*)=G. Composition \mathrm{Hom}_{\mathcal{C}}(*,*) \times \mathrm{Hom}_{\mathcal{C}}(*,*) \to \mathrm{Hom}_{\mathcal{C}}(*,*), i.e. G \times G \to G is just group multiplication. Associativity and having an identity follows from the group axioms.)

The last characterization allows one to apply constructions from category theory, such as composing representations with other functors, but we will not use it in this post except as an alternative description.

Remark 1.4 One can also consider representations of monoids or the case where K is any ring and V is a module.

Remark 1.5 We have defined only left representations. Reversing the chirality in the definitions is straightforward and gives rise to the notion of right representations.

Now for some examples.

As a zeroth example, note that representations of the trivial group are just vector spaces.

Example 1.6 Let G=\langle g \rangle be a finite cyclic group of order n with a fixed generator g, then a homomorphism from G to any group H is determined by where it sends g and we can send g to precisely those elements h \in H such that h^n=1, so a representation of G is just a linear automorphism that satisfies this equation.
For K=\mathbb{R}, one can take for example a rotation matrix \begin{pmatrix} \cos(2\pi/n) & -\sin(2\pi/n) \\ \sin(2\pi/n) & \cos(2\pi/n) \end{pmatrix} to define a two-dimensional representation of G.
For K=\mathbb{C}, one can take \zeta_n=\exp(2\pi i/n) \in \mathbb{C}^\times = \mathrm{GL}_1(\mathbb{C}) to get a one-dimensional representation of G. Both representations correspond to having g^k act by a counterclockwise rotation by the angle 2\pi k/n. The former can be obtained from the latter by restricting scalars from \mathbb{C} to \mathbb{R}.
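As a quick numerical check of this example, a numpy sketch: the rotation matrix satisfies R^n = I, which by the discussion above is exactly the condition needed to obtain a representation of G.

import numpy as np

n = 5
t = 2 * np.pi / n
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
assert np.allclose(np.linalg.matrix_power(R, n), np.eye(2))   # R^n = I

zeta = np.exp(2j * np.pi / n)      # the one-dimensional complex version
assert np.isclose(zeta ** n, 1)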

Example 1.7 Let G=K, the additive group of K, then G acts on V= K^2 via transvections: \phi\left(\lambda,\begin{pmatrix}a\\b\end{pmatrix}\right)=\begin{pmatrix}a+b\lambda\\b\end{pmatrix}

Example 1.8 If X is a G-set (a set equipped with an action of G), then we can consider a vector space V whose basis elements e_x are indexed by X. We can then define the action on the basis elements by setting \varphi(g,e_x)=e_{gx}. This permutation of the basis elements extends uniquely to a linear automorphism of V and we get a representation, called the permutation representation associated to the group action.
From a categorical standpoint, if we view G-sets as functors G \to \mathbf{Set} and representations as functors G \to  K\textrm{-} \mathbf{Mod}, then this construction is just composing a group action with the free module functor \mathbf{Set} \to K\textrm{-} \mathbf{Mod}. (Thus this construction is also functorial with the notion of morphisms of representations to be defined later.)
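As a concrete sketch in Python (the helper below is ad hoc, not a standard library function): given the action of a fixed group element on a finite set X, the associated permutation matrix sends e_x to e_{gx}.

import numpy as np

def permutation_matrix(act, X):
    # matrix of e_x -> e_{g.x}, where act(x) computes g.x
    index = {x: i for i, x in enumerate(X)}
    M = np.zeros((len(X), len(X)), dtype=int)
    for x in X:
        M[index[act(x)], index[x]] = 1
    return M

# example: the generator of Z/4 acting on Z/4 by translation
X = [0, 1, 2, 3]
P = permutation_matrix(lambda x: (x + 1) % 4, X)
assert (np.linalg.matrix_power(P, 4) == np.eye(4)).all()     # order divides 4
assert (P @ np.ones(4) == np.ones(4)).all()   # the sum of all e_x is fixed (cf. example 1.21)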

Example 1.9 Let V be any K-vector space and let G=\mathrm{GL}(V). Then the identity G \to \mathrm{GL}(V) defines a representation. This corresponds to the natural action of G on V that comes from the definition of G as the group of K-linear automorphisms of V.

Example 1.10 Generalizing the last example, all classical matrix groups such as \mathrm{SL}_n, \mathrm{O}_n, \mathrm{U}_n, \mathrm{Sp}_{2n} etc. are defined as subgroups of some general linear group, so that the subgroup inclusion defines a representation.

Example 1.11 Let G be a finite group and let p be a prime number. Suppose H is a normal abelian subgroup of exponent p. Then H is a vector space over K=\mathbb{F}_p and the conjugation action of G on H is \mathbb{F}_p-linear (this is automatic: any group homomorphism between vector spaces over \mathbb{F}_p is \mathbb{F}_p-linear), thus we obtain an \mathbb{F}_p-linear representation.

Example 1.12 Let L/K be a Galois extension. Then G= \mathrm{Gal}(L/K) acts by definition on L by K-linear field automorphisms. We can just forget the “field automorphism” part and consider V=L just as a K-vector space, then we get a representation \mathrm{Gal}(L/K) \to \mathrm{GL}_K(L).
If L and K are number fields with rings of integers \mathcal{O}_K and \mathcal{O}_L and \mathfrak{p} is a non-zero prime ideal in \mathcal{O}_K, then G acts on \mathcal{O}_L/\mathfrak{p}\mathcal{O}_L, giving an \mathcal{O}_K/\mathfrak{p}-linear representation.

Example 1.13 Let G=S_n and let W be any vector space over K, then G acts on V = W \otimes_K \dots \otimes_K W where we tensor n copies of W. Then G acts on V by permuting the components of the tensors. Explicitly, if w_1 \otimes w_2 \otimes \dots \otimes w_n is an elementary tensor, then we can define the result of the action of \sigma \in S_n on that to be w_{\sigma(1)} \otimes w_{\sigma(2)} \otimes \dots \otimes w_{\sigma(n)}.
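In coordinates, W^{\otimes n} is an n-dimensional array and \sigma acts by permuting the axes; here is a numpy sketch checking this on an elementary tensor (elementary_tensor is an ad hoc helper):

import numpy as np

def elementary_tensor(*ws):
    # w_1 ⊗ ... ⊗ w_n as an n-dimensional array of products of coordinates
    T = ws[0]
    for w in ws[1:]:
        T = np.tensordot(T, w, axes=0)    # outer product
    return T

rng = np.random.default_rng(0)
w = [rng.standard_normal(3) for _ in range(3)]    # three vectors in W = R^3
T = elementary_tensor(*w)

p = (2, 0, 1)                     # a permutation in S_3
sigma_T = np.transpose(T, p)      # the action on W ⊗ W ⊗ W
assert np.allclose(sigma_T, elementary_tensor(w[2], w[0], w[1]))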

Example 1.14 Let A be a finite-dimensional K-algebra, then the group of units G=A^\times acts on A by conjugation and this action is K-linear, and thus we obtain a representation of the unit group on the underlying vector space of the algebra. (If K=\mathbb{R} and A=\mathbb{H}, then a suitable restriction of the domain and codomain of this representation gives a description of the Hopf fibration.)

Example 1.15 In the same spirit as the last example, let G be an algebraic group over K. Let V= \mathfrak{g}=T_e(G) be the Lie algebra of G. For any g \in G, the conjugation map c_g:G \to G, a \mapsto gag^{-1} is a smooth automorphism of G, so we can take the derivative at the identity and get a linear automorphism \mathrm{ad}(g)=D_e(c_g): V \to V. The map g \mapsto \mathrm{ad}(g) \in \mathrm{GL}(V) is a representation, called the adjoint representation of G. (The same construction works verbatim for Lie groups.)

Example 1.16 Let M be a smooth connected manifold and let \pi:E \to M be a vector bundle with a flat connection. Let x \in M be a base point and set G=\pi_1(M,x) and K=\mathbb{R}. If we take a smooth loop \gamma: S^1 \to M based at x, parallel transport along that loop defines an automorphism of the fiber V=E_x.
The flatness condition implies that this automorphism depends only on the homotopy class of \gamma, and by smooth approximation, every homotopy class of continuous loops may be represented by a smooth loop; thus we obtain the holonomy representation \pi_1(M,x) \to \mathrm{GL}(E_x). It turns out that this representation uniquely determines the flat bundle.

As is common practice with group actions, if \rho:G \to \mathrm{GL}(V) is a representation, we also write just gv for \rho(g)v or g for the map \rho(g). By further abuse of notation, we will also just call V a representation of G where the action is clear from the context.

Definition 1.17 If V and W are K-linear representations of a group G for some field K, then a morphism of representations (also called intertwining operator) from V to W is a K-linear map f:V \to W such that \forall g \in G, v \in V: f(gv)=gf(v). (i.e. f is G-equivariant.)

Note that if we consider representations as functors, then a morphism of representations is just a natural transformation. Indeed, for any g \in G, naturality with respect to g as a morphism is precisely the requirement that f(gv) = gf(v) for all v.

Example 1.17 In the situation of example 1.13, let f \in \mathrm{GL}(W), then we can define f^{\otimes n} by acting on each factor: f^{\otimes n}(v_1 \otimes \dots \otimes v_n)=f(v_1) \otimes \dots \otimes f(v_n) for an elementary tensor. Since we act in the same way in each component, this commutes with permutation of the factors, thus f^{\otimes n}: V \to V defines a morphism of the representation of S_n  given by permuting the factors in the tensor product.

Definition 1.18 If V is a representation of G, then a subspace W that is G-invariant (i.e. gW \subset W for all g \in G) defines again a representation of G. These subspaces are called subrepresentations of V.

If W is a subrepresentation of V, then the inclusion is a morphism of representations, which gives a (quite general) family of examples for morphisms.

Example 1.19 In the situation of example 1.7, consider the subspace of K^2 spanned by \begin{pmatrix}1\\0\end{pmatrix}, this is a subrepresentation because \varphi\left(\lambda,\begin{pmatrix}a\\0\end{pmatrix}\right)=\begin{pmatrix}a\\0\end{pmatrix}.

Example 1.20 Given a morphism of representations, the kernel and the image are subrepresentations of the domain and codomain, respectively.

Example 1.21 In the situation of example 1.8, suppose that Y \subset X is a sub G-set, i.e. we have gY \subset Y for all g \in G, then Y is itself a G-set and if we apply the same construction to Y, the resulting vector space is a subspace of V in a canonical way, and so also a subrepresentation. (This is a special case of the mentioned functoriality of this construction.)
If X is finite, another subrepresentation is given by the span of \sum_{x \in X} e_x.

We now come to the first substantial theorem about representations.

Theorem 1.22 (Maschke) Let G be a finite group and suppose that the order |G| is invertible in K. If V is a finite-dimensional representation and W \leq V is a subrepresentation, then there exists another subrepresentation C such that V=W\oplus C.

Proof By linear algebra, we can find a K-linear projection \pi: V \to W, i.e. we have that \mathrm{im}(\pi)\subset W and \pi is the identity on W. We have that V= W \oplus \mathrm{ker}(\pi), but of course, \pi will not be a morphism of representations in general. The idea is to “average” \pi to get another projection onto W that is a morphism of representations.
Set \pi'(v)=\frac{1}{|G|}\sum_{g \in G}g\pi(g^{-1}v) (here we use that |G| is invertible in K). This is again K-linear. It is also a morphism of representations: for h \in G, we have \pi'(hv)=\frac{1}{|G|} \sum_{g \in G}g\pi(g^{-1}hv)=\frac{1}{|G|}\sum_{g \in G} hg\pi(g^{-1}v)=h\left(\frac{1}{|G|} \sum_{g \in G} g\pi(g^{-1}v)\right)=h\pi'(v), where in the second step we substituted g \mapsto hg. Since W is a subrepresentation and \pi is the identity on W, \pi' is also the identity on W (it is crucial that we divided by |G| for this step) and the image is also contained in W, so \pi' is still a projection onto W.
Therefore, the kernel is a complement of W and as \pi' is a morphism of representations, the kernel is a subrepresentation.
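The averaging trick is easy to watch in a concrete case; a numpy sketch: take the cyclic group of order 3 permuting the coordinates of \mathbb{R}^3 and W the span of (1,1,1), start with a projection onto W that is not a morphism of representations, and average it.

import numpy as np

P = np.roll(np.eye(3), 1, axis=0)          # the generator, cyclically permuting coordinates
G = [np.linalg.matrix_power(P, k) for k in range(3)]

w = np.ones(3)                             # W = span{(1,1,1)}
pi = np.outer(w, [1.0, 0.0, 0.0])          # projection v -> v[0] * (1,1,1), not equivariant
assert np.allclose(pi @ w, w)              # identity on W

pi_avg = sum(g @ pi @ g.T for g in G) / 3  # average; g.T = g^{-1} for permutation matrices

assert np.allclose(pi_avg @ w, w)          # still the identity on W
assert all(np.allclose(h @ pi_avg, pi_avg @ h) for h in G)   # now a morphism
# the kernel of pi_avg (vectors with coordinate sum 0) is the complement C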

Example 1.23 To show that the assumptions in Maschke’s theorem are necessary, consider the transvection representation of the additive group of K on K^2 described in examples 1.7 and 1.19. Here K acts via \varphi\left(\lambda,\begin{pmatrix}a\\b\end{pmatrix}\right)=\begin{pmatrix}a+\lambda b\\b\end{pmatrix}. As described in example 1.19, the subspace W of vectors in K^2 with second component 0 is a subrepresentation.
But this subrepresentation doesn’t have a complement that is also a subrepresentation: Indeed, if \begin{pmatrix}a\\b \end{pmatrix} is any vector in K^2 such that b \neq 0, then \begin{pmatrix}a\\b \end{pmatrix} and \varphi\left(1,\begin{pmatrix}a\\b \end{pmatrix}\right)=\begin{pmatrix}a+b\\b \end{pmatrix} are linearly independent, as they are clearly not multiples of each other. Thus any subrepresentation that is not contained in W is the whole of V, so W doesn’t have a complement.
This serves as a counterexample in two different ways: if we take K to be a finite field, it shows that the assumption that the order is invertible is necessary. If we take K to be an infinite field (say of characteristic 0), then it shows that even in characteristic 0, the conclusion doesn’t need to hold when the group is infinite.