More Introductory Group Theory
We’ve already introduced some of the topics of Group Theory. In this entry we’ll talk about Lagrange’s Theorem, various morphisms, and try to tie it back to our ongoing discussion of permutations. We’ll also discuss some examples of groups.
We previously showed that the set of permutations of a set forms a group (in Some of the Math of Permutations). We also proved some basic properties of those permutations here More Math of Permutations.
In math, a group is defined to be a set of elements with a binary operation. A binary operation is nothing more nor less than a function, which takes two elements of the set and returns one element of the set. This can occasionally be written as \(f(x,y) = z \), where \( x \), \( y\), and \( z\) are elements of the set. However, it is much (much) more common to write this as \( x \circ y = z \), \(x \cdot y = z \), \( x + y = z \), (when we are guaranteed that \( x + y = y + x \)) or sometimes just write \( xy = z \). The elements of the set and the operation have to satisfy a few conditions. When we discuss these conditions below, we will refer to the group as \(G\), and also refer to the set of elements of the group as \(G\). This shouldn’t cause too much confusion, though. We’ll write \( \circ \) for the operator. The conditions are
Group Axioms
- There is an identity element in \( G\), often written \(e\), with the property that for any \(x \) in \(G\), \( x \circ e = x = e \circ x \).
- For any three elements of \( x \), \( y\), and \( z\) in \( G\), the following always holds \( x \circ (y \circ z) = (x \circ y) \circ z \). So we may unambiguously refer to the result as \( x \circ y \circ z \).
- For any element \(x\) in \(G\), there is an element, \(y \) in \(G\), such that \( x \circ y = y \circ x = e \). This element is often, but not always, written as \( x^{-1}\), and is referred to as the inverse of \(x\). We prove below that this definition makes sense, because inverses are unique.
- For any two elements, \(x\), and \(y\) in \(G\), \(x \circ y \) is also an element of \(G\).
An example of a group is the set of rational numbers \( \mathbb{Q} \) where \( x \circ y = x + y \). In this group, the identity element is \( 0 \), and the inverse of a number is just its negative.
You can verify by induction on the number of \( \circ \) signs in an expression that no matter what order we evaluate \( x_1 \circ x_2 \circ x_3 \circ … \circ x_n \) we get the same result. For 2 \(\circ \)’s, we just use property 2. If we assume that the expression is unambiguous for \(k \) or fewer \(\circ\)’s, then an expression with \(k+1\) \(\circ\)’s can be broken down as
$$ \begin{align} &(x_1 \circ x_2 \circ … \circ x_l) \circ \\ &( x_{l+1} \circ x_{l+2} \circ … \circ x_{k} \circ x_{k+1} \circ x_{k+2}) \\ \end{align} $$
Where \( 1 \leq l \leq k + 1 \). Notice that having \( k + 1 \) \( \circ\)’s means we have to have \( k+2\) \(x_i\)’s. What we’ve done is broken the expression into two parts with a left pair of parentheses and a right pair of parentheses. If either the left or the right pair of parentheses encloses just one variable (so that no \(\circ\) is enclosed), we may omit that pair of parentheses. Both of the pairs of parentheses must enclose fewer than \( k \) \(\circ\)’s, because there is a \(\circ\) between them. By induction both the left and the right side are unambiguous. That is, they evaluate to a unique element of the group. Therefore the entire expression with \(k+1\) \(\circ\)’s does.
QED
We’ll now prove a few more basic results about groups.
Lemma (Identities are unique)
Let \(e \in G \) be an identity. No other element of \(G\) is an identity.
Proof
Suppose \(e \) and \(f\) are two identities in \(G\). Then by definition, \(e \circ f = f \), since \(e \) is an identity (see property 1 above). Likewise, since \(f \) is an identity \( e \circ f = e \), then \(e = f\).
QED
Lemma (Inverses are unique)
Property 3 above just guarantees the existence of inverses, but doesn’t necessarily guarantee their uniqueness. However, they are unique.
Proof
Let \(x \) be any element of \(G\). Suppose \(y \) and \( z\) are two inverses of \(x\). Then by 2 above \( z \circ (x \circ y) = (z \circ x ) \circ y \). Notice that the left side of the equation is equal to \( z \circ e = z\) while the right side of the equation is equal to \( e \circ y = y \), so \(z = y\).
QED
Definition (subgroup)
A subset, \(H\) of \(G\) which is closed under inverses and also closed under \( \circ\) is called a subgroup of \(G\). Specifically, For any \( x \in H \), \(x^{-1} \in H \). For any two \(x\), and \(y \in H \), \(x \circ y \in H\).
Definition (coset)
Let \(H\) be a subgroup of \(G\), then for any \(a \in G\), the set \( aH := \{ a \circ h | h \in H \} \), a subset (but not necessarily a subgroup!) of \(G\) is said to be a (left) coset of \(H\). Similarly, you can define a right coset for \(a\) as \( Ha := \{ h \circ a | h \in H \} \).
Mini Lemma (cancellations)
If \(a \circ b = c \circ d \), then \(a = c \circ d \circ b^{-1}\). If \( a \circ b = c \circ b \), then \( a = c \).
Proof
Since we’ve shown that inverses are unique, there is a unique \(b^{-1} \) in our group (or subgroup), so we can apply our operation to the right side of both sides of the equations. Doing this to \( a \circ b \) gives us \( a \circ b \circ b^{-1} = a \circ (b \circ b^{-1} \), this is equal to \(a \circ e = a \). Similarly applying it to \( c \circ b \) results in \( c \). Thus we have proven the second equality above.
Applying \(b^{-1}\) to the right and side of \( c \circ d \) results in \( (c \circ d) \circ b^{-1} = c \circ d \circ b^{-1}\). This proves the first equality above.
QED
Lemma (cosets are equal or disjoint)
Let \(a \) and \(b\) be two elements of \(G\), then either \(aH = bH\), similarly \( Ha = Hb\), or \(aH \cap bH = \emptyset \), similarly \( Ha \cap bH = \emptyset\).
Proof
We’re only going to prove the lemma for left cosets. The proof for right cosets is very similar. In the following proof, we’re going to omit the \( \circ \) between elements of \(G\), so in what follows, \( ab = a \circ b \).
If \(aH \cap bH = \emptyset \), we’re covered by case 1, so assume that their intersection isn’t empty. Let \(c \in aH \cap bH \). Then \(c = a h_1 \), and \(c = b h_2 \), for some \(h_1 \) and \(h_2\) in \(H\). Since \(H\) is a subgroup of \(G\), if \(h_1 \in H\), then \(h_1^{-1} \in H\). We can then cancel the \(h_1\) in the equation for \(a\) using case 2 of the mini lemma above. We’ll apply case 1 to the equation for \(c\) with \(b\), this gives us
\[a = bh_2h_1^{-1} \tag{1}\label{1} \]
Next we’re going to use equation (1) to show that \(aH \subseteq bH \).
Let \(x \) be an arbitrary element of \(aH\), then \(x = ah_3 \), for some \(h_3 \in H\). Applying equation (1) for \(a\) in the equation for \(x\) gives us
\[ x = bh_2h_1^{-1}h_3 \tag{2}\label{2}\]
Notice that since \(H\) is a subgroup, it has to be closed under inverses and also under the group operation, so \( h_2 h_1^{-1} h_3 \) must be an element of our subgroup \(H\). This means that \( h_2 h_1^{-1} h_3 = h_4 \), where \(h_4\) is some element of \(H\). Then applying that to equation (2) gives us \( x = b h_4\), so then \( x \in bH \). Since \(x\) was an arbitrary element of \(aH\), we have shown that \( aH \subseteq bH \).
A similar proof that \(bH \subseteq aH \) can be obtained by swapping the \(a \) and the \(b\) in the previous few paragraphs. This means that \(aH = bH\).
Thus, if we’re not in case 1, then we’re in case 2 of the lemma.
QED
Lemma (cosets have same size as H)
Any two cosets of a given subset \(H\) have the same size. In fact, they have the same size as \(H\).
Proof
Let \(aH\) and \(bH\) be two (not necessarily distinct) cosets of \(H\), a subgroup of \(G\). Define \[ f_{a, b} (x) := ba^{-1}x \]
Now we’re going to show that \(f_{a,b} : aH \xrightarrow{\text{1-1}} bH \). I.e. that \(f_{a,b}\) is a 1-1 function from \(aH \) into \(bH\). That will prove that \(bH\) has at least as many elements as \(aH\). I.e. \( | aH | \leq | bH| \).
Let \(x \in aH\), then \( x = ah\), for some \( h \in H\). Then \( f_{a,b}(x) = f_{a,b}(ah) = ba^{-1}(ah)\) That equals \(beh = bh \in bH\). Therefore \( f_{a,b}: aH \rightarrow bH \). Now suppose that \(f_{a,b} (x) = f_{a,b}(y) \). Then \((ba^{-1})x = (ba^{-1})y\), so by the cancellations mini lemma, \(x = y\). Therefore \(f_{a,b}\) is 1-1. and \( |aH| \leq |bH| \).
By considering the function \(f_{b,a}\), we can show that \( |aH| \geq |bH| \), and so \(|aH| = |bH|\). Notice that \(eH = H\) is a coset, so every coset has the same size as \(H\). QED.
The last two lemmas were originally proved by Lagrange, and are sometimes referred to as (one of) Lagrange’s Theorems. They allow us to define the index of a subgroup.
Definition (index)
The index of a subgroup, \(H\) of a group \(G\), written \( [G:H] \) is defined to be the number of distinct cosets of \(H\). The two lemmas above allow us to prove a nice corollary (this was also proved by Lagrange, and I think this is what’s usually referred to as (one of) his theorems).
Corollary (Lagrange’s Theorem)
If \(G\) is a finite group, then \( |H| | |G| \), i.e. the size of \(H\) divides the size of \(G\).
Next we prove that the index of a subgroup of a subgroup behaves nicely.
Theorem (Tower Law)
Let \(A\) be a group, \(B\) a subgroup of \(A\) of finite index and \(C\) a subgroup of \(B\) of finite index. Then \[ [A : C ] = [A : B ] \cdot [ B : C ] \]
Proof
Let \( [A : B ] = m \) and \( [B : C ] = n \), then there are \(m\) distinct \( a_i \in A \), such that \( \{ a_1B, a_2B, … , a_mB \} \) is the set of \(m\) distinct cosets of \(B \) which are subsets of \(A\). Likewise there are \(n\) distinct \(b_j \in B\), such that \( \{ b_1C, b_2C, … , b_nC \} \) is the set of \(n\) distinct cosets of \(C\) which are subsets of \(B\). We want to show that the \(a_i \cdot b_j \) are all distinct and moreover that the cosets \(a_ib_jC \) are all distinct cosets of \(C \) as subsets of \(A\). This would prove that there are \(m \cdot n \) distinct cosets \( a_ib_jC \subseteq A \).
Notice that \(b_jC \) and \(b_lC \) are both subsets of \(B \), by definition. By the lemma above that says that cosets are either equal or completely disjoint, in order for \(a_ib_jC \) to be equal to \(a_kb_lC\), they must both be subsets of the same coset of \(B\). This means that \(a_i = a_k\). Thus we have
\[ a_ib_jC = a_ib_lC \]
Now let’s apply \( a_i^{-1} \) to the left side of both cosets. This is just repeatedly applying the mini cancellation lemma to each element of both cosets. In fact, it’s applying cancellation on the left, instead of cancellation on the right, but the proof of that is the same as what we showed above, but with the order of the elements reversed. When we do that, we’re left with
\[ b_jC = b_lC \]
We defined the \(b_j\)’s to represent distinct cosets of \(C\) as subsets of the set \(B\). That means that \(b_j = b_l\). This finishes the proof.
QED
Finally (for this entry), we’re going to introduce the idea of different kinds of morphisms between groups. Let \(A\) and \(B\) be groups. Let \(\phi\) be a function from the set of elements of \(A\) to the set of elements of \(B\). \(A\) and \(B\) don’t have to necessarily be distinct groups. In the definitions below, we’ll use \( \circ \) again to represent the operation. In fact, we’ll use \( \circ_{A} \) to represent \(A\)’s binary operation and \( \circ_{B} \) to represent \(B\)’s binary operation.
You should also be able to convince yourself that the tower law works even when one or both indices is infinite. In that case the product is also infinite. The proof given above will still work with slight modification.
Definition (homomorphism)
If \( \phi: A \rightarrow B \), and for every \(x\), and \(y\) in \(A\), \( \phi(x \circ_{A} y ) = \phi(x) \circ_{B} \phi(y)\), then \( \phi \) is said to be a homomorphism from \(A\) into \(B\).
Definition (epimorphism)
If \(\phi\) is a homomorphism from \(A\) to \(B\), and \(\phi\) is onto (i.e. \(Range(\phi) = B\)), then it is said to be an epimorphism.
Definition (monomorphism)
If \(\phi\) is a homomorphism from \(A\) to \(B\) and \(\phi\) is 1-1, then it is said to be a monomorphism.
Definition (isomorphism)
If \( \phi \) is a homomorphism from \(A\) into \(B = Range(\phi)\) and \( \phi : A \xrightarrow{\text{1-1}} B \), then \( \phi \) is said to be an isomorphism from \(A\) into \(B\).
Definition (automorphism)
If \(\phi\) is an isomorphism and \(A = B\), then \(\phi\) is said to be an automorphism.
Definition (endomorphism)
If \( \phi \) is a homomorphism, from \(A\) to \(A\) and \(Range(\phi)\) is not necessarily all of \(A\), then \( \phi\) is said to be an endomorphism.
You can check for yourself that if \(A\) is defined to be the real numbers with \( x \circ_{A} y = x + y\), and \(B\) is defined to be the set of positive real numbers with \( x \circ_{B} y = x \cdot y \), then \( \phi : A \rightarrow B \), where \( \phi(x) = \exp(x) \) is an isomorphism from \(A \) into \(B\).
That’s all for now, folks. We’ll continue the discussion of groups soon, though.