Rings, Fields, and Polynomials
One thing we have to take care of is proving that vector spaces have bases. This will allow us to define and use the dimension of a vector space. We’ve already used bases in an earlier proof. We try not to use results we haven’t established, but sometimes mistakes are made. Anyway, here we go!
Definition: Span
Let \(V\) be a vector space over a field, \(\mathbb{F}\), and let \( S \) be a set of vectors of \(V\). Then
$$ \text{Span}(S) := \{ \vec{v} \in V | \exists n \in \mathbb{N}, \vec{v} = \sum\limits_{i = 1}^{n} c_{i}\vec{s}_{i} \} $$
where \(c_{i} \in \mathbb{F} \) and \(\vec{s}_{i} \in S \).
Note:
\( |S| \), the cardinality of \(S\), can be infinite, but we only allow finite sums. The \( n \) given in the definition of span does not have to be the same for each vector (in fact, if \( \vec{v} \in S \), then \(n\) can be 1).
There are situations where it is possible to take infinite sums, but those require more structure than we have available in the general case.
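To have a concrete picture: take \( V = \mathbb{R}^2 \) over \( \mathbb{F} = \mathbb{R} \). If \( S = \{ (1, 0) \} \), then \( \text{Span}(S) = \{ (c, 0) | c \in \mathbb{R} \} \), the \( x \)-axis. If \( S = \{ (1, 0), (1, 1) \} \), then \( \text{Span}(S) = \mathbb{R}^2 \), since any \( (a, b) = (a - b)(1, 0) + b(1, 1) \).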
Definition: Basis
Let \(V\) be a vector space over a field, \(\mathbb{F}\). Let \(\mathcal{B}\) be a subset of \( V \) such that
- \( V = \text{Span}(\mathcal{B}) \)
- For all \( n \in \mathbb{N} \) and all distinct \( \vec{v}_{1}, \ldots, \vec{v}_{n} \in \mathcal{B} \), \( \sum\limits_{i = 1}^{n} c_{i} \vec{v}_{i} = \vec{0} \) implies \( c_{i} = 0 \) for all \( i = 1, \ldots, n \). (I.e., every finite subset of \( \mathcal{B} \) is linearly independent.)
then \( \mathcal{B} \) is said to be a basis for \(V\).
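For example, \( \{ (1, 0), (0, 1) \} \) is a basis for \( \mathbb{R}^2 \) over \( \mathbb{R} \), and the infinite set \( \{ 1, x, x^2, x^3, \ldots \} \) is a basis for the vector space of polynomials with real coefficients: every polynomial is a finite linear combination of powers of \( x \), and a finite linear combination of distinct powers of \( x \) is \( 0 \) only when all of its coefficients are \( 0 \).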
Lemma: a vector space has a basis
Let \(V\) be a vector space over a field, \(\mathbb{F}\), then \(V\) has a basis.
Proof
This proof uses the Axiom of Choice; in particular, it relies on well-ordering all of the vectors of \(V\), which is allowed by the Well-Ordering Theorem (itself equivalent to the Axiom of Choice).
Let \( \{\vec{v}_{\beta}\}_{\beta < \alpha} \) be a well-ordering of the vectors of \(V\), indexed by the ordinals below some ordinal \( \alpha \). Then we can construct a basis, \( \mathcal{B} \), as follows:
$$ \begin{align} \mathcal{B}_{\emptyset} &:= \{ \vec{v}_{\emptyset} \} \\ \mathcal{B}_{\alpha + 1} &:= \begin{cases} \mathcal{B}_{\alpha}, &\vec{v}_{\alpha + 1} \in \text{Span}(\mathcal{B}_{\alpha}) \\ \mathcal{B}_{\alpha} \cup \{ \vec{v}_{\alpha + 1} \}, &\text{otherwise} \end{cases} \\ \mathcal{B}_{\lambda} &:= \begin{cases} \bigcup\limits_{\beta < \lambda} \mathcal{B}_{\beta}, &\vec{v}_{\lambda} \in \text{Span}( \mathcal{B}_{\beta} ) \text{ for some } \beta < \lambda \\ \bigcup\limits_{\beta < \lambda} \mathcal{B}_{\beta} \cup \{ \vec{v}_{\lambda} \}, &\text{ otherwise } \end{cases} \\ \end{align} $$
This technique is called “transfinite recursion” (a close cousin of transfinite induction); it was introduced in an earlier entry.
We do this for all \( \beta < \alpha \) (i.e. every vector in \( V \)). Then we set our basis \( \mathcal{B} := \bigcup_{\beta < \alpha} \mathcal{B}_{\beta} \). When we are done, \( \mathcal{B} \) clearly spans all of \( V \): every vector had its chance to join \( \mathcal{B} \), and was left out only when it was already in the span of the vectors collected before it.
It’s also not difficult to show that \( \mathcal{B} \) is independent. Suppose, for a contradiction, that we had a sum
\( \displaystyle \sum\limits_{i = 1}^{n} c_i \vec{v}_i = 0 \), where \( \vec{v}_i \in \mathcal{B}\) and some of the \(c_i \neq 0\). Since the \( c_i \) which are equal to \(0\) don’t affect the sum at all, we will drop them and re-write the sum so that all of the \(c_i \neq 0\) (when we do this, the \(n\) in the sum might be smaller, since we could be summing over fewer terms). We will also order the terms so that the last vector, \( \vec{v}_n \), has the largest index in the well-ordering of \(V\). Then we have
\( \displaystyle \sum\limits_{i = 1}^{n} c_i \vec{v}_{\beta_i} = 0 \), which implies that \( \displaystyle \vec{v}_{\beta_{n}} = \frac{-1}{c_n}\sum\limits_{i = 1}^{n-1} c_i \vec{v}_{\beta_{i}} \), where \( \beta_1, \beta_2, \ldots, \beta_{n-1} < \beta_n \). Each \( \vec{v}_{\beta_i} \) with \( i < n \) joined at stage \( \beta_i < \beta_n \), so \( \vec{v}_{\beta_n} \) was already in the span of the earlier stages. But this means that \( \vec{v}_{\beta_{n}} \) would not have been added to \( \mathcal{B}_{\beta_{n}} \), and should therefore not be in \(\mathcal{B}\). A contradiction.
QED
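To make the construction concrete in a setting where we can actually run it, here is a minimal Python sketch (restricted to finitely many vectors in \( \mathbb{Q}^n \), so an ordinary loop replaces the transfinite recursion; the coefficient-list representation and helper names are ours, purely for illustration). It visits the vectors in a fixed order and keeps one only when it is not in the span of those kept so far, testing span-membership by row reduction over exact fractions.

```python
from fractions import Fraction

def reduce_against(echelon, v):
    """Clear v's entry at each stored row's pivot column by
    subtracting a suitable multiple of that row."""
    v = list(v)
    for pivot, row in echelon:
        if v[pivot] != 0:
            factor = v[pivot] / row[pivot]
            v = [a - factor * b for a, b in zip(v, row)]
    return v

def greedy_basis(vectors):
    """The successor step of the construction, in finite dimensions:
    keep a vector only if it is not in the span of those kept so far."""
    basis, echelon = [], []
    for v in vectors:
        v = [Fraction(a) for a in v]
        r = reduce_against(echelon, v)
        if any(a != 0 for a in r):            # v was NOT already in the span
            basis.append(v)
            pivot = next(i for i, a in enumerate(r) if a != 0)
            echelon.append((pivot, r))
    return basis

# (1,1,0) = (1,0,0) + (0,1,0) is skipped; the other three are kept.
print(greedy_basis([(1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 2)]))
```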
Definition and Lemma: Dimension
Let \( \mathcal{B} \) be a basis for a vector space, \(V\), then we define the dimension of \(V\) to be \( | \mathcal{B} | \).
It is well-defined: any two bases of \(V\) have the same cardinality, which is what we prove below.
We will often write “\( \text{dim}(V) \)” for the dimension of \(V\), and “\( \text{dim}(V / \mathbb{F}) \)” when we want to make the field, \( \mathbb{F} \), absolutely clear.
Proof
Let \( \mathcal{B}_1 \) and \( \mathcal{B}_2 \) be two bases of \( V \). Suppose, for a contradiction, that they have different cardinalities; without loss of generality, \( | \mathcal{B}_1 | < | \mathcal{B}_2 | \).
We will start with the finite case.
Suppose \( |\mathcal{B}_1| = m \) is finite, so
\[ \mathcal{B}_1 = \{ \vec{u}_1, … , \vec{u}_m \} \]
Let
\[ \mathcal{D} := \{ \vec{v}_1, …, \vec{v}_m \} \]
be \( m \) distinct elements of \( \mathcal{B}_2 \) (there are at least this many, since \( |\mathcal{B}_2| > m \)). Then we can build an \( m \times (m+1) \) matrix of equations like this
$$ \begin{array}{ccccc|c} &c_{1,1} \vec{u}_1 &c_{1,2} \vec{u}_2 &\cdots &c_{1,m} \vec{u}_m &\vec{v}_1 \\ &c_{2,1} \vec{u}_1 &c_{2,2} \vec{u}_2 &\cdots &c_{2,m} \vec{u}_m &\vec{v}_2 \\ &\vdots & \vdots & \ddots &\vdots &\vdots \\ &c_{m,1} \vec{u}_1 &c_{m,2} \vec{u}_2 &\cdots &c_{m,m} \vec{u}_m &\vec{v}_m \\ \end{array} $$
Where the sum of the entries in a row to the left of the vertical bar equals the vector on the right side of the bar. I.e:
$$ \sum\limits_{j = 1}^{m} c_{i,j}\vec{u}_j = \vec{v}_i $$
We can do this because the \( m \) vectors of \( \mathcal{B}_1 \) form a basis for \( V \) which (by definition) spans every vector of \( V \), in particular, all of the vectors in \( \mathcal{B}_2 \).
Now we can perform Gaussian elimination on the rows of this matrix (you’ve probably encountered it before; there are many resources for it online, and we will probably cover it in a later entry). When we do, we end up with a matrix whose lower left is entirely \(0\)’s.
When performing the row-reduction operations on the left-hand side of the vertical bar, we also perform the same operations on the right-hand side of the vertical bar. Then, when we’re finished, each row’s right-hand side is a linear combination of the vectors \( \vec{v}_1, \ldots, \vec{v}_m \).
For example (if \( c_{1,1} \neq 0 \)):
$$ \begin{array}{ccccc|c} &c_{1,1} \vec{u}_1 &c_{1,2} \vec{u}_2 &\cdots &c_{1,m} \vec{u}_m &\vec{v}_1 \\ &0 \vec{u}_1 &(c_{2,2} - \frac{c_{2,1}}{c_{1,1}}c_{1,2})\vec{u}_2 &\cdots &(c_{2,m} - \frac{c_{2,1}}{c_{1,1}}c_{1,m} )\vec{u}_m &\vec{v}_2 - \frac{c_{2,1}}{c_{1,1}}\vec{v}_1 \\ &\vdots & \vdots & \ddots &\vdots &\vdots \\ &0\vec{u}_1 &(c_{m,2} - \frac{c_{m,1}}{c_{1,1}}c_{1,2})\vec{u}_2 &\cdots &(c_{m,m} - \frac{c_{m,1}}{c_{1,1}}c_{1,m} )\vec{u}_m &\vec{v}_m -\frac{c_{m,1}}{c_{1,1}}\vec{v}_1 \\ \end{array} $$
Notice that all of the diagonal entries of the resulting matrix (after Gaussian elimination has finished) must be non-zero. Otherwise, some row would be entirely \(0\)’s to the left of the vertical bar, while its right-hand side, having been obtained from \( \vec{v}_1, \ldots, \vec{v}_m \) by invertible row operations, is a linear combination of those vectors with at least one non-zero coefficient; since \( \mathcal{B}_2 \) is linearly independent, that right-hand side cannot be the zero vector, \(\vec{0}\), and the row would then claim that \(\vec{0}\) equals a non-zero vector.
We can then continue our Gaussian elimination to zero-out all of the entries on the upper-right side of the diagonal. Then we’re left with only non-zero entries along the diagonal.
Now what we’ve achieved is representing each element of \( \mathcal{B}_1 \) as a linear combination of vectors of \( \mathcal{D} \): divide each row by its (non-zero) diagonal entry and read off \( \vec{u}_i \). But then \( \mathcal{D} \) spans \( \text{Span}(\mathcal{B}_1) = V \), and in particular any vector in \( \mathcal{B}_2 \setminus \mathcal{D} \) (which is non-empty, since \( |\mathcal{B}_2| > m \)) is a linear combination of the \( m \) vectors of \( \mathcal{D} \), which contradicts the linear independence of \( \mathcal{B}_2 \). A contradiction.
Finally, we have to settle the case where \( |\mathcal{B}_1| \) is infinite, but still smaller than \( |\mathcal{B}_2| \).
Now, each \( \vec{u} \in \mathcal{B}_1 \) can be represented as a linear combination of finitely-many elements of \( \mathcal{B}_2 \). By standard cardinal arithmetic, the union of \( |\mathcal{B}_1| \)-many finite sets has cardinality at most \( |\mathcal{B}_1| \), so the vectors used in all of those linear combinations can’t be all of \( \mathcal{B}_2 \). Therefore, we can take a vector (we’ll call it \( \vec{x} \)) of \( \mathcal{B}_2 \) which was not used in any of them. Since \( \mathcal{B}_1 \) is a basis, we can express \( \vec{x} \) as a finite linear combination of vectors of \( \mathcal{B}_1 \), and then (similarly to what we did in the finite case) express each of those vectors of \( \mathcal{B}_1 \) as a finite linear combination of vectors from \( \mathcal{B}_2 \). By construction, \( \vec{x} \) itself does not appear in the result, so we’ve expressed \( \vec{x} \) as a finite linear combination of the other vectors in \( \mathcal{B}_2 \). This is impossible, though, since \( \mathcal{B}_2 \) was supposed to be a basis. Another contradiction.
QED
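For a small numerical instance of the bookkeeping in the finite case: in \( \mathbb{R}^2 \), take \( \vec{u}_1 = (1, 0) \), \( \vec{u}_2 = (0, 1) \), \( \vec{v}_1 = (1, 1) \), and \( \vec{v}_2 = (1, 2) \), so that \( \vec{v}_1 = \vec{u}_1 + \vec{u}_2 \) and \( \vec{v}_2 = \vec{u}_1 + 2\vec{u}_2 \). Subtracting row 1 from row 2 gives \( \vec{u}_2 = \vec{v}_2 - \vec{v}_1 \), and subtracting that from row 1 gives \( \vec{u}_1 = 2\vec{v}_1 - \vec{v}_2 \): each element of \( \mathcal{B}_1 \) ends up expressed as a linear combination of the vectors to the right of the bar.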
Lemma: Tower Law for Vector Spaces
Let \(\mathbb{F}\) be a field, and
\[ \mathcal{V} \subseteq \mathcal{W} \]
be vector spaces over \( \mathbb{F} \), where \( \mathcal{V} \) is also a field and \( \mathcal{W} \) is also a vector space over \( \mathcal{V} \). Then
\[ \text{dim}(\mathcal{W}/\mathbb{F}) = \text{dim}(\mathcal{W}/\mathcal{V}) \cdot \text{dim}(\mathcal{V}/\mathbb{F}) \]
Proof
Let \( \mathcal{B}_1 \) be a basis for \( \mathcal{V} \) as a vector space over \( \mathbb{F} \), and \( \mathcal{B}_2 \) be a basis for \( \mathcal{W} \) as a vector space over \( \mathcal{V} \). Since \( \mathcal{V} \) is a field and \( \mathcal{W} \) is a vector space over \( \mathcal{V} \), we can take the product of any \( \vec{v} \in \mathcal{V} \) with any \( \vec{w} \in \mathcal{W} \), and the result is a vector in \( \mathcal{W} \) (do not confuse this product with the dot product!). We want to show that
$$ \mathcal{B}_3 := \{ \vec{v} \vec{w} | \vec{v} \in \mathcal{B}_1, \vec{w} \in \mathcal{B}_2 \} $$
is a basis for \( \mathcal{W} \) over \( \mathbb{F} \).
First we show that \( \text{Span}(\mathcal{B}_3) = \mathcal{W} \). Let \( \vec{u} \) be an arbitrary vector in \( \mathcal{W} \). Then, since \( \mathcal{B}_2 \) is a basis for \( \mathcal{W} \) over the field \( \mathcal{V} \), we have
$$ \vec{u} = \sum\limits_{i = 1}^{n_{\vec{u}}} c_i \vec{v}_{\beta_{i}} $$
where the \( c_i \) are elements of \( \mathcal{V} \) (they are vectors!), and \( \vec{v}_{\beta_i} \in \mathcal{B}_2 \).
Now, since \( \mathcal{B}_1 \) is a basis for \( \mathcal{V} \) over the field \( \mathbb{F} \), we have for each \( c_i \)
$$ c_i = \sum\limits_{j = 1}^{n_{c_i}} d_j \vec{x}_{\alpha_j} $$
where \( d_j \in \mathbb{F}\) and \( \vec{x}_{\alpha_j} \in \mathcal{B}_1 \) (both depending on \( i \)).
Thus,
$$ \begin{align} \vec{u} &= \sum\limits_{i = 1}^{n_{\vec{u}}} \left ( \sum\limits_{j = 1}^{n_{c_i}} d_j \vec{x}_{{\alpha}_j} \right) \vec{v}_{\beta_i} \\ &= \sum\limits_{i = 1}^{n_{\vec{u}}} \sum\limits_{j = 1}^{n_{c_i}} { d_j \vec{x}_{\alpha_j} \vec{v}_{\beta_i} } \\ \end{align} $$
where \( \vec{x}_{\alpha_j} \in \mathcal{B}_1 \) and \( \vec{v}_{\beta_{i}} \in \mathcal{B}_2 \). Thus \( \text{Span}(\mathcal{B}_3) = \mathcal{W} \).
The second, and last thing we have to show is that \( \mathcal{B}_3 \) is linearly-independent. I.e.
$$ \begin{align} \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{m_i} d_{i,j} \vec{x}_{\alpha_j} \vec{v}_{\beta_i} = \vec{0} \\ \implies d_{i,j} = 0, \text{ for all } i,j \\ \end{align} $$
Suppose we have
$$ \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{m_i} d_{i,j} \vec{x}_{\alpha_j} \vec{v}_{\beta_i} = \vec{0} $$
Then, since \( \mathcal{B}_2 \) is a basis for \( \mathcal{W} \) over \( \mathcal{V} \), and each inner sum \( \sum_{j = 1}^{m_i} d_{i,j} \vec{x}_{\alpha_j} \) is an element of \( \mathcal{V} \), the coefficient (in \( \mathcal{V} \)) of each \( \vec{v}_{\beta_i} \) must be \( 0 \in \mathcal{V} \). Since \( \mathcal{V} \) is a vector space (as well as a field), we can also write “\( \vec{0} \)” for the “\( 0 \)” element. Then,
$$ \sum\limits_{j = 1}^{m_i} d_{i,j} \vec{x}_{\alpha_j} = \vec{0} $$
for each \(i\).
Similarly, since \( \mathcal{B}_1 \) is a basis for \( \mathcal{V} \) as a vector space over \( \mathbb{F} \), we must have \(0\)’s for the coefficient of all of the \( \vec{x}_{\alpha_j} \)’s, so each \( d_{i,j} = 0 \).
QED
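For example, take \( \mathbb{F} = \mathbb{Q} \), \( \mathcal{V} = \mathbb{Q}(\sqrt{2}) \), and \( \mathcal{W} = \mathbb{Q}(\sqrt{2}, \sqrt{3}) \). Then \( \mathcal{B}_1 = \{ 1, \sqrt{2} \} \), \( \mathcal{B}_2 = \{ 1, \sqrt{3} \} \), and the basis of products is \( \mathcal{B}_3 = \{ 1, \sqrt{2}, \sqrt{3}, \sqrt{6} \} \), so \( \text{dim}(\mathcal{W}/\mathbb{Q}) = 2 \cdot 2 = 4 \).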
Definition: Field Extension
Let \( \mathbb{E} \) and \( \mathbb{F} \) be fields, such that \( \mathbb{F} \subseteq \mathbb{E} \), then \( \mathbb{E} \) is said to be a field extension of \( \mathbb{F} \).
Definition: Degree of Field Extension
Let \( \mathbb{F} \subseteq \mathbb{E} \) be fields. \( \mathbb{E} \) is a vector space over \(\mathbb{F}\), and we define the degree of the field extension,
\[ [\mathbb{E}: \mathbb{F}] := \text{dim}(\mathbb{E}/\mathbb{F})\]
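For example, \( [\mathbb{Q}(i) : \mathbb{Q}] = 2 \), since \( \{ 1, i \} \) is a basis for \( \mathbb{Q}(i) \) over \( \mathbb{Q} \); similarly \( [\mathbb{C} : \mathbb{R}] = 2 \), while \( [\mathbb{R} : \mathbb{Q}] \) is infinite.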
Corollary: Tower Law for Field Extensions
Let
\[ \mathbb{F} \subseteq \mathbb{E} \subseteq \mathbb{D} \]
be fields. Then,
\[ [\mathbb{D}:\mathbb{F}] = [\mathbb{D}:\mathbb{E}] \cdot [\mathbb{E}:\mathbb{F}] \]
Proof
This is a direct consequence of the Tower Law for Vector Spaces, taking \( \mathcal{V} = \mathbb{E} \) and \( \mathcal{W} = \mathbb{D} \) as vector spaces over \( \mathbb{F} \).
QED
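For example, with \( \mathbb{Q} \subseteq \mathbb{Q}(\sqrt{2}) \subseteq \mathbb{Q}(\sqrt[4]{2}) \) (note \( (\sqrt[4]{2})^2 = \sqrt{2} \)), we get \( [\mathbb{Q}(\sqrt[4]{2}) : \mathbb{Q}] = [\mathbb{Q}(\sqrt[4]{2}) : \mathbb{Q}(\sqrt{2})] \cdot [\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = 2 \cdot 2 = 4 \).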
Definition: Polynomial
We’re sure you’ve heard of them, and know what they are, but we have to give a definition before we can use them.
Let \( R \) be a commutative ring, and \( x \) be a variable, then sums of the form
\[ \sum\limits_{i=0}^{n} r_ix^i \]
where \( r_i \in R \) and \( r_n \neq 0 \) are called “polynomials”. In this sum, we stipulate that \( x^0 = 1 \), always (even when we substitute \( 0 \) for \( x \)).
Note: In the case of the zero polynomial, we have \( r_0 = 0 \). That is the only case where the leading coefficient is allowed to be zero.
Definition: Degree of Polynomial
Let
\[ p(x) = \sum\limits_{i = 0}^{n} r_ix^i \]
be a polynomial, then the degree of \( p(x) \) is equal to \( n \). We sometimes write this as “\( \text{deg}(p) = n \)”.
However if \( p(x) = 0 \), then we define \( \text{deg}(p) = -\infty \), where:
- \( -\infty < n \), for all \( n \in \mathbb{Z} \).
- \( -\infty + n = -\infty \) for all \( n \in \mathbb{Z} \cup \{ -\infty \} \).
- \( -\infty \cdot n = -\infty \), for all \( n \in \mathbb{Z} \cup \{ -\infty \} \) (note that this means that in this case we’re saying that a negative times a negative is still negative).
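These conventions are easy to mirror in code. Here is a minimal Python sketch (representing a polynomial as a list of coefficients, constant term first, a representation we pick purely for illustration); Python’s `float('-inf')` already satisfies the first two rules, though not the third, since it follows the usual sign rules under multiplication:

```python
NEG_INF = float('-inf')   # stands in for deg(0) = -infinity

def deg(p):
    """Degree of a polynomial given as a coefficient list, constant
    term first; the zero polynomial gets degree -infinity."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return NEG_INF

assert deg([0, 0]) == NEG_INF       # the zero polynomial
assert deg([1, 0, 3]) == 2          # 1 + 3x^2
assert deg([0]) < deg([5])          # -infinity < 0
assert NEG_INF + 7 == NEG_INF       # -infinity + n = -infinity
```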
Definition: Arithmetic with Polynomials
Let \( p(x) = \sum\limits_{i = 0}^{n} r_i x^i\) and \( q(x) = \sum\limits_{j=0}^{m} s_j x^j \), and suppose (without loss of generality) that \( m \leq n \). Then,
Addition:
$$ \begin{align} p(x) + q(x) &:= \sum\limits_{i = 0}^{m} (r_i + s_i)x^i \\ &+ \sum\limits_{i = m + 1}^{n} r_i x^i \\ \end{align} $$
where, if \( m = n \), then \( \displaystyle \sum\limits_{i = m+1}^{n} r_ix^i = 0 \)
Multiplication:
$$ p(x)q(x) := \sum\limits_{k = 0}^{m + n} \left( \sum\limits_{i + j = k} r_i s_j \right ) x^{k} $$
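As a sanity check on these formulas, here is a short Python sketch of both operations on the same coefficient-list representation used in the sketch above (lists with constant term first; trailing zero coefficients are allowed and harmless here):

```python
def poly_add(p, q):
    """Coefficient-wise addition, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """The convolution formula: the coefficient of x^k is the sum
    of r_i * s_j over all i + j = k."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (1 + x)(2 + x) = 2 + 3x + x^2
assert poly_mul([1, 1], [2, 1]) == [2, 3, 1]
# (1 + x) + (2 - x) = 3: the leading terms cancel.
assert poly_add([1, 1], [2, -1]) == [3, 0]
```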
Now we’ll show that the set of all polynomials over a commutative ring is itself a commutative ring.
Lemma: Polynomials form a ring
Let \( R \) be a commutative ring. Then the set of polynomials with coefficients in \( R \) is a commutative ring.
Proof
We check the axioms directly from the definitions above. Addition is coefficient-wise, so commutativity, associativity, the additive identity (the zero polynomial), and additive inverses (negate each coefficient) all follow from the corresponding properties of \( R \). For multiplication, the constant polynomial \( 1 \) is the multiplicative identity, and commutativity, associativity, and distributivity over addition follow by expanding the formula above and using the corresponding properties of \( R \). We leave the (somewhat tedious) bookkeeping to the reader.
QED
Definition: Ring of Polynomials
Let \( R \) be a commutative ring, then the ring of polynomials with coefficients in \(R\) is denoted:
“\( \displaystyle R[x] \)” (sometimes “\( R[X] \)”)
Mini-lemma: Degree of sum and product
Let \( R \) be a commutative integral domain, and let \( p(x), q(x) \in R[x] \) such that \( \text{deg}(p) = n \) and \( \text{deg}(q) = m \), then
- \(\displaystyle \text{deg}(p + q) \leq \max(\text{deg}(p), \text{deg}(q)) \)
- \(\displaystyle \text{deg}(pq) = \text{deg}(p) + \text{deg}(q) \)
Proof
The proof is immediate from the definitions given above. The one thing to note is the inequality in the first item: it can be strict, because leading terms can cancel. If the degrees are equal, so that we have
$$ \begin{align} p(x) &= \sum\limits_{i = 0}^{n} r_i x^i \\ q(x) &= \sum\limits_{i = 0}^{n} s_i x^i \\ \end{align} $$
and \( r_n = -s_n \), then
$$ p(x) + q(x) = \sum\limits_{i = 0}^{n-1} (r_i + s_i) x^i $$
The case where one or both of the polynomials is equal to \(0\) is not difficult, and is left to the reader.
QED
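Note that the integral-domain hypothesis is what makes the second item an equality. Over \( \mathbb{Z}/6\mathbb{Z} \), where \( 2 \cdot 3 = 0 \), we have \( (2x + 1)(3x + 1) = 6x^2 + 5x + 1 = 5x + 1 \), so the product of two degree-\(1\) polynomials has degree \( 1 \), not \( 2 \).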
Mini-Lemma: Integral Domain implies Polynomials are Integral Domain
Let \( R \) be a commutative ring which is an integral domain, then the ring of polynomials over \(R\), \( R[x] \) is also a commutative integral domain.
Proof
Let \( p(x) \neq 0 \) and \( q(x) \neq 0 \) both be elements of \( R[x] \), then
$$ \begin{align} p(x) &= \sum\limits_{i = 0}^{n} r_i x^i \\ q(x) &= \sum\limits_{i = 0}^{m} s_i x^i \\ \end{align} $$
Since neither polynomial is the zero polynomial, we know that \( r_n \neq 0 \) and \( s_m \neq 0 \). Because \( R \) is an integral domain, \( r_ns_m \), the leading coefficient of \( p(x)q(x) \), is not equal to \(0\), so \( p(x)q(x) \neq 0 \). Hence \( R[x] \) has no zero divisors, and is an integral domain.
QED