Galois Theory 1: Prelude to the Nonexistence of the General Quintic


Tu prieras publiquement Jacobi ou Gauss de donner leur avis, non sur la vérité, mais sur l'importance des théorèmes.

Après cela, il y aura, j'espère, des gens qui trouveront leur profit à déchiffrer tout ce gâchis.

– E. Galois. Le 29 mai, 18321

(Ask Jacobi or Gauss publicly to give their opinion, not as to the truth, but as to the importance of these theorems. Later there will be, I hope, some people who will find it to their advantage to decipher all this mess.)

Introduction

I love stumbling upon these sorts of mathematical Lacanian diagrams in the wild, be it on Twitter, Wikipedia, or other, dustier corners of the internet. It's like glimpsing arcana, or an overly complicated diagram of an extinct theology, a crackhead's projection of metaphysical laws scribbled in soot.

One such diagram in a now-elusive Twitter thread about Galois Theory struck my interest. This post recapitulates and ELI25s some of the cooler proofs from two books of the same name, Galois Theory by Emil Artin2 and Harold M. Edwards3. The overarching goal for this mini-series of posts is to gain a better understanding of the titular Galois Theory, which I first encountered in my studies of error-correcting codes.4,5,6 The immediate aim of this post is to gain a better understanding of polynomials and the useful manipulations within polynomial expressions which provide the basis for Galois Theory.

Having completed this post and circled back to try to provide some sort of contextualization for it (it belongs more in the category of notes-to-self than fun-to-read), I was pleased by the number of applications of the forthcoming concepts which I'd already touched upon in other posts, such as intuitive understanding of some of the representations of tensors7 and group theory8 in general.

Chief Keef was 16 when he dropped Love Sosa, what have you done?

Évariste Galois is one of those few people who can stand toe to toe with Chief Keef. A Frenchman and a budding savant, Galois juggled political activism leading up to the second French Revolution with his own mathematical publications, many of which were rejected from publication by his senior contemporaries such as Cauchy and Poisson for being "incomprehensible." He was denied entry to the prestigious École Polytechnique more than once due to the examiners' inability to follow his reasoning during oral examinations. He matriculated at a lesser institution, the École Normale, from which he was eventually expelled for his scathing remarks targeting the school's director, who had physically barred him from joining the July Revolution of 1830 where his compatriots were protesting. In and out of prison for the following months for his je ne sais quoi activism, Évariste continued to develop his ideas, taking (no matter how poorly or violently) the feedback provided by his erstwhile academic mentors and publishing his articles privately through his friend Auguste Chevalier. He was released from prison on April 29, 1832, and died in a duel one month later at the age of 20.

His seminal paper on the eponymous theory, Mémoire sur les conditions de résolubilité des équations par radicaux, which was initially rejected, was published posthumously fourteen years after his death, answering the question of the insolvability of the general quintic, which had been an open question in mathematics for over 350 years.9,10

The Root of the Problem

When asked to find the roots of a polynomial expression such as

$$2x^2 + 3x - 5$$

most folks know a few ways to come up with the two solutions:

$$\begin{aligned} \alpha_{1,2} &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \\ &= \frac{-3 \pm \sqrt{3^2 - 4(2)(-5)}}{2(2)} \\ &= \frac{-3 \pm 7}{4} = 1, -\frac{5}{2} \end{aligned}$$
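Since $c = -5$, the $-4ac$ term under the radical is positive and the discriminant works out to a perfect square. A quick sanity check in Python:

```python
import math

# Coefficients of 2x^2 + 3x - 5
a, b, c = 2, 3, -5

disc = b**2 - 4*a*c                  # 9 - 4*2*(-5) = 49
r1 = (-b + math.sqrt(disc)) / (2*a)  # "plus" branch
r2 = (-b - math.sqrt(disc)) / (2*a)  # "minus" branch
print(r1, r2)  # 1.0 -2.5
```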

Elementary schoolers quake at the foreboding and painful formula, and the curious twenty-year-old wonders about its general form for higher order polynomials.

By the time Évariste Galois took up this question in the early 1800s, general formulae for polynomials of degree 3 and 4 were already known, but they weren't well understood (at least, not to the degree (no pun intended) that Galois would theorize about), and it was only suspected that no general formula for quintic, that is degree 5, polynomials existed.

But where does one even begin to try to prove or disprove the existence of such a formula? Before we try to digest Galois Theory proper, which proves the nonexistence of a general quintic formula, let's work our way up.

Fundamental Theorem of Symmetric Polynomials

Consider the following cubic polynomial:

$$f(x) = x^3 + 2x^2 + 3x - 5$$
The Fundamental Theorem of Algebra

every non-zero, single-variable, degree $n$ polynomial with complex coefficients has, counted with multiplicity, exactly $n$ complex roots.

tells us that $f(x)$ has exactly three roots over the complex numbers: $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{C}$. Knowing this, we can express our polynomial as a product of linear factors:

$$\begin{aligned} f(x) &= x^3 + 2x^2 + 3x - 5 \\ &= (x - \alpha_1)(x - \alpha_2)(x - \alpha_3) \\ &= x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3 \end{aligned}$$

Expanding the linear product, we get an expression in which the coefficients of our polynomial are now in terms of our roots. To make this relationship more explicit, we can refer back to the original form of our polynomial and describe the roots as a system of equations:

$$\begin{aligned} f(x) &= x^3 + 2x^2 + 3x - 5 \\ &= x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3 \\ \\ -(\alpha_1 + \alpha_2 + \alpha_3) &= \color{purple}2\color{black} \\ (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3) &= \color{purple}3\color{black} \\ -\alpha_1\alpha_2\alpha_3 &= \color{purple}-5\color{black} \end{aligned}$$

and we can move all the negatives to the righthand side for convenience:

$$\begin{aligned} \alpha_1 + \alpha_2 + \alpha_3 &= \color{purple}-2\color{black} \\ \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 &= \color{purple}3\color{black} \\ \alpha_1\alpha_2\alpha_3 &= \color{purple}5\color{black} \end{aligned}$$
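We can sanity-check these relations numerically. This sketch uses numpy.roots to approximate the three (generally complex) roots of $f$ from its coefficients and then evaluates the three symmetric combinations above:

```python
import numpy as np

# f(x) = x^3 + 2x^2 + 3x - 5, coefficients highest degree first
a1, a2, a3 = np.roots([1, 2, 3, -5])

e1 = a1 + a2 + a3            # expect -2
e2 = a1*a2 + a1*a3 + a2*a3   # expect  3
e3 = a1*a2*a3                # expect  5
```

The individual roots involve one real value and a conjugate pair, but the symmetric combinations land (up to floating-point noise) on the integer coefficients.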

The Fundamental Theorem of Symmetric Polynomials starts from the question: how much information can we derive from these relations alone? What other meaningful expressions in terms of the roots might fall into the same subset of known, and incidentally useful, values?

To answer these questions, we need a bit more language to describe what it is that we're working with.

Definitions

A Polynomial is any expression that can be constructed from variables and constants, joined by addition and multiplication.

Note that when discussing polynomials, we reserve the symbol $\alpha$ for fixed values which we don't know, rather than the Latin letters such as $x, y, z$ which we use for variables. Note the subtle but crucial distinction between the two equations:

$$\begin{aligned} \alpha_1 + \alpha_2 + \alpha_3 &= -2 \\ x + y + z &= -2 \end{aligned}$$

The former equation in $\alpha$ implies that 3 given, unknown complex numbers sum to $-2$, which is readily conceivable. The latter equation in $x, y, z$ implies that any combination of values for $x, y, z$ sums to $-2$, which is not a true statement given the constraints, or lack thereof, that we've put on that class of variables.

A Symmetric Polynomial is one whose value doesn't change when we permute its variables. For example, the polynomial $x^2 + y^2 + z^2$ is said to be symmetric in $x, y, z$, whereas the expression $x - y$ is not symmetric in its variables $x$ and $y$ since $x - y \neq y - x$.

Symmetry is the cornerstone upon which Galois Theory is built, and even prior to fully fledged examination of field theory, we will make use of the properties of symmetric polynomials for several other proofs.

Elementary Symmetric Polynomials in $n$ variables are a special collection of polynomials whereby the first element is defined by the sum of all variables, the 2nd element by the sum of all products of pairs of variables, the 3rd by the sum of all products of triples, and so on, such that the $n$th elementary polynomial is simply the product of all $n$ variables:

$$\begin{aligned} e_1(x_1, ..., x_n) &= x_1 + x_2 + x_3 + ... + x_n \\ \\ e_2(x_1, ..., x_n) &= x_1x_2 + x_1x_3 + ... = \sum_{i < j} x_i x_j \\ \\ e_3(x_1, ..., x_n) &= x_1x_2x_3 + x_1x_2x_4 + ... = \sum_{i < j < k} x_i x_j x_k \\ &\;\vdots \\ e_n(x_1, ..., x_n) &= x_1x_2x_3 \cdots x_n \end{aligned}$$

These elementary symmetric polynomials encode the pattern by which the roots of an arbitrary polynomial are related to its coefficients. Note that our known expressions in the roots

$$\begin{aligned} &\alpha_1 + \alpha_2 + \alpha_3, \\ &\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3, \\ &\alpha_1\alpha_2\alpha_3 \end{aligned}$$

are the elementary symmetric polynomials in three variables.
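The defining pattern, a sum over all products of $k$ distinct variables, is mechanical enough to write down directly. A small sketch (the function name is mine) that evaluates $e_k$ at concrete values via itertools.combinations:

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """e_k evaluated at the given values: the sum over all
    products of k distinct entries."""
    return sum(prod(c) for c in combinations(values, k))

# For values 1, 2, 3: e_1 = 6, e_2 = 2+3+6 = 11, e_3 = 6
evals = [elementary_symmetric([1, 2, 3], k) for k in (1, 2, 3)]
print(evals)  # [6, 11, 6]
```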

Theorem 1: Newton's Theorem

Returning to the set of "known" relations of roots and coefficients, consider the claim

$$\alpha_1^3 + \alpha_2^3 + \alpha_3^3 = 25$$

At first, this may seem like a suspiciously lofty assertion, but as we'll see shortly, it is possible to compute the value of any expression that is a sum of powers of our roots. That is, any polynomial of the form:

$$\alpha_1^k + \alpha_2^k + \alpha_3^k$$

This stems from Newton's Theorem, which states that any power sum polynomial in $n$ variables can be expressed using elementary symmetric polynomials. For example, the power sum polynomial

$$\color{red}x\color{black}^3 + \color{green}y\color{black}^3 + \color{blue}z\color{black}^3$$

can be written as

$$\begin{aligned} &= (\color{red}x\color{black} + \color{green}y\color{black} + \color{blue}z\color{black})^3 - 3(\color{red}x\color{black} + \color{green}y\color{black} + \color{blue}z\color{black})(\color{red}x\color{black}\color{green}y\color{black} + \color{red}x\color{black}\color{blue}z\color{black} + \color{green}y\color{black}\color{blue}z\color{black}) + 3(\color{red}x\color{black}\color{green}y\color{black}\color{blue}z\color{black}) \\ &= e_1^3 - 3e_1e_2 + 3e_3 \end{aligned}$$
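This identity can be confirmed by brute-force expansion; here's a check with sympy:

```python
from sympy import symbols, expand

x, y, z = symbols('x y z')
e1 = x + y + z
e2 = x*y + x*z + y*z
e3 = x*y*z

power_sum = x**3 + y**3 + z**3
identity = e1**3 - 3*e1*e2 + 3*e3

# The difference expands to 0, so the two expressions are identical
print(expand(identity - power_sum))  # 0
```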

We can make use of this relation by substituting the values of our unknown roots $\alpha_1, \alpha_2, \alpha_3$ for our formula variables $x, y, z$:

$$\begin{aligned} &= (\alpha_1 + \alpha_2 + \alpha_3)^3 - 3(\alpha_1 + \alpha_2 + \alpha_3)(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3) + 3(\alpha_1\alpha_2\alpha_3) \\ &= e_1^3 - 3e_1e_2 + 3e_3 \end{aligned}$$

and, finally, substituting the roots into the elementary symmetric polynomials gives us values that we do know:

$$\begin{aligned} &= e_1^3 - 3e_1e_2 + 3e_3 \\ &= (\color{purple}-2\color{black})^3 - 3(\color{purple}-2\color{black})(\color{purple}3\color{black}) + 3(\color{purple}5\color{black}) \\ &= 25 \end{aligned}$$
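We can double-check the claim against the numerically computed roots of our cubic:

```python
import numpy as np

roots = np.roots([1, 2, 3, -5])  # roots of x^3 + 2x^2 + 3x - 5
p3 = sum(r**3 for r in roots)    # sum of cubes of the roots
print(round(p3.real, 6))  # 25.0
```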

Next we'll prove that this theorem holds for all polynomials in three variables using a method which works for polynomials in an arbitrary number of variables.

Newton's Theorem Proof

Claim: Any $k$th power sum polynomial in $x, y, z$ can be expressed using elementary symmetric polynomials.

We'll show this visually by constructing a table $T$ whose columns span the elementary symmetric polynomials in $n = 3$ variables and whose rows enumerate the sums of ascending powers:

| | | |
|:---|:---|:---|
| $x + y + z$ | $xy + xz + yz$ | $xyz$ |
| $x^2 + y^2 + z^2$ | $x^2y + x^2z + y^2x + y^2z + z^2x + z^2y$ | $x^2yz + y^2xz + z^2xy$ |
| $x^3 + y^3 + z^3$ | $x^3y + x^3z + y^3x + y^3z + z^3x + z^3y$ | $x^3yz + y^3xz + z^3xy$ |
| $x^4 + y^4 + z^4$ | $x^4y + x^4z + y^4x + y^4z + z^4x + z^4y$ | $x^4yz + y^4xz + z^4xy$ |
| $\vdots$ | $\vdots$ | $\vdots$ |

Note that along the vertical dimension, the power of the first variable increases, and along the horizontal dimension, the length of the individual terms increases from one to $n$. Our goal is to show that any entry in the 1st column, containing the power sum polynomials, can be expressed using entries in the 1st row, which contains the elementary symmetric polynomials.

If we only need the first row and only care about the first column, why then fill out the rest of the table, you might ask? Astute observation, dear reader; from these auxiliary entries we can derive the following fact about the whole table: any table entry in row $i > 1$ can be algebraically expressed in terms of the rows $1, .., i-1$ above it.

If we show this fact to be true, we'll complete the whole proof. Suppose we take the power sum polynomial in the 5th row. According to our proposition, it can be expressed using entries from any of the first 4 rows, which in turn can be expressed using entries from the first three rows, and so on. We regress towards a base case, which is that entries in the 2nd row can be expressed in terms of only the first row, and therefore the 5th row, too, can be expressed in terms of entries from the first row, which are the elementary symmetric polynomials.

To show that the claim holds, we take any row $i > 1$ and any column $j$ other than the last column, s.t. $j \neq n$. Now we will show that the product of the $i$th power sum polynomial $T[i, 1]$ and the $j$th elementary symmetric polynomial $T[1, j]$ is equivalent to the sum of the entries with table indices $(i + 1, j)$ and $(i, j + 1)$:

$$T[1, j] \times T[i, 1] = T[i + 1, j] + T[i, j + 1]$$

We can visualize the relationship between these entries by masking the rest of the table:

| | 1 | $j =$ 2 | 3 |
|:---|:---|:---|:---|
| 1 | | $T[1, j]$ | |
| 2 | | | |
| 3 | | | |
| $i =$ 4 | $T[i, 1]$ | | $T[i, j + 1]$ |
| 5 | | $T[i + 1, j]$ | |

Expanding this product, we see that it is the sum of all possible products of a single variable raised to the power of $i$ times a term that contains $j$ distinct variables:

$$\begin{aligned} T[1, 2] \times T[4, 1] &= T[5, 2] + T[4, 3] \\ &= (x^{\overbrace{4}^{i=4}} + y^4 + z^4)(\overbrace{xy}^{j=2} + xz + yz) \\ &= x^5y + x^5z + x^4yz + y^5x + y^4xz + y^5z + z^4xy + z^5x + z^5y \end{aligned}$$

For each term in the resultant product, one of two things happens:

  1. If the variable raised to the $i$th power appears in the term of length $j$, then their product remains length $j$ and the exponent of the distinguished variable is incremented, e.g. $x^i \cdot xz = x^{i+1}z$
  2. Otherwise, if the variable raised to the power of $i$ does not appear in the term of length $j$, then the length of the product is incremented, e.g. $x^i \cdot \underbrace{yz}_{j} = \underbrace{x^iyz}_{j+1}$

This pattern defines all terms that might appear in the entries $T[i + 1, j]$ and $T[i, j + 1]$, and if we rearrange terms in the expansion according to which case they fall into, this becomes obvious:

$$\begin{aligned} &= x^5y + x^5z + x^4yz + y^5x + y^4xz + y^5z + z^4xy + z^5x + z^5y \\ &= (x^5y + x^5z + y^5x + y^5z + z^5x + z^5y) + (x^4yz + y^4xz + z^4xy) \end{aligned}$$
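sympy confirms this instance of the identity by expanding both sides:

```python
from sympy import symbols, expand

x, y, z = symbols('x y z')

T_1_2 = x*y + x*z + y*z     # elementary symmetric, j = 2
T_4_1 = x**4 + y**4 + z**4  # 4th power sum, i = 4
T_5_2 = x**5*y + x**5*z + y**5*x + y**5*z + z**5*x + z**5*y
T_4_3 = x**4*y*z + y**4*x*z + z**4*x*y

print(expand(T_1_2 * T_4_1 - (T_5_2 + T_4_3)))  # 0
```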

Furthermore, we can slide our variables along the identity we've outlined for $i > 1, j \neq n$ to show that any table entry $T[i, j]$ can be expressed using entries in the rows above it, with some minor technical adjustments for the corner cases $i = 1, 2; j = n$:

$$T[i, j] = (T[i - 1, 1] \times T[1, j]) - T[i - 1, j + 1]$$

The corner cases are technical and don't illustrate anything else fascinating about the relationship between symmetric polynomials and roots, so I'll skip them; see Artin2 or Edwards3 for complete proofs.


Having proved Newton's Theorem, we return to our subset of "known" values. From the coefficients of a polynomial, we can compute the value of any sum of powers of its roots. The computation suggested by our tabular approach above is quite awful, since we might have to page through many entries, but it's sufficient for the time being that such a mechanism exists. This conclusion expands our domain of "known" values from the elementary symmetric polynomials in the roots to all power sums of the roots, not just the expressions that appear directly as coefficients!
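As an aside, there is a more direct route to the same values than paging through the table: the standard recurrence known as Newton's identities. This sketch (not the tabular method from the proof; the function name is mine) hard-codes the three-variable case:

```python
def power_sums(e1, e2, e3, k_max):
    """Power sums p_k of three roots, computed from the elementary
    symmetric values e1, e2, e3 via Newton's identities."""
    p = {1: e1,                      # p1 = e1
         2: e1**2 - 2*e2,            # p2 = e1*p1 - 2*e2
         3: e1**3 - 3*e1*e2 + 3*e3}  # p3 = e1*p2 - e2*p1 + 3*e3
    for k in range(4, k_max + 1):
        p[k] = e1*p[k-1] - e2*p[k-2] + e3*p[k-3]
    return p

# e1, e2, e3 for f(x) = x^3 + 2x^2 + 3x - 5
p = power_sums(-2, 3, 5, 5)
print(p[3])  # 25
```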

Theorem 2: Fundamental Theorem on Symmetric Polynomials

Consider another far-out claim that

$$\alpha_1^2(\alpha_2 + \alpha_3) + \alpha_2^2(\alpha_1 + \alpha_3) + \alpha_3^2(\alpha_1 + \alpha_2) = -21$$

We can derive this from the Fundamental Theorem of Symmetric Polynomials by substituting roots into the general formula which holds for all polynomial variables.

The theorem we will prove states that any symmetric polynomial can be expressed using power sum polynomials. And, since we just showed that any power sum polynomial can be expressed using elementary symmetric polynomials per Newton's Theorem, this amounts to saying that any symmetric polynomial can be expressed using elementary symmetric polynomials.

For example, the following symmetric polynomial

$$\color{red}x\color{black}^2(\color{green}y\color{black} + \color{blue}z\color{black}) + \color{green}y\color{black}^2(\color{red}x\color{black} + \color{blue}z\color{black}) + \color{blue}z\color{black}^2(\color{red}x\color{black} + \color{green}y\color{black})$$

can be expressed in terms of the following combination of power sum polynomials, which we denote $s_k$ where $k$ is the power:

$$\begin{aligned} &= (\color{red}x\color{black} + \color{green}y\color{black} + \color{blue}z\color{black})(\color{red}x\color{black}^2 + \color{green}y\color{black}^2 + \color{blue}z\color{black}^2) - (\color{red}x\color{black}^3 + \color{green}y\color{black}^3 + \color{blue}z\color{black}^3) \\ &= s_1s_2 - s_3 \end{aligned}$$

substituting our $\alpha$ roots into the variables, we get:

$$\begin{aligned} &= \alpha_1^2(\alpha_2 + \alpha_3) + \alpha_2^2(\alpha_1 + \alpha_3) + \alpha_3^2(\alpha_1 + \alpha_2) \\ &= (\alpha_1 + \alpha_2 + \alpha_3)(\alpha_1^2 + \alpha_2^2 + \alpha_3^2) - (\alpha_1^3 + \alpha_2^3 + \alpha_3^3) \\ &= (\color{purple}-2\color{black})(\color{purple}-2\color{black}) - (\color{purple}25\color{black}) \\ &= -21 \end{aligned}$$
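As before, we can check this against numerically computed roots:

```python
import numpy as np

a1, a2, a3 = np.roots([1, 2, 3, -5])  # roots of x^3 + 2x^2 + 3x - 5
val = a1**2*(a2 + a3) + a2**2*(a1 + a3) + a3**2*(a1 + a2)
print(round(val.real, 6))  # -21.0
```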

and again, we can prove that this works for any symmetric polynomial. Unsurprisingly, since this is the Fundamental Theorem on Symmetric Polynomials, the proof relies on the symmetry of the polynomials.

Fundamental Theorem on Symmetric Polynomials Proof

We begin the proof by induction on the number of variables nn. First, we'll show that the theorem is true for all single-variable symmetric polynomials, then we'll show that if the statement is true for all polynomials in n1n-1 variables, then it is also true for all polynomials in nn variables.

Formally, this first claim is that any polynomial $f(x)$ can be expressed using power sums in $x$. This is true by definition, since power sums of a single variable are simply the powers of that variable, and any polynomial in a single variable is expressed in terms of powers of that variable. Easy.

Next, for the non-trivial inductive step, to prove that if the statement about polynomials in $n-1$ variables is true then it also holds for polynomials in $n$ variables, we take any symmetric polynomial $f(x_1, ..., x_n)$ and must show that it can be expressed using power sums of the variables $x_1, ..., x_n$:

$$s_k = x_1^k + x_2^k + ... + x_n^k$$

We'll describe the application of a general procedure on a concrete polynomial to illustrate how it works, noting that it must work for any polynomial. I'll re-use the familiar symmetric polynomial

$$f(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) = \color{red}x\color{black}^2(\color{green}y\color{black} + \color{blue}z\color{black}) + \color{green}y\color{black}^2(\color{red}x\color{black} + \color{blue}z\color{black}) + \color{blue}z\color{black}^2(\color{red}x\color{black} + \color{green}y\color{black})$$

First, we choose an arbitrary variable in our polynomial and rearrange it such that the other terms are coefficients of that variable. For example, if we select the variable $x$ and rearrange $f$ in terms of polynomial coefficients in the remaining $\color{green}y\color{black}, \color{blue}z\color{black}$ variables, we get:

$$f = (\color{green}y\color{black}^2\color{blue}z\color{black} + \color{blue}z\color{black}^2\color{green}y\color{black}) + (\color{green}y\color{black}^2 + \color{blue}z\color{black}^2)\color{red}x\color{black} + (\color{green}y\color{black} + \color{blue}z\color{black})\color{red}x\color{black}^2$$

by expanding, grouping, and un-distributing (factoring). Note that the coefficient polynomials are themselves symmetric in $\color{green}y\color{black}, \color{blue}z\color{black}$. To justify why this will always be the case, we leverage the symmetry card. Suppose we permute the variables $\color{green}y\color{black}, \color{blue}z\color{black}$ in $f$, yielding:

$$f = (\color{blue}z\color{black}^2\color{green}y\color{black} + \color{green}y\color{black}^2\color{blue}z\color{black}) + (\color{blue}z\color{black}^2 + \color{green}y\color{black}^2)\color{red}x\color{black} + (\color{blue}z\color{black} + \color{green}y\color{black})\color{red}x\color{black}^2$$

the resulting value is unchanged by this transformation since ff is symmetric. Equality of these two arrangements of ff implies that the coefficients remain unchanged, thus their values are also still equal after permutation.

This unlocks the induction hypothesis: we're allowed to assume that the statement holds for polynomials in two variables, thus the coefficient polynomials may be expressed using sums of powers of $\color{green}y\color{black}, \color{blue}z\color{black}$. And while we don't technically need this for our formal proof, computing these expressions in terms of the power sum polynomials $s_1' = \color{green}y\color{black} + \color{blue}z\color{black}$ and $s_2' = \color{green}y\color{black}^2 + \color{blue}z\color{black}^2$ helps us stay organized and reassures us that we're approaching our goal, since

$$f(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) = \frac{1}{2}(s_1'^3 - s_1's_2') + s_2'\color{red}x\color{black} + s_1'\color{red}x\color{black}^2$$

now contains power sums, which is the overarching goal of the proof. However, $s_1', s_2'$ are power sums in two variables, and $f$ is in three. To reintroduce $\color{red}x\color{black}$, we leverage the following identity:

$$\color{green}y\color{black}^k + \color{blue}z\color{black}^k = (\color{red}x\color{black}^k + \color{green}y\color{black}^k + \color{blue}z\color{black}^k) - \color{red}x\color{black}^k$$

That is, the $k$th power sum in variables $\color{green}y\color{black}, \color{blue}z\color{black}$ equals the $k$th power sum in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$ minus the $k$th power of $\color{red}x\color{black}$. Abbreviating with our $s$ terms, we get

$$s_1' = s_1 - \color{red}x\color{black} \qquad s_2' = s_2 - \color{red}x\color{black}^2$$

which enables us to replace the power sums in $\color{green}y\color{black}, \color{blue}z\color{black}$ in our working expression with power sums of $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$:

$$\begin{aligned} f &= \frac{1}{2}(s_1'^3 - s_1's_2') + s_2'\color{red}x\color{black} + s_1'\color{red}x\color{black}^2 \\ &= \frac{1}{2}\Big((s_1 - \color{red}x\color{black})^3 - (s_1 - \color{red}x\color{black})(s_2 - \color{red}x\color{black}^2)\Big) + (s_2 - \color{red}x\color{black}^2)\color{red}x\color{black} + (s_1 - \color{red}x\color{black})\color{red}x\color{black}^2 \\ &= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{red}x\color{black} + 3s_1\color{red}x\color{black}^2 - 3\color{red}x\color{black}^3 \end{aligned}$$

This leaves us with a polynomial whose coefficients are themselves polynomials in $s_1, s_2$. Finally, we need to remove the lingering explicit instances of $\color{red}x\color{black}$, and the key insight enabling this again stems from the fact that, despite our substitutions, this is still a symmetric polynomial in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$, since

$$s_1 = \color{red}x\color{black} + \color{green}y\color{black} + \color{blue}z\color{black} \qquad s_2 = \color{red}x\color{black}^2 + \color{green}y\color{black}^2 + \color{blue}z\color{black}^2$$

If we perform another permutation, e.g. swapping $\color{red}x\color{black}$ and $\color{green}y\color{black}$, and observe the resultant expression, we find that, since our coefficient polynomials are symmetric, the only change is that we now have $f$ in terms of $\color{green}y\color{black}$:

$$= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{green}y\color{black} + 3s_1\color{green}y\color{black}^2 - 3\color{green}y\color{black}^3$$

Similarly, permuting $\color{red}x\color{black}, \color{blue}z\color{black}$ gives us the same polynomial in $\color{blue}z\color{black}$:

$$= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{blue}z\color{black} + 3s_1\color{blue}z\color{black}^2 - 3\color{blue}z\color{black}^3$$

It might seem gratuitous to have enumerated these permutations of $\color{red}x\color{black}$, but now we have a system of equations for our polynomial $f$ with power sum coefficients in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$:

$$\begin{aligned} f_x(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) &= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{red}x\color{black} + 3s_1\color{red}x\color{black}^2 - 3\color{red}x\color{black}^3 \\ \\ f_y(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) &= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{green}y\color{black} + 3s_1\color{green}y\color{black}^2 - 3\color{green}y\color{black}^3 \\ \\ f_z(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) &= \frac{1}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)\color{blue}z\color{black} + 3s_1\color{blue}z\color{black}^2 - 3\color{blue}z\color{black}^3 \end{aligned}$$

Summing over our equations, we get:

$$\begin{aligned} 3f(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) &= \frac{3}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)(\color{red}x\color{black} + \color{green}y\color{black} + \color{blue}z\color{black}) + 3s_1(\color{red}x\color{black}^2 + \color{green}y\color{black}^2 + \color{blue}z\color{black}^2) - 3(\color{red}x\color{black}^3 + \color{green}y\color{black}^3 + \color{blue}z\color{black}^3) \\ \\ &= \frac{3}{2}(s_1^3 - s_1s_2) + \frac{3}{2}(s_2 - s_1^2)s_1 + 3s_1s_2 - 3s_3 \end{aligned}$$

leaving us with an expression entirely in terms of power sums! Simplifying, we get

$$f(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) = s_1s_2 - s_3 \quad \square$$
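To close the loop, sympy can verify both the intermediate expression in $s_1, s_2$ and the final identity by brute-force expansion:

```python
from sympy import symbols, expand, Rational

x, y, z = symbols('x y z')
s1 = x + y + z
s2 = x**2 + y**2 + z**2
s3 = x**3 + y**3 + z**3

f = x**2*(y + z) + y**2*(x + z) + z**2*(x + y)

# Intermediate form of f in s1, s2 and explicit powers of x
g = (Rational(1, 2)*(s1**3 - s1*s2)
     + Rational(3, 2)*(s2 - s1**2)*x + 3*s1*x**2 - 3*x**3)

print(expand(f - g))             # 0
print(expand(f - (s1*s2 - s3)))  # 0
```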

Conclusion

We've shown that, given a polynomial $f(x) = c_nx^n + c_{n-1}x^{n-1} + ... + c_0$ with roots $\alpha_1, ..., \alpha_n$, if we consider first the space of all expressions in the roots, then observe the special properties of the subset of those expressions which are symmetric in the roots, it is possible to express all symmetric polynomials in the roots in terms of the elementary symmetric polynomials, and hence in terms of the coefficients.

This is pertinent to Galois Theory since the former space of all expressions in the roots is known as the splitting field of $f(x)$, in which Galois Theory leverages multiple symmetries of the roots. Up until this point, we've been treating symmetry as a binary property: a polynomial is symmetric or it isn't. Galois Theory compares cardinalities and "axes" of symmetry between expressions in the roots.

Armed with a bit more intuition about the power of symmetry within polynomial expressions, claims such as "$s_3$ is more symmetric than $(2,3)$" or "$(1,3)$ is symmetric in a different way than $(1,2)$" start to become less nebulous, even without understanding anything else about the object(s) being referred to.

The next post will unpack these relations in more depth.

References

Footnotes

  1. Galois, Évariste. Letter from Galois to Auguste Chevalier dated May 1832. Journal de mathématiques pures et appliquées, 1836.

  2. Artin, Emil. "Galois Theory." Dover Publications, 1997.

  3. Edwards, Harold M. "Galois Theory, 3rd Edition." Springer, 1984.

  4. Voyager 2

  5. Information Theory

  6. Librarians, Luhn, and Lizard Brain

  7. 2.3728596

  8. Smiting the Demon Number: How to Solve a Rubik's Cube

  9. Évariste Galois Biography. MacTutor History of Mathematics Archive.

  10. Neumann, Peter M. "The mathematical writings of Évariste Galois." European Mathematical Society, 2011.