# Galois Theory 1: Prelude to the Nonexistence of the General Quintic


Tu prieras publiquement Jacobi ou Gauss de donner leur avis, non sur la vérité, mais sur l'importance des théorèmes.

Après cela, il y aura, j'espère, des gens qui trouveront leur profit à déchiffrer tout ce gâchis.

– E. Galois, le 29 mai 1832^{1}

(Ask Jacobi or Gauss publicly to give their opinion, not as to the truth, but as to the importance of these theorems. Later there will be, I hope, some people who will find it to their advantage to decipher all this mess.)

# Introduction

I love stumbling upon these sorts of mathematical Lacanian diagrams in the wild, be it on Twitter, Wikipedia, or other, dustier corners of the internet. It's like glimpsing arcana, or an overly complicated diagram of an extinct theology, a crackhead's projection of metaphysical laws scribbled in soot.

One such diagram in a now-elusive Twitter thread about Galois Theory struck my interest. This post recapitulates and ELI25s some of the cooler proofs from two books of the same name, *Galois Theory* by Emil Artin^{2} and Harold M. Edwards.^{3} The overarching goal for this mini-series of posts is to gain a better understanding of the titular Galois Theory, which I first encountered in my studies of error correcting codes.^{4}^{,}^{5}^{,}^{6} The immediate aim of *this* post is to gain a better understanding of polynomials and the useful manipulations of polynomial expressions which provide the basis for Galois Theory.

Having completed this post and circled back to try to provide some sort of contextualization for it (it belongs more in the category of *notes to self* than *fun to read*), I was pleased by the number of applications of the forthcoming concepts that I'd already touched upon in other posts, such as intuitive understandings of some of the representations of tensors^{7} and of group theory^{8} in general.

## Chief Keef was 16 when he dropped Love Sosa, what have you done?

Évariste Galois is one of those few people who can stand toe to toe with Chief Keef. A Frenchman and a budding savant, Galois juggled political activism in the lead-up to the second French Revolution with his own mathematical publications, many of which were rejected from publication by his senior contemporaries such as Cauchy and Poisson for being "incomprehensible." He was denied admission to the prestigious École Polytechnique more than once due to the examiners' inability to follow his reasoning during oral examinations. He matriculated at a lesser institution, the École Normale, from which he was eventually expelled for his scathing remarks targeting the school's director, who had physically barred him from joining the July Revolution of 1830 where his compatriots were protesting. In and out of prison over the following months for his *je ne sais quoi* activism, Évariste continued to develop his ideas, taking (however poorly or violently) the feedback provided by his erstwhile academic mentors and publishing his articles privately through his friend Auguste Chevalier. He was released from prison on April 29, 1832, and died in a duel one month later at the age of 20.

His seminal paper on the eponymous theory, *Mémoire sur les conditions de résolubilité des équations par radicaux*, which was initially rejected, was published posthumously fourteen years after his death, answering the question of the insolvability of the general quintic, which had been an open question in mathematics for over 350 years.^{9}^{,}^{10}

## The Root of the Problem

When asked to find the roots of a polynomial expression such as

$$ax^2 + bx + c = 0,$$

most folks know a few ways to come up with the two solutions, chief among them the quadratic formula:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Elementary schoolers quake at the foreboding and painful formula, and the curious twenty-year-old wonders about its general form for higher-order polynomials.
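The formula is easy to check mechanically. Here's a minimal sketch (the function name and sample coefficients are my own), using `cmath` so that a negative discriminant still yields the two complex roots:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 via the quadratic formula.

    cmath.sqrt handles a negative discriminant, returning complex roots."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

# x^2 - 5x + 6 = (x - 2)(x - 3) has roots 3 and 2:
r1, r2 = quadratic_roots(1, -5, 6)
print(r1, r2)  # (3+0j) (2+0j)
```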

By the time Évariste Galois arrived at this question in the early 1800s, general formulae for polynomials of degree 3 and 4 were already known, but they weren't well understood (at least, not to the degree ((no pun intended)) that Galois would theorize about), and it was only *suspected* that no general formula for quintic, that is degree 5, polynomials existed.

But where does one even begin to try to prove or disprove the existence of such a formula? Before we try to digest Galois Theory proper, which proves the nonexistence of a general quintic formula, let's work our way up.

## Fundamental Theorem of Symmetric Polynomials

Consider the following monic cubic polynomial:

$$f(x) = x^3 + c_2x^2 + c_1x + c_0$$

**The Fundamental Theorem of Algebra**

> every non-zero, single-variable, degree $n$ polynomial with complex coefficients has, counted with multiplicity, exactly $n$ complex roots

tells us that $f(x)$ has *exactly* three roots over the complex numbers: $\alpha_1, \alpha_2, \alpha_3 \in \cnums$. Knowing this, we can express our polynomial as a product of linear factors:

$$f(x) = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$$

Expanding the linear product, we get an expression in which the coefficients of our polynomial are now in terms of our roots:

$$f(x) = x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3$$

To make this relationship more explicit, we can refer back to the original form of our polynomial and describe the roots as a system of equations:

$$c_2 = -(\alpha_1 + \alpha_2 + \alpha_3),\qquad c_1 = \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3,\qquad c_0 = -\alpha_1\alpha_2\alpha_3$$

and we can move all the negatives to the right-hand side for convenience:

$$\alpha_1 + \alpha_2 + \alpha_3 = -c_2,\qquad \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 = c_1,\qquad \alpha_1\alpha_2\alpha_3 = -c_0$$
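These root/coefficient relations can be spot-checked numerically. Below is a minimal sketch (the helper name and sample roots are my own) that expands the product of linear factors one factor at a time and compares the resulting coefficients against the sums and products of the roots:

```python
def poly_from_roots(roots):
    """Coefficients [1, c_{n-1}, ..., c_0] of prod(x - r), highest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the running polynomial by (x - r): shift (times x) minus r times it.
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

a1, a2, a3 = 2, -1, 3                   # sample roots
_, c2, c1, c0 = poly_from_roots([a1, a2, a3])
assert c2 == -(a1 + a2 + a3)            # sum of roots, negated
assert c1 == a1*a2 + a1*a3 + a2*a3      # sum of pairwise products
assert c0 == -(a1 * a2 * a3)            # product of roots, negated
```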

The Fundamental Theorem of Symmetric Polynomials grows out of the question: *how much information can we derive from these relations alone?* What *other* meaningful expressions in terms of the roots might fall into the same subset of *known*, and incidentally useful, values?

To answer these questions, we need a bit more language to describe what it is that we're working with.

### Definitions

A **Polynomial** is any expression that can be constructed from variables and constants, joined by addition and multiplication.

Note that when discussing polynomials, we reserve the symbol $\alpha$ for fixed values which we *don't* know, rather than Latin letters such as $x,y,z$, which we use for variables. Note the subtle but crucial distinction between the two equations:

$$\alpha_1 + \alpha_2 + \alpha_3 = -2 \qquad \text{and} \qquad x + y + z = -2$$

The former equation in $\alpha$ implies that 3 given, unknown complex numbers sum to $-2$, which is readily conceivable. The latter equation in $x,y,z$ implies that *any* combination of values for $x,y,z$ sums to $-2$, which is not a true statement given the constraints, or lack thereof, that we've placed on that class of variables.

A **Symmetric Polynomial** is one whose value doesn't change when we permute its variables. For example, the polynomial $x^2 + y^2 + z^2$ is said to be symmetric in $x, y, z$, whereas the expression $x - y$ is *not* symmetric in its variables $x$ and $y$, since $x - y \neq y - x$.

Symmetry is the cornerstone upon which Galois Theory is built, and even prior to fully fledged examination of field theory, we will make use of the properties of symmetric polynomials for several other proofs.

**Elementary Symmetric Polynomials** in $n$ variables are a special collection of polynomials whereby the first element is defined by the sum of all variables, the 2nd element by the sum of all products of pairs of variables, the 3rd by the sum of all products of triples, and so on, such that the $n$th elementary symmetric polynomial is simply the product of all $n$ variables. In three variables, they are:

$$x + y + z, \qquad xy + xz + yz, \qquad xyz$$

These elementary symmetric polynomials encode the pattern by which the roots of an arbitrary polynomial are related to its coefficients. Note that the left-hand sides of our desirable, "known" relations

$$\alpha_1 + \alpha_2 + \alpha_3, \qquad \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3, \qquad \alpha_1\alpha_2\alpha_3$$

*are* the elementary symmetric polynomials in three variables, evaluated at the roots.
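The verbal definition translates directly into code: the $k$th elementary symmetric polynomial, evaluated at given values, is the sum of the products of all $k$-element subsets. A minimal stdlib sketch (the function name and sample values are my own):

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """Sum of products over all k-element subsets of the values."""
    return sum(prod(subset) for subset in combinations(values, k))

# In three variables: the sum, the sum of pairwise products, and the product.
vals = [2, -1, 3]
print([elementary_symmetric(vals, k) for k in (1, 2, 3)])  # [4, 1, -6]
```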

## Theorem 1: Newton's Theorem

Returning to the set of "known" relations between roots and coefficients, consider the claim that we can compute the value of $\alpha_1^2 + \alpha_2^2 + \alpha_3^2$ without ever finding the roots themselves.

At first, this may seem like a suspiciously lofty assertion, but as we'll see shortly, it is possible to compute the value of any expression that is a sum of powers of our roots. That is, any polynomial of the form:

$$\alpha_1^k + \alpha_2^k + \alpha_3^k$$

This stems from **Newton's Theorem**, which states that any power sum polynomial in $n$ variables can be expressed using elementary symmetric polynomials. For example, the power sum polynomial $x^2 + y^2 + z^2$ can be written as

$$x^2 + y^2 + z^2 = (x + y + z)^2 - 2(xy + xz + yz)$$

We can make use of this relation by substituting our unknown roots $\alpha_1, \alpha_2, \alpha_3$ for the formula variables $x,y,z$:

$$\alpha_1^2 + \alpha_2^2 + \alpha_3^2 = (\alpha_1 + \alpha_2 + \alpha_3)^2 - 2(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)$$

and, finally, substituting the known values of the elementary symmetric polynomials in the roots (i.e. the coefficients of $f$, up to sign) gives us a value that we *do* know.
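The identity $x^2 + y^2 + z^2 = (x + y + z)^2 - 2(xy + xz + yz)$ is easy to spot-check at arbitrary sample values (the numbers below are my own):

```python
# Check x^2 + y^2 + z^2 == (x + y + z)^2 - 2(xy + xz + yz) at sample values.
x, y, z = 2.0, -1.5, 3.0
e1 = x + y + z          # first elementary symmetric polynomial
e2 = x*y + x*z + y*z    # second elementary symmetric polynomial
assert abs((x**2 + y**2 + z**2) - (e1**2 - 2*e2)) < 1e-9
```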

Next, we'll prove that this theorem holds for all polynomials in three variables, using a method which generalizes to polynomials in an arbitrary number of variables.

### Newton's Theorem Proof

Claim: Any $k$th power sum polynomial in $x, y, z$ can be expressed using elementary symmetric polynomials.

We'll show this visually by constructing a table $T$ whose columns span the elementary symmetric polynomials in $n = 3$ variables and whose rows enumerate the sums of ascending powers:

| $x + y + z$ | $xy + xz + yz$ | $xyz$ |
|---|---|---|
| $x^2 + y^2 + z^2$ | $x^2y + x^2z + y^2x + y^2z + z^2x + z^2y$ | $x^2yz + y^2xz + z^2xy$ |
| $x^3 + y^3 + z^3$ | $x^3y + x^3z + y^3x + y^3z + z^3x + z^3y$ | $x^3yz + y^3xz + z^3xy$ |
| $x^4 + y^4 + z^4$ | $x^4y + x^4z + y^4x + y^4z + z^4x + z^4y$ | $x^4yz + y^4xz + z^4xy$ |
| $\vdots$ | $\vdots$ | $\vdots$ |

Note that along the vertical dimension, *the power* of the first variable increases, and along the horizontal dimension, *the length* of the individual terms increases from one to $n$. Our goal is to show that any entry in the 1st column, which contains the power sum polynomials, can be expressed using entries in the 1st row, which contains the elementary symmetric polynomials.

If we only need the first row and only care about the first column, why then fill out the rest of the table, you might ask? An astute observation, dear reader; from these auxiliary entries we can derive the following fact about the whole table: any table entry in row $i > 1$ can be algebraically expressed in terms of the rows $1, \ldots, i-1$ above it.

If we show this fact to be true, we'll have completed the whole proof. Suppose we take the power sum polynomial in the 5th row. According to our proposition, it can be expressed using entries from the first 4 rows, which in turn can be expressed using entries from the first three rows, and so on. We regress towards a base case, which is that entries in the 2nd row can be expressed in terms of only the first row; therefore the 5th row, too, can be expressed entirely in terms of entries from the first row, which *are* the elementary symmetric polynomials.

To show that the claim holds, we take any row $i > 1$ and any column $j$ other than the last column, i.e. $j \neq n$. Now we will show that the product of the $i$th power sum polynomial $T[i, 1]$ and the $j$th elementary symmetric polynomial $T[1, j]$ is equivalent to the sum of the entries with table indices $(i + 1, j)$ and $(i, j + 1)$:

$$T[i, 1] \cdot T[1, j] = T[i + 1, j] + T[i, j + 1]$$

We can visualize the relationship between these entries by masking the rest of the table:

|  | 1 | $j =$ 2 | 3 |
|---|---|---|---|
| 1 |  | $T[1, j]$ |  |
| 2 |  |  |  |
| 3 |  |  |  |
| $i =$ 4 | $T[i, 1]$ |  | $T[i, j + 1]$ |
| 5 |  | $T[i + 1, j]$ |  |

Expanding this product, we see that it is the sum of all possible products of a single variable raised to the power of $i$ times a term that contains $j$ distinct variables; for example, with $j = 2$:

$$(x^i + y^i + z^i)(xy + xz + yz) = x^{i+1}y + x^{i+1}z + x^iyz + y^{i+1}x + y^{i+1}z + y^ixz + z^{i+1}x + z^{i+1}y + z^ixy$$

For each term in the resultant product, one of two things happens:

- If the variable raised to the $i$th power appears in the term of length $j$, then their product remains length $j$ and the exponent of the distinguished variable is incremented, e.g. $x^i \cdot xy = x^{i+1}y$.

- Otherwise, if the variable raised to the power of $i$ does *not* appear in the term of length $j$, then the length of the product is incremented, e.g. $x^i \cdot yz = x^iyz$.

This pattern defines all terms that might appear in the entries $T[i + 1, j]$ and $T[i, j + 1]$, and if we rearrange the terms in the expansion according to which case they fall into, this becomes obvious; again with $j = 2$:

$$(x^i + y^i + z^i)(xy + xz + yz) = \underbrace{x^{i+1}y + x^{i+1}z + y^{i+1}x + y^{i+1}z + z^{i+1}x + z^{i+1}y}_{T[i+1,\,2]} + \underbrace{x^iyz + y^ixz + z^ixy}_{T[i,\,3]}$$
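We can also spot-check the identity $T[i, 1] \cdot T[1, j] = T[i + 1, j] + T[i, j + 1]$ numerically. The sketch below (helper names and sample values are my own) evaluates a table entry for $i > 1$ as a sum over each distinguished variable and each $(j-1)$-subset of the remaining variables; the first-row entries are the elementary symmetric polynomials, where each monomial appears once:

```python
from itertools import combinations
from math import prod

def T(i, j, vals):
    """Table entry for i > 1: sum over each variable v and each (j-1)-subset
    of the *other* variables of v**i times the product of the subset."""
    total = 0
    for k, v in enumerate(vals):
        rest = vals[:k] + vals[k + 1:]
        for subset in combinations(rest, j - 1):
            total += v**i * prod(subset)
    return total

vals = [1, 2, 3]
i, j = 2, 2                                        # any i > 1 and j < n
p_i = sum(v**i for v in vals)                      # T[i, 1], the power sum
e_j = sum(prod(s) for s in combinations(vals, j))  # T[1, j], elementary symmetric
assert p_i * e_j == T(i + 1, j, vals) + T(i, j + 1, vals)
```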

Furthermore, we can *slide* our indices along the identity we've outlined for $i > 1, j \neq n$ to show that any table entry $(i, j)$ can be expressed using entries in the rows above it, with some minor technical adjustments for the corner cases $i = 1, 2$ and $j = n$.

The corner cases are technical and don't illustrate anything else fascinating about the relationship between symmetric polynomials and roots, so I'll skip them; see Artin^{2} or Edwards^{3} for complete proofs.

Having proved Newton's Theorem, we return to our subset of "known" values. From the coefficients of a polynomial, we can compute the values of the sums of powers of its roots. The computation suggested by our tabular approach above is quite awful, since we might have to page through many entries, but it's sufficient for the time being *that* such a mechanism exists. This conclusion expands our domain of "known" values from the elementary symmetric polynomials in the roots (the coefficients, up to sign) to *all* power sums of the roots, not just the expressions that appear directly among an arbitrary polynomial's coefficients!
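A far less awful computation than paging through the table is the standard recurrence known as Newton's identities, whose existence the theorem guarantees. Here's a sketch (the function name and conventions are my own) that computes power sums of the roots directly from the values of the elementary symmetric polynomials, i.e. from the coefficients up to alternating signs:

```python
def power_sums(e, kmax):
    """Power sums p_1..p_kmax of the roots of a monic polynomial whose
    elementary symmetric polynomial values are e = [e1, ..., en]
    (the polynomial's coefficients, up to alternating signs), via the
    Newton's-identities recurrence."""
    n = len(e)
    p = []
    for k in range(1, kmax + 1):
        # p_k = e1*p_{k-1} - e2*p_{k-2} + ...  (+ (-1)^(k-1) * k * e_k when k <= n)
        pk = sum((-1) ** (i - 1) * e[i - 1] * p[k - i - 1]
                 for i in range(1, min(k, n) + 1) if k - i >= 1)
        if k <= n:
            pk += (-1) ** (k - 1) * k * e[k - 1]
        p.append(pk)
    return p

# Roots {1, 2, 3}: e = [6, 11, 6], and p_k = 1^k + 2^k + 3^k.
print(power_sums([6, 11, 6], 4))  # [6, 14, 36, 98]
```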

## Theorem 2: Fundamental Theorem on Symmetric Polynomials

Consider another far-out claim: that *any* symmetric expression in the roots, no matter how convoluted, can be computed from the coefficients alone.

We can derive this from the Fundamental Theorem of Symmetric Polynomials by substituting roots into the general formula which holds for all polynomial variables.

The theorem we will prove states that any symmetric polynomial can be expressed using power sum polynomials. And, since we just showed that any power sum polynomial can be expressed using elementary symmetric polynomials per Newton's Theorem, this amounts to saying that *any symmetric polynomial can be expressed using elementary symmetric polynomials*.

For example, the following symmetric polynomial

can be expressed in terms of the following product of power sum polynomials which we denote $s_k$ where $k$ is the power:

substituting our $\alpha$ roots into the variables, we get:

and again, we can prove that this works for any symmetric polynomial. Unsurprisingly, since this is the Fundamental Theorem on *Symmetric* polynomials, the proof relies on the symmetry of the polynomials.
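Before the proof, a concrete spot-check. One such identity (an illustrative example of my own choosing) expresses the symmetric polynomial $x^2y + x^2z + y^2x + y^2z + z^2x + z^2y$ as $s_1s_2 - s_3$, which we can verify at arbitrary sample values:

```python
# Verify x^2*y + x^2*z + y^2*x + y^2*z + z^2*x + z^2*y == s1*s2 - s3.
x, y, z = 2.0, -1.5, 3.0
s1 = x + y + z           # first power sum
s2 = x**2 + y**2 + z**2  # second power sum
s3 = x**3 + y**3 + z**3  # third power sum
sym = x*x*y + x*x*z + y*y*x + y*y*z + z*z*x + z*z*y
assert abs(sym - (s1*s2 - s3)) < 1e-9
```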

### Fundamental Theorem on Symmetric Polynomials Proof

We begin the proof by induction on the number of variables $n$. First, we'll show that the theorem is true for all single-variable symmetric polynomials, then we'll show that *if* the statement is true for all polynomials in $n-1$ variables, then it is also true for all polynomials in $n$ variables.

Formally, this first claim is that any polynomial $f(x)$ can be expressed using power sums in $x$. This is true by definition since power sums on a single variable are simply the powers of that variable, and any polynomial in a single variable is expressed in terms of powers of that variable. Easy.

Next, for the non-trivial inductive step to prove that if the statement about polynomials in $n-1$ variables is true, then it also holds for polynomials in $n$ variables, we take any symmetric polynomial $f(x_1, ..., x_n)$ and we must show that it can be expressed using power sums of the variables $x_1, ..., x_n$:

We'll describe the application of a general procedure on a concrete polynomial to illustrate how it works, noting that it must work for *any* polynomial. I'll re-use the familiar symmetric polynomial

$$f(\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}) = \color{red}x\color{black}^2\color{green}y\color{black} + \color{red}x\color{black}^2\color{blue}z\color{black} + \color{green}y\color{black}^2\color{red}x\color{black} + \color{green}y\color{black}^2\color{blue}z\color{black} + \color{blue}z\color{black}^2\color{red}x\color{black} + \color{blue}z\color{black}^2\color{green}y\color{black}$$
First, we choose an arbitrary variable in our polynomial and rearrange it such that the other terms are coefficients of that variable. For example, if we select the variable $\color{red}x\color{black}$ and rearrange $f$ in terms of polynomial coefficients in the remaining $\color{green}y\color{black},\color{blue}z\color{black}$ variables, we get:

$$f = \color{red}x\color{black}^2(\color{green}y\color{black} + \color{blue}z\color{black}) + \color{red}x\color{black}(\color{green}y\color{black}^2 + \color{blue}z\color{black}^2) + (\color{green}y\color{black}^2\color{blue}z\color{black} + \color{blue}z\color{black}^2\color{green}y\color{black})$$

by expanding, grouping, and un-distributing (factoring). Note that the coefficient polynomials themselves are symmetric in $\color{green}y\color{black}, \color{blue}z\color{black}$. To justify why this will *always* be the case, we play the symmetry card. Suppose we permute the variables $\color{green}y\color{black}, \color{blue}z\color{black}$ in $f$, yielding:

the resulting value is unchanged by this transformation since $f$ is symmetric. Equality of these two arrangements of $f$ implies that the coefficients remain unchanged, thus their values are also still equal after permutation.
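The symmetry argument is easy to see in code: permuting the inputs of a symmetric polynomial cannot change its value. The function below is my own encoding of the running example:

```python
def f(x, y, z):
    """The symmetric polynomial x^2*y + x^2*z + y^2*x + y^2*z + z^2*x + z^2*y."""
    return x*x*y + x*x*z + y*y*x + y*y*z + z*z*x + z*z*y

# Every permutation of the arguments yields the same value.
assert f(1, 2, 3) == f(1, 3, 2) == f(2, 1, 3) == f(3, 2, 1) == 48
```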

This unlocks the induction hypothesis: we're allowed to assume that the statement holds for polynomials in two variables, thus the coefficient polynomials may be expressed using sums of powers of $\color{green}y\color{black},\color{blue}z\color{black}$. And while we don't technically need this for our formal proof, computing these expressions in terms of the power sums $s_1' = \color{green}y\color{black} + \color{blue}z\color{black}$ and $s_2' = \color{green}y\color{black}^2 + \color{blue}z\color{black}^2$ helps us stay organized and reassures us that we're approaching our goal, since

now contains power sums, which is the overarching goal of the proof. However, $s_1', s_2'$ are power sums in two variables, and $f$ is in three. To reintroduce $\color{red}x\color{black}$, we leverage the identity:

$$\color{green}y\color{black}^k + \color{blue}z\color{black}^k = (\color{red}x\color{black}^k + \color{green}y\color{black}^k + \color{blue}z\color{black}^k) - \color{red}x\color{black}^k$$

That is, the $k$th power sum in the variables $\color{green}y\color{black},\color{blue}z\color{black}$ equals the $k$th power sum in $\color{red}x\color{black}, \color{green}y\color{black},\color{blue}z\color{black}$ minus the $k$th power of $\color{red}x\color{black}$. Abbreviating with our $s$ terms, we get

$$s_k' = s_k - \color{red}x\color{black}^k$$

which enables us to replace the power sums in $\color{green}y\color{black},\color{blue}z\color{black}$ in our working expressions with power sums of $\color{red}x\color{black},\color{green}y\color{black},\color{blue}z\color{black}$:

This leaves us with a polynomial in $\color{red}x\color{black}$ whose coefficients are themselves polynomials in $s_1, s_2$. Finally, we need to remove the lingering explicit instances of $\color{red}x\color{black}$, and the key insight enabling this again stems from the fact that this is *still* a symmetric polynomial in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$ despite our substitutions, since the power sums $s_k$ are themselves symmetric in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$.

If we perform another permutation, e.g. exchanging $\color{red}x\color{black}$ and $\color{green}y\color{black}$, and observe the resultant expression, we find that, since our coefficient polynomials are symmetric, the only change is that we now have $f$ arranged in terms of $\color{green}y\color{black}$:

Similarly, permuting $\color{red}x\color{black},\color{blue}z\color{black}$ gives us the same polynomial arranged in terms of $\color{blue}z\color{black}$:

It might seem gratuitous to have enumerated these permutations of $\color{red}x\color{black}$, but now we have a system of equations of our polynomial $f$ with power sum coefficients in $\color{red}x\color{black}, \color{green}y\color{black}, \color{blue}z\color{black}$:

Summing over our equations, we get:

leaving us with an expression entirely in terms of power sums! Simplifying completes the inductive step, and with it the proof: any symmetric polynomial in three (and, by the same argument, $n$) variables can be expressed using power sums.

## Conclusion

We've shown that, given a polynomial $f(x) = c_nx^n + c_{n-1}x^{n-1} + ... + c_0$ with roots $\alpha_1, ..., \alpha_n$, if we consider the space of *all* expressions in the roots, and then observe the special properties of the subset of those expressions which are symmetric in the roots, it is possible to express *all* symmetric polynomials in the roots in terms of the elementary symmetric polynomials.

This is pertinent to Galois Theory: the former space of all expressions in the roots is known as the **splitting field** of $f(x)$, and within it Galois Theory leverages the multiple symmetries of the roots which, up until this point, we've been treating as a binary property: a polynomial is symmetric or it isn't. Galois Theory compares cardinalities and "axes" of symmetry between expressions in the roots.

Armed with a bit more intuition about the power of symmetry within polynomial expressions, claims such as "$s_3$ is *more* symmetric than $(2,3)$" or "$(1,3)$ is symmetric in a different way than $(1, 2)$" start to become less nebulous, even without understanding anything else about the object(s) being referred to.

The next post will unpack these relations in more depth.

# References

## Footnotes

- Galois, Évariste. Letter from Galois to Auguste Chevalier dated May 1832. *Journal de mathématiques pures et appliquées*, 1836.
- Artin, Emil. *Galois Theory*. Dover Publications, 1997.
- Edwards, Harold M. *Galois Theory*, 3rd Edition. Springer, 1984.
- Évariste Galois Biography. *mathshistory*.
- Neumann, Peter M. *The mathematical writings of Évariste Galois*. European Mathematical Society, 2011.