MA 311
NUMBER THEORY
BUTLER UNIVERSITY
FALL 2008
SCOTT T. PARSELL
1. Introduction
Number theory could be defined fairly concisely as the study of the natural numbers: 1, 2, 3, 4, 5, 6, .... We usually denote this set by $\mathbb{N}$. The set of all integers (including 0 and the negatives) is denoted by $\mathbb{Z}$. Is there anything about the natural numbers that's worth studying? It seems that we have a pretty good understanding of them once we've learned to count! Perhaps surprisingly, this turns out to be a rich and fascinating field of study, bursting with unsolved problems. A good starting point for our investigations is to look at how the natural numbers factor.
Primes. A prime number is a number greater than 1 that cannot be written as the product of two smaller natural numbers. The first few primes are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, ....
Integers exceeding 1 that are not prime are called composite. The primes are important because each natural number greater than 1 can be written as a product of primes, and this factorization is unique (up to the order of the factors). For example, $24 = 2^3 \cdot 3$ and $105 = 3 \cdot 5 \cdot 7$. It is fairly easy to show that there are infinitely many prime numbers; we'll prove this in a later section. However, there remain many interesting unsolved (or partially solved) questions about the primes and how they are distributed. For example,
• How precisely can we estimate the number of primes less than $x$? (We know that $x/\log x$ gives a good first approximation.) What about primes of the form $4k + 1$, of the form $4k + 3$, etc.?
• Are there infinitely many primes of the form $n^2 + 1$? How about of the form $2^n - 1$? Of the form $2^n + 1$?
• Is there an efficient algorithm for finding a number's prime factorization or proving that a number is prime? (The difficulty of factoring efficiently is the basis of the security of RSA encryption.)
• Are there infinitely many pairs of "twin primes", i.e., primes whose difference is two (such as 3 and 5)? If not, can anything be said about small gaps between primes asymptotically?
• (Goldbach's problem) Can every even integer exceeding 2 be written as the sum of two primes?
Questions about the distribution of primes usually fall under the heading of analytic number theory because many of the techniques are based on real and complex analysis (i.e., mathematics related to calculus).
Divisibility and congruences. Along with the idea of factoring integers comes the notion of divisibility. We say that $a$ divides $b$ if there exists an integer $c$ such that $ac = b$. For example, 4 divides 24 since $4 \cdot 6 = 24$, and 15 divides 105 since $15 \cdot 7 = 105$. Divisibility leads to the important idea of congruences. We say that $a$ is congruent to $b$ modulo $m$ if $m$ divides $a - b$. In this case, we write
$$a \equiv b \pmod{m}.$$
For example, $3 \equiv 75 \pmod{24}$ and $8 \equiv 38 \pmod{10}$. Arithmetic with congruences (sometimes called modular arithmetic) is useful for detecting certain types of periodic phenomena. For example, one could use arithmetic mod 24 to keep track of the hour of day (in military time) without regard to minutes, seconds, or day. One could use arithmetic mod 10 to keep track of the last digit of a positive number (or mod 100 to keep track of the last two digits). If $n$ objects are arranged in a circle, then arithmetic mod $n$ can be used to keep track of the positions of the objects as they are rearranged. We'll see some more interesting uses of congruences later on. For instance, they can be used to construct check-digit schemes to minimize errors in data entry. Facts about the computation of powers modulo $n$ form the basis for constructing an RSA cryptosystem.
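The definition of congruence and the clock and last-digit uses mentioned above can be tried out directly. The following is a minimal Python sketch; the helper name `is_congruent` and the sample values for the clock and digit examples are illustrative assumptions, not part of the notes.

```python
def is_congruent(a: int, b: int, m: int) -> bool:
    """Return True when m divides a - b, i.e. a is congruent to b modulo m."""
    return (a - b) % m == 0


# The two congruences quoted in the text.
first_example = is_congruent(3, 75, 24)
second_example = is_congruent(8, 38, 10)

# Hour of day (military time) five hours after 22:00, using arithmetic mod 24.
hour_later = (22 + 5) % 24

# Last two digits of a positive number, using arithmetic mod 100.
last_two_digits = 123456 % 100

print(first_example, second_example, hour_later, last_two_digits)
```

Python's `%` operator returns a non-negative result for a positive modulus, which is why the divisibility test works for negative differences as well.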
Rings and fields. If one is doing arithmetic with congruences, say modulo 6, then effectively there are only 6 distinct "numbers" to work with, usually denoted by 0, 1, 2, 3, 4, and 5. Under this scheme, the number 0 actually stands for the set
$$[0]_6 = \{\ldots, -24, -18, -12, -6, 0, 6, 12, 18, 24, \ldots\}.$$
Similarly, 1 stands for $[1]_6 = \{\ldots, -17, -11, -5, 1, 7, 13, 19, \ldots\}$, and so on. However, it is convenient to pick one small integer (usually either the smallest non-negative integer or the one of smallest absolute value) to represent each "congruence class". The integers themselves are an example of an abstract algebraic structure called a ring, which is basically a set equipped with addition and multiplication operations satisfying basic properties like associativity and the distributive law (we omit the precise definition of a ring here). The set of congruence classes $\{0, 1, 2, 3, 4, 5\}$ can be viewed as a ring in its own right, sometimes denoted by $\mathbb{Z}/6\mathbb{Z}$ or $\mathbb{Z}_6$, with addition and multiplication defined modulo 6. For example, $2 + 5 = 1$ and $2 \cdot 3 = 0$ in the ring $\mathbb{Z}_6$.
One defect of rings is that multiplicative inverses do not exist in general. For example, 2 does not have a multiplicative inverse in $\mathbb{Z}$, nor in $\mathbb{Z}_6$. However, 2 does have a multiplicative inverse in $\mathbb{Z}_7$, since $2 \cdot 4 = 1$ under mod 7 arithmetic. Special rings in which all nonzero elements have multiplicative inverses (such as the rational numbers, real numbers, and complex numbers) are called fields. It turns out that
$$\mathbb{Z}_n = \{0, 1, 2, \ldots, n - 1\}$$
under arithmetic modulo $n$ is a field if and only if $n$ is prime. Our algebra with congruences will be influenced by these considerations. Just as the equation $2x = 1$ can be solved over the rationals but not over the integers, the congruence $2x \equiv 1 \pmod{n}$ can be solved when $n = 7$ but not when $n = 6$ (in other words, the equation $2x = 1$ has a solution over $\mathbb{Z}_7$ but not over $\mathbb{Z}_6$).
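The claim about inverses can be checked by brute force for small moduli. This is only a sketch: the function names `inverse_mod` and `is_field` are hypothetical, and exhaustive search is used for transparency rather than efficiency.

```python
def inverse_mod(a: int, n: int):
    """Search {1, ..., n-1} for x with a*x = 1 (mod n); return None if none exists."""
    for x in range(1, n):
        if (a * x) % n == 1:
            return x
    return None


def is_field(n: int) -> bool:
    """True when every nonzero residue modulo n has a multiplicative inverse."""
    return n >= 2 and all(inverse_mod(a, n) is not None for a in range(1, n))


print(inverse_mod(2, 7))   # 2 has an inverse modulo 7
print(inverse_mod(2, 6))   # but not modulo 6
print([n for n in range(2, 20) if is_field(n)])
```

For the moduli tested, the values of n that pass `is_field` are exactly the primes, in line with the statement in the text.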
One can construct further examples of rings by "adjoining" irrational or complex numbers to the set of integers. For example, if $i = \sqrt{-1}$, then the set $\mathbb{Z}[i]$ of all complex numbers of the form $a + bi$, where $a$ and $b$ are integers, forms a ring, known as the ring of Gaussian integers. One can ask whether such a ring has any number-theoretic properties in common with the integers, such as unique factorization. It turns out that this ring does have unique factorization, but not all the integer primes remain prime in $\mathbb{Z}[i]$. For instance, $2 = (1 + i)(1 - i)$, but 3 remains irreducible. The numbers $1 + i$ and $1 - i$ are primes in $\mathbb{Z}[i]$, and the number 6 has the unique prime factorization $6 = (1 + i) \cdot (1 - i) \cdot 3$.
If we let $\delta = \sqrt{-5}$, then we can construct the ring $\mathbb{Z}[\delta]$, which is the set of all complex numbers of the form $a + b\delta$, where $a$ and $b$ are integers. Something bizarre happens when we try to factor 6 in this ring. We obviously have
$$6 = 2 \cdot 3 \quad\text{and}\quad 6 = (1 + \delta)(1 - \delta),$$
and one can show that 2, 3, $1 + \delta$, and $1 - \delta$ are all irreducible in the ring $\mathbb{Z}[\delta]$. Thus we have two different factorizations for 6, which means that unique factorization fails in this ring!
The study of primes and factorization in rings such as $\mathbb{Z}[i]$ and $\mathbb{Z}[\delta]$ forms the basis for much of algebraic number theory. Here one makes heavy use of general results from modern algebra, so we won't pursue this branch of the subject very deeply.
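One way to see why 2, 3, and the two factors $1 \pm \sqrt{-5}$ are irreducible is the norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$, a standard tool that the notes do not introduce. The norm is multiplicative, and these four elements have norms 4, 9, 6, and 6, so a proper factor would need norm 2 or 3. The sketch below, with pairs `(a, b)` standing for $a + b\sqrt{-5}$ and hypothetical helper names, checks the two factorizations of 6 and confirms that no element has norm 2 or 3.

```python
def mul(u, v):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5), each stored as a pair (a, b)."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)


def norm(u):
    """Norm N(a + b*sqrt(-5)) = a^2 + 5*b^2."""
    a, b = u
    return a * a + 5 * b * b


# Both products equal 6, stored as (6, 0).
product_integers = mul((2, 0), (3, 0))
product_delta = mul((1, 1), (1, -1))

# a^2 + 5*b^2 <= 3 forces |a| <= 1 and b = 0, so this small window is exhaustive.
elements_of_norm_2_or_3 = [
    (a, b)
    for a in range(-2, 3)
    for b in range(-2, 3)
    if norm((a, b)) in (2, 3)
]

print(product_integers, product_delta, elements_of_norm_2_or_3)
```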
Diophantine equations. One area of number theory that we hope to touch on later in the course overlaps with both analytic and algebraic number theory. A diophantine equation is simply an equation (usually a polynomial in two or more variables) for which we seek integer (or sometimes rational) solutions; a classic example is the equation
$$x^2 + y^2 = z^2.$$
This equation has many integer solutions, such as (3, 4, 5) and (5, 12, 13). In fact, it can be shown that there are infinitely many integer solutions, and all the solutions can be described by an explicit parametrization. These are the so-called Pythagorean triples, which correspond to the lengths of the sides in right triangles. Interestingly, things become dramatically different if we change the equation to $x^3 + y^3 = z^3$. Here the only integer solutions are the "trivial" ones with $xyz = 0$. In fact, Fermat's Last Theorem asserts that if $n$ is any integer exceeding 2 then the diophantine equation $x^n + y^n = z^n$ has only trivial solutions. This seemingly innocent conjecture remained unproven for over 300 years until deep work of Wiles resolved it in 1995.
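The notes do not state the parametrization of Pythagorean triples. One standard form, assumed here, is $(m^2 - n^2,\ 2mn,\ m^2 + n^2)$ with $m > n > 0$ coprime and of opposite parity, which produces the primitive triples (those with no common factor). A short Python sketch with the hypothetical name `primitive_triples`:

```python
from math import gcd


def primitive_triples(limit: int):
    """Primitive Pythagorean triples (x, y, z) with x < y and z <= limit."""
    triples = []
    m = 2
    while m * m + 1 <= limit:
        for n in range(1, m):
            # Opposite parity and coprimality give triples with no common factor.
            if (m - n) % 2 == 1 and gcd(m, n) == 1:
                x, y, z = m * m - n * n, 2 * m * n, m * m + n * n
                if z <= limit:
                    triples.append((min(x, y), max(x, y), z))
        m += 1
    return sorted(triples)


print(primitive_triples(30))
```

The list includes the two triples quoted in the text, (3, 4, 5) and (5, 12, 13); multiplying a primitive triple by any positive integer gives further solutions.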
As another example, consider the diophantine equation $y^2 = x^3 + 17$. This is an example of an elliptic curve, which more generally has the form $y^2 = f(x)$, where $f$ is a cubic polynomial. It turns out that the rational points lying on such a curve have an additive group structure, and this can be used as the basis for an encryption scheme and also for an efficient factoring algorithm. Wiles also exploited connections with elliptic curves in his proof of Fermat's Last Theorem. All this work on diophantine equations in few variables uses primarily algebraic techniques, so the detailed study of these topics is best left for a more advanced course.
A theorem of Lagrange states that every positive integer can be expressed as the sum of four perfect squares. In other words, the diophantine equation
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = n$$
can be solved for every positive integer $n$. For instance, when $n = 31$ we can take $x_1 = 5$, $x_2 = 2$, $x_3 = 1$, and $x_4 = 1$. A generalization of this question known as Waring's problem asks what happens with higher powers. For instance, how large does $s$ have to be in order to represent all positive integers as sums of $s$ perfect cubes? (The answer turns out to be 9.) What if we only need to represent all sufficiently large integers? Here we know that 7 cubes suffice, but it's conjectured that 4 would be enough! The type of diophantine equation involved in Waring's problem typically has enough variables that it can be attacked by analytic methods, and this has been a very active area of research over the past 20 years. We'll discuss some of the underlying ideas later in the course.
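Lagrange's theorem can be explored numerically with a small exhaustive search. The following sketch (the name `four_squares` is an illustrative assumption) looks for a representation with $x_1 \ge x_2 \ge x_3 \ge x_4 \ge 0$; it is a brute-force illustration, not an efficient algorithm.

```python
from math import isqrt


def four_squares(n: int):
    """Return (x1, x2, x3, x4) with x1 >= x2 >= x3 >= x4 >= 0 and squares summing to n."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d <= c and d * d == rest:
                    return (a, b, c, d)
    return None


print(four_squares(31))
```

For $n = 31$ the search returns the representation given in the text, and it finds some representation for every positive integer it is asked about, as the theorem guarantees.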
In Waring's problem, one could also ask what happens if the variables are restricted to be primes. For example, the Goldbach problem mentioned earlier amounts to solving the equation $p_1 + p_2 = n$ in primes $p_1$ and $p_2$ for every even $n > 2$. The general Waring-Goldbach problem considers the solubility of the diophantine equation
$$p_1^k + \cdots + p_s^k = n$$
in primes $p_1, \ldots, p_s$ for every $n$ for which the underlying congruences are feasible.
A variation known as a diophantine inequality arises when attempting to approximate irrational numbers by rational numbers. For instance, if we want to find a rational number close to $\sqrt{2}$, then we are looking for integer solutions to the inequality $|x/y - \sqrt{2}| < \varepsilon$, where $\varepsilon$ is a small positive number. Dirichlet's theorem on diophantine approximation actually tells us that we can solve this inequality with $\varepsilon$ replaced by an explicit function of the denominator, namely $1/y^2$. Thus we can solve the diophantine inequality $|x - \sqrt{2}\,y| < 1/y$. More general inequalities (for example, involving sums of $k$th powers) are a subject of current research interest.
Where do we begin? We've only scratched the surface of number theory by mentioning some of the important ideas and some of the interesting unsolved problems. In the next section, we'll start laying the foundations for our study by developing some actual machinery on divisibility, primes, and congruences. This will lead us to our first main goal, which is to understand RSA cryptography. Following that, we hope to touch on some of the more advanced topics mentioned above, such as the distribution of primes, the algebraic structure of $\mathbb{Z}_n$, Waring's problem, and diophantine approximation.
2. Divisibility
Recall that if $a, b \in \mathbb{Z}$, we say that $a$ divides $b$ (and write $a \mid b$) if there exists $c \in \mathbb{Z}$ such that $b = ac$. For example, 2 divides 6, but 4 does not divide 6. When $a$ divides $b$, we say that $b$ is a multiple of $a$ and that $b$ is divisible by $a$. Two easy properties of divisibility that we'll find useful are given in the following lemma.
Lemma 2.1. Let $a$, $b$, and $c$ be integers.
(a) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(b) If $a \mid b$ and $a \mid c$, then $a \mid (bs + ct)$ for all integers $s$ and $t$.
Proof. If $a \mid b$ and $b \mid c$, then we can write $b = am$ and $c = bn$ for some integers $m$ and $n$. We then have $c = a(mn)$, which shows that $a \mid c$. Similarly, if $a \mid b$ and $a \mid c$, then we can write $b = am$ and $c = an$ for some integers $m$ and $n$. If $s$ and $t$ are arbitrary integers, we have $bs + ct = ams + ant = a(ms + nt)$, which shows that $a \mid (bs + ct)$. □
The following divisibility exercise gives us a chance to review proof by mathematical induction.
Example 2.2. Prove that $n^5 - n$ is divisible by 5 for every positive integer $n$.
Solution. We proceed by induction on $n$. First of all, we have $1^5 - 1 = 0$, which is clearly divisible by 5, since $0 = 5 \cdot 0$. This establishes the base case. Now suppose that $n \ge 1$ is an integer and that $n^5 - n$ is divisible by 5. Then by the binomial theorem one has
$$(n + 1)^5 - (n + 1) = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 - n - 1 = (n^5 - n) + 5(n^4 + 2n^3 + 2n^2 + n).$$
Here the first term on the right is divisible by 5 according to the induction hypothesis, and the second term is clearly divisible by 5 since $n^4 + 2n^3 + 2n^2 + n$ is an integer. We therefore deduce from part (b) of Lemma 2.1 that $(n + 1)^5 - (n + 1)$ is divisible by 5, and the result now follows by induction. □
In the future, we will not always be quite so pedantic in writing, but the above solution serves as a good model for constructing proofs of this type. In general, to prove that a statement $P(n)$ holds for all positive integers $n$, one must first establish $P(1)$ and then prove the implication $P(n) \Longrightarrow P(n + 1)$. This principle is one of the fundamental axioms about the integers. It is equivalent to the well-ordering principle, which states that every non-empty subset of the positive integers has a smallest element.
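A numerical check is no substitute for the induction proof, but it is a quick way to catch algebra slips in the inductive step of Example 2.2. The sketch below (with the hypothetical helper `step_identity_holds`) tests both the divisibility claim and the binomial identity for a range of values.

```python
def step_identity_holds(n: int) -> bool:
    """Check (n+1)^5 - (n+1) == (n^5 - n) + 5(n^4 + 2n^3 + 2n^2 + n)."""
    lhs = (n + 1) ** 5 - (n + 1)
    rhs = (n ** 5 - n) + 5 * (n ** 4 + 2 * n ** 3 + 2 * n ** 2 + n)
    return lhs == rhs


divisible = all((n ** 5 - n) % 5 == 0 for n in range(1, 1001))
identity = all(step_identity_holds(n) for n in range(1, 1001))
print(divisible, identity)
```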
Greatest common divisors. The greatest common divisor of $a$ and $b$ is the largest positive integer that divides both $a$ and $b$. It is denoted by $\gcd(a, b)$, or sometimes just $(a, b)$ when there is no danger of confusion with an ordered pair. For example, $\gcd(4, 6) = 2$, $\gcd(12, 51) = 3$, and $\gcd(9, 16) = 1$. If $\gcd(a, b) = 1$, then we say that $a$ and $b$ are relatively prime (or coprime). We note that $\gcd(a, 0) = |a|$ for every non-zero integer $a$ and that $\gcd(0, 0)$ is undefined. The least common multiple of $a$ and $b$ is the smallest positive integer that is a multiple of both $a$ and $b$. It is denoted by $\operatorname{lcm}(a, b)$ or $[a, b]$. For example, $\operatorname{lcm}(4, 6) = 12$. It is fairly easy to see that, for positive integers $a$ and $b$,
$$\gcd(a, b)\,\operatorname{lcm}(a, b) = ab.$$
When $a$ and $b$ are small, one can compute $\gcd(a, b)$ fairly easily by looking at the prime factorizations of $a$ and $b$ and picking out the parts in common. For instance, $24 = 2^3 \cdot 3$ and $180 = 2^2 \cdot 3^2 \cdot 5$, so $\gcd(24, 180) = 2^2 \cdot 3 = 12$. However, since factoring is expensive computationally, this is not an efficient method when $a$ and $b$ are large. A better method is based on the division with remainder algorithm learned in grade school.
Theorem 2.3. (Division with remainder) For any integers $a$ and $b$ with $b > 0$, there exist unique integers $q$ and $r$ such that
$$a = bq + r \quad\text{and}\quad 0 \le r < b.$$
Proof. We first prove the existence of $q$ and $r$. Consider the list of integers
$$\ldots,\ a - 3b,\ a - 2b,\ a - b,\ a,\ a + b,\ a + 2b,\ a + 3b,\ \ldots.$$
Since $b > 0$, we can select one with the smallest non-negative value, say $r = a - qb$. If $r \ge b$, then we find that
$$r - b = a - qb - b = a - (q + 1)b$$
is a non-negative number on our list with a smaller value than $r$, which contradicts our choice of $r$. Thus we have $0 \le r < b$ and $a = bq + r$.
To check uniqueness, suppose there are integers $q_1$, $q_2$, $r_1$, and $r_2$ with
$$a = q_1 b + r_1 = q_2 b + r_2 \quad\text{and}\quad 0 \le r_1, r_2 < b.$$
Then we have $b(q_1 - q_2) = r_2 - r_1$, and we may suppose without loss of generality that $r_1 \le r_2$. Then
$$0 \le r_2 - r_1 < b - r_1 \le b,$$
and hence
$$0 \le b(q_1 - q_2) < b,$$
which implies that $q_1 - q_2 = 0$. Thus $q_1 = q_2$, and it follows that $r_1 = r_2$. □
For example, if $a = 48$ and $b = 9$, then we can write $48 = 5 \cdot 9 + 3$, so we can take $q = 5$ and $r = 3$ in Theorem 2.3. We call $q$ the quotient and $r$ the remainder. Notice that $r = 0$ if and only if $b$ divides $a$.
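In Python, the built-in `divmod` uses floor division, so for a positive divisor it returns exactly the quotient and remainder of Theorem 2.3, including for negative dividends. The wrapper below is a small sketch; its name `divide_with_remainder` is an illustrative choice.

```python
def divide_with_remainder(a: int, b: int):
    """Return (q, r) with a == b*q + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("the divisor b must be positive")
    q, r = divmod(a, b)  # floor division, so r is non-negative when b > 0
    return q, r


print(divide_with_remainder(48, 9))   # the example from the text
print(divide_with_remainder(-7, 3))   # a negative dividend still gives 0 <= r < b
```

Note that languages whose integer division truncates toward zero (C, for example) can return a negative remainder for negative dividends, which does not satisfy the condition in the theorem.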
Theorem 2.4. Let $a$ and $b$ be nonzero integers. Then $\gcd(a, b)$ is the smallest positive integral linear combination of $a$ and $b$. That is, $\gcd(a, b)$ is the smallest positive value of $as + bt$, where $s$ and $t$ are integers.
Proof. By taking $s = a$ and $t = b$, we see that positive integral linear combinations exist, so we can let $d$ denote the smallest such value. Write $d = as_0 + bt_0$. By Theorem 2.3, we can write
$$a = dq + r = q(as_0 + bt_0) + r, \quad\text{where}\quad 0 \le r < d.$$
Solving for $r$, we get
$$r = a(1 - qs_0) + b(-qt_0),$$
so $r$ is an integral linear combination of $a$ and $b$, and since $r < d$, the minimality of $d$ implies that $r = 0$. Thus we see that $d$ divides $a$, and we can apply a similar argument to deduce that $d$ divides $b$. Thus $d$ is a common divisor of $a$ and $b$. Moreover, if $c$ is any common divisor of $a$ and $b$, then $c$ divides both $as_0$ and $bt_0$, so $c$ divides $d$. Thus we conclude that $d = \gcd(a, b)$. □
Corollary 2.5. The integers $a$ and $b$ are relatively prime if and only if there exist integers $s$ and $t$ such that $as + bt = 1$.
Proof. If $\gcd(a, b) = 1$, then it follows from Theorem 2.4 that $as + bt = 1$ for some integers $s$ and $t$. Conversely, suppose that 1 can be expressed as a linear combination of $a$ and $b$. Since Theorem 2.4 ensures that $\gcd(a, b)$ is the smallest positive integer with this property, we may conclude that $\gcd(a, b) = 1$. □
For example, we have $9 \cdot (-7) + 16 \cdot 4 = 1$, which shows that $\gcd(9, 16) = 1$. An efficient algorithm for computing $\gcd(a, b)$ is based on the following simple result.
Lemma 2.6. If $a = bq + r$, then $\gcd(a, b) = \gcd(b, r)$.
Proof. If $d$ divides both $a$ and $b$, then $d$ clearly divides $r = a - bq$, so $d$ is a common divisor of $b$ and $r$. Conversely, if $d$ divides both $b$ and $r$, then $d$ clearly divides $a = bq + r$, so $d$ is a common divisor of $a$ and $b$. Therefore the set of common divisors of $a$ and $b$ is identical to the set of common divisors of $b$ and $r$, so the greatest common divisors must be equal. □
The Euclidean Algorithm. We can compute the greatest common divisor very efficiently by successively applying Theorem 2.3 and Lemma 2.6. The gcd is the last non-zero remainder in this process. That is, to compute $\gcd(a, b)$, we write
$$\begin{aligned}
a &= bq_1 + r_1 & (0 < r_1 < b) \\
b &= r_1 q_2 + r_2 & (0 < r_2 < r_1) \\
r_1 &= r_2 q_3 + r_3 & (0 < r_3 < r_2) \\
&\ \ \vdots \\
r_{n-2} &= r_{n-1} q_n + r_n & (0 < r_n < r_{n-1}) \\
r_{n-1} &= r_n q_{n+1},
\end{aligned}$$
so that $\gcd(a, b) = r_n$.
Example 2.7. Use the Euclidean algorithm to compute $d = \gcd(630, 132)$, and find integers $s$ and $t$ such that $d = 630s + 132t$.
Solution. We have
$$\begin{aligned}
630 &= 132 \cdot 4 + 102 \\
132 &= 102 \cdot 1 + 30 \\
102 &= 30 \cdot 3 + 12 \\
30 &= 12 \cdot 2 + 6 \\
12 &= 6 \cdot 2,
\end{aligned}$$
so the algorithm terminates with $n = 4$, and we have $\gcd(630, 132) = r_4 = 6$. We can now work backwards through these equations to find the required integers $s$ and $t$. We have
$$\begin{aligned}
6 &= 30 - 12 \cdot 2 \\
&= 30 - (102 - 30 \cdot 3) \cdot 2 \\
&= 30 \cdot 7 - 102 \cdot 2 \\
&= (132 - 102) \cdot 7 - 102 \cdot 2 \\
&= 132 \cdot 7 - 102 \cdot 9 \\
&= 132 \cdot 7 - (630 - 132 \cdot 4) \cdot 9 \\
&= 132 \cdot 43 - 630 \cdot 9,
\end{aligned}$$
so we can take $s = -9$ and $t = 43$. □
There is another way to organize the computations in the Euclidean algorithm that produces $\gcd(a, b)$ and the integers $s$ and $t$ simultaneously. The idea is to set up an augmented matrix consisting of a $2 \times 2$ identity matrix, followed by $a$ and $b$ in the third column. One then subtracts a multiple of one row from the other until one entry in the third column divides the other. The multiples we use are exactly the quotients $q_1, q_2, \ldots, q_n$. Thus Example 2.7 could be handled as follows:
$$\left[\begin{array}{rr|r} 1 & 0 & 630 \\ 0 & 1 & 132 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -4 & 102 \\ 0 & 1 & 132 \end{array}\right]
\to \left[\begin{array}{rr|r} 1 & -4 & 102 \\ -1 & 5 & 30 \end{array}\right]
\to \left[\begin{array}{rr|r} 4 & -19 & 12 \\ -1 & 5 & 30 \end{array}\right]
\to \left[\begin{array}{rr|r} 4 & -19 & 12 \\ -9 & 43 & 6 \end{array}\right].$$
Every row $[x \ \ y \mid z]$ of every matrix in this computation has the property that $630x + 132y = z$, because this is satisfied by the initial matrix and is preserved by the row operations. Therefore, the required integers $s$ and $t$ appear to the left of $\gcd(a, b)$ in the final matrix.
In the worst case, the Euclidean algorithm takes on the order of $\log N$ steps to compute $\gcd(a, b)$, where $N = \max(|a|, |b|)$. The function $\log N$ grows very slowly as $N \to \infty$, so the algorithm runs very quickly on a computer.
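The row-operation bookkeeping described above translates almost line for line into code. The sketch below assumes positive inputs and uses the hypothetical name `extended_gcd`; each row `(x, y, z)` satisfies `a*x + b*y == z`, exactly as in the augmented matrix, and the loop runs one step past the matrix display until the third entry of the second row is zero.

```python
def extended_gcd(a: int, b: int):
    """Return (d, s, t) with d = gcd(a, b) and d == a*s + b*t, for positive a and b."""
    row0 = (1, 0, a)  # invariant: a*x + b*y == z for every row (x, y, z)
    row1 = (0, 1, b)
    while row1[2] != 0:
        q = row0[2] // row1[2]  # the quotient from division with remainder
        # Subtract q times the second row from the first, then swap roles.
        row0, row1 = row1, tuple(u - q * v for u, v in zip(row0, row1))
    s, t, d = row0
    return d, s, t


print(extended_gcd(630, 132))
print(extended_gcd(9, 16))
```

For the inputs 630 and 132 this reproduces the coefficients found in Example 2.7, and for 9 and 16 it recovers the combination $9 \cdot (-7) + 16 \cdot 4 = 1$ quoted earlier.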
Primes. Recall that an integer $p > 1$ is said to be prime if its only positive factors are 1 and $p$. One can generate all the primes up to $N$ using the Sieve of Eratosthenes to successively strike out all the proper multiples of 2, 3, 5, etc. If an integer greater than 1 and no larger than $N$ is not prime, then it has a prime divisor no larger than $\sqrt{N}$, so one can terminate this process at $\sqrt{N}$. The integers greater than 1 that remain uncrossed are the primes up to $N$.
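A minimal Python sketch of the sieve follows. The function name `primes_up_to` is illustrative, and starting each round of striking at $p^2$ is a common refinement that is assumed here rather than taken from the notes: smaller multiples of $p$ have already been struck by smaller primes.

```python
def primes_up_to(n: int):
    """Sieve of Eratosthenes: return the list of primes p with p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:  # stop once p exceeds sqrt(n)
        if is_prime[p]:
            # Strike out proper multiples of p, beginning at p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [k for k, flag in enumerate(is_prime) if flag]


print(primes_up_to(47))
```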
Lemma 2.8. (Euclid's Lemma) Let $a$ and $b$ be integers, and let $p$ be a prime. If $p \mid ab$, then $p \mid a$ or $p \mid b$.
Proof. Suppose that $p$ divides $ab$ but that $p$ does not divide $a$. Since $p$ is prime, we must have $\gcd(a, p) = 1$, so by Theorem 2.4 there exist integers $s$ and $t$ such that $as + pt = 1$. Multiplying through by $b$, we obtain
$$abs + pbt = b.$$
Since $p \mid ab$ and $p \mid p$, we deduce from part (b) of Lemma 2.1 that $p \mid b$. □
Note that Lemma 2.8 fails if $p$ is not prime. For example, $6 \mid 12 = 3 \cdot 4$, but 6 does not divide 3 or 4. One can easily show by induction that Lemma 2.8 can be extended to products of more than two integers. That is, if $p$ is a prime dividing the product $a_1 \cdots a_k$, then $p$ must divide at least one of the $a_i$.
As a simple application of Euclid's Lemma, we perform the following entertaining exercise.
Example 2.9. Prove that $\sqrt{2}$ is irrational.
Solution. We proceed by contradiction. If $\sqrt{2}$ were rational, then we could write $\sqrt{2} = a/b$ for some positive integers $a$ and $b$ with $(a, b) = 1$. After squaring both sides and clearing denominators, we find that $2b^2 = a^2$, and hence in particular that $2 \mid a^2$. Since 2 is prime, it now follows from Euclid's Lemma that $2 \mid a$, so we can write $a = 2c$ for some integer $c$. Substituting this into our previous equation yields $2b^2 = 4c^2$, or $b^2 = 2c^2$. Thus $2 \mid b^2$ and hence by Euclid's Lemma we have $2 \mid b$. We have now deduced that both $a$ and $b$ are divisible by 2, contradicting our original assumption that $(a, b) = 1$. This contradiction forces us to conclude that $\sqrt{2}$ is in fact irrational. □
Note that there is little difficulty in generalizing the argument to handle $\sqrt{p}$, where $p$ is any prime. In fact it is not hard to see that $\sqrt{n}$ is irrational if and only if $n$ fails to be a perfect square, but this requires information about factoring composite integers. The following result is the most important application of Euclid's Lemma and, as its name suggests, is fundamental to our study of number theory.
Theorem 2.10. (Fundamental Theorem of Arithmetic) Every integer $n > 1$ can be written as a product of prime factors, and this factorization is unique up to the order of the factors.
Proof. The existence of factorizations follows easily by induction on the size of the integer $n$. For the base case, it suffices to note that $n = 2$ is prime. Now suppose that $n > 2$ and that every integer $k$ with $2 \le k \le n - 1$ has a factorization into primes. If $n$ is prime, then we are done. Otherwise, we may write $n = ab$ where $2 \le a, b \le n - 1$, and the induction hypothesis shows that $a$ and $b$ both have factorizations, which combine to produce a factorization of $n$.
To prove uniqueness, we induct on the number of factors. Suppose that
$$n = p_1 \cdots p_r = q_1 \cdots q_s,$$
where the $p_i$ and $q_j$ are primes, and we may assume without loss of generality that $r \le s$. If $r = 1$, then clearly $s = 1$, so $p_1 = q_1$. Now let $r > 1$, and suppose that unique factorization holds for all integers with fewer than $r$ prime factors. Since $p_1 \mid q_1 \cdots q_s$, we have $p_1 \mid q_j$ (and hence $p_1 = q_j$) for some $j$ by an easy extension of Euclid's Lemma. By relabeling, we may suppose that $j = 1$, and hence we may divide through by $p_1$ to get
$$p_2 \cdots p_r = q_2 \cdots q_s.$$
The induction hypothesis now implies that $r = s$ and that $q_2, \ldots, q_s$ is a permutation of $p_2, \ldots, p_r$, and the uniqueness follows. □
In rings where unique factorization fails, like $\mathbb{Z}[\sqrt{-5}]$, the problem is that the notions of "irreducible" and "prime" do not correspond. The property in Lemma 2.8 is used as the definition of prime, but there are irreducible elements that don't satisfy this property. For example, 2 is irreducible in $\mathbb{Z}[\sqrt{-5}]$, but it is not prime in this ring because 2 divides $6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$, but 2 does not divide $1 + \sqrt{-5}$ or $1 - \sqrt{-5}$.
Theorem 2.11. There are infinitely many primes.
Proof. Assume to the contrary that there are only finitely many primes, say $p_1, p_2, \ldots, p_k$, and let
$$N = p_1 p_2 \cdots p_k + 1.$$
We know from Theorem 2.10 that $N$ has at least one prime factor, say $p$. We cannot have $p = p_i$ for some $i$ because this would imply that $p$ divides $1 = N - p_1 p_2 \cdots p_k$. This is a contradiction, so we conclude that there must be infinitely many primes. □
This theorem was first proved by Euclid, and we've given his original proof. Many other proofs have been discovered since Euclid's time. A more general theorem of Dirichlet states that there are infinitely many primes of the form $p = ak + b$ whenever $a$ and $b$ are relatively prime. For example, there are infinitely many primes of the form $p = 4k + 1$ and also of the form $p = 4k + 3$. A weak version of the prime number theorem states that if $\pi(x)$ denotes the number of primes up to $x$, then $\pi(x) \sim x/\log x$ asymptotically, in the sense that
$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1.$$
One could interpret this by saying that the probability that the integer $x$ is prime is roughly $1/\log x$. Throughout these notes $\log x$ denotes the natural (base $e$) logarithm.
Theorem 2.12. There are arbitrarily large gaps between consecutive primes.
Proof. Given an integer $n > 1$, we'll construct a list of $n$ consecutive composite numbers. If we let $N = (n + 1)! + 2$, then the $n$ numbers
$$N,\ N + 1,\ N + 2,\ \ldots,\ N + n - 1$$
are all composite, since $j + 2$ divides $N + j = (n + 1)! + (j + 2)$ for $j = 0, 1, 2, \ldots, n - 1$. □
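The construction in the proof is easy to run for small n. In the sketch below, the names `composite_run` and `witness_divisors` are illustrative; the second function lists the divisor $j + 2$ that the proof attaches to each term $N + j$.

```python
from math import factorial


def composite_run(n: int):
    """Return the n consecutive integers N, N+1, ..., N+n-1 with N = (n+1)! + 2."""
    start = factorial(n + 1) + 2
    return [start + j for j in range(n)]


def witness_divisors(n: int):
    """Pair each term N + j with the proper divisor j + 2 used in the proof."""
    return [(value, j + 2) for j, value in enumerate(composite_run(n))]


print(witness_divisors(4))
```

For n = 4 this gives 122, 123, 124, 125, divisible by 2, 3, 4, 5 respectively. The runs produced this way are far from the first gaps of a given length, since $(n + 1)!$ grows very quickly.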
At the other extreme, the Twin Primes Conjecture states that there are infinitely many pairs of primes whose difference is 2, for instance
(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), ....
Those familiar with analysis may wish to observe that if $p_n$ denotes the $n$th prime then Theorem 2.12 is equivalent to the statement that $\limsup (p_{n+1} - p_n) = \infty$, while the Twin Primes Conjecture asserts that $\liminf (p_{n+1} - p_n) = 2$. In spite of some recent breakthroughs in this area, we do not even know for sure that $\liminf (p_{n+1} - p_n) < \infty$. This indicates that we're not very close to a proof of the Twin Primes Conjecture!
Perfect numbers and Mersenne primes. A positive integer is said to be perfect if it is the sum of its proper positive divisors (that is, not including the number itself). For example,
$$6 = 1 + 2 + 3 \quad\text{and}\quad 28 = 1 + 2 + 4 + 7 + 14$$
are perfect. The first few perfect numbers are 6, 28, 496, 8128, 33550336. It is believed that there are infinitely many perfect numbers, but this is not known. Another open problem is to determine whether there are any odd perfect numbers (it's believed that the answer is no).
Theorem 2.13. A positive even integer $n$ is perfect if and only if we can write $n = 2^{p-1}(2^p - 1)$, where $2^p - 1$ is prime.
Proof. First suppose that $q = 2^p - 1$ is prime. We need to show that $n = 2^{p-1} q$ is perfect. The proper positive divisors of $n$ are
$$1,\ 2,\ 4,\ 8,\ \ldots,\ 2^{p-1},\ q,\ 2q,\ 4q,\ 8q,\ \ldots,\ 2^{p-2} q,$$
so their sum is
$$2^p - 1 + q(2^{p-1} - 1) = q + (2^{p-1} - 1)q = 2^{p-1} q = n.$$
This shows that $n$ is perfect.
Conversely, suppose that $n$ is an even perfect number. We need to show that there is an integer $p$ such that $n = 2^{p-1}(2^p - 1)$ and $2^p - 1$ is prime. Since $n$ is even, we can write $n = 2^k t$, where $k \ge 1$ and $t$ is odd. Let $\sigma$ denote the sum of all the positive divisors of $t$ (i.e., the sum of the odd positive divisors of $n$). Since $n$ is perfect, we know that the sum of all the positive divisors of $n$ is equal to $2n$, so we have
$$2n = \sigma + 2\sigma + 4\sigma + 8\sigma + \cdots + 2^k \sigma = (2^{k+1} - 1)\sigma,$$
and thus
$$\sigma = \frac{2n}{2^{k+1} - 1} = \frac{2^{k+1} t}{2^{k+1} - 1} = \frac{(2^{k+1} - 1)t + t}{2^{k+1} - 1} = t + \frac{t}{2^{k+1} - 1}.$$
Since $\sigma$ and $t$ are integers, we see that $u = t/(2^{k+1} - 1)$ is an integer, and $u < t$ since $k \ge 1$. Thus $u$ and $t$ are two distinct divisors of $t$. Since $\sigma = t + u$ is the sum of all the positive divisors of $t$, it follows that they are the only positive divisors of $t$, whence $t$ is prime and $u = 1$. Thus we have $t = 2^{k+1} - 1$, so on setting $p = k + 1$ we get
$$n = 2^{p-1} t = 2^{p-1}(2^p - 1),$$
where $2^p - 1$ is prime. □
Primes of the form $2^p - 1$ are called Mersenne primes. As a result of Theorem 2.13, finding even perfect numbers is equivalent to finding Mersenne primes. Notice that $6 = 2^1 \cdot (2^2 - 1)$, $28 = 2^2(2^3 - 1)$, $496 = 2^4(2^5 - 1)$, $8128 = 2^6(2^7 - 1)$, and $33550336 = 2^{12}(2^{13} - 1)$. The following theorem restricts the possibilities somewhat.
Theorem 2.14. If $2^n - 1$ is prime, then $n$ is prime.
Proof. We prove the contrapositive. Suppose that $n$ is composite. Then we can write $n = ab$ for some integers $a$ and $b$ with $1 < a, b < n$. Then we have
$$2^n - 1 = 2^{ab} - 1 = (2^a)^b - 1 = (2^a - 1)(1 + 2^a + 2^{2a} + \cdots + 2^{(b-1)a}).$$
Here we have used the factorization
$$x^b - 1 = (x - 1)(1 + x + x^2 + \cdots + x^{b-1})$$
with $x = 2^a$. Since $1 < a < n$, we have $1 < 2^a - 1 < 2^n - 1$, and hence we conclude that $2^n - 1$ is composite. □
The converse of Theorem 2.14 is false. That is, there exist primes $p$ for which $2^p - 1$ is not prime. The smallest example is $2^{11} - 1 = 2047 = 23 \cdot 89$. As of Fall 2008, there are 46 known Mersenne primes, the largest of which is $2^{43,112,609} - 1$. This was discovered in August 2008 and has 12,978,189 digits. The largest known perfect number is therefore $2^{43,112,608}(2^{43,112,609} - 1)$. This world-record prime was actually the 45th Mersenne prime to be discovered. The 46th one was found about two weeks later but has only 11,185,272 digits. To join the Great Internet Mersenne Prime Search (GIMPS), go to http://www.mersenne.org.
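Theorems 2.13 and 2.14 can be checked for small exponents with plain trial division. This is only an illustrative sketch with hypothetical helper names; it is nowhere near the specialized tests needed for record-sized Mersenne numbers.

```python
def is_prime(m: int) -> bool:
    """Trial division; adequate only for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(n: int) -> int:
    """Sum of the positive divisors of n other than n itself."""
    if n < 2:
        return 0
    total = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def even_perfect_numbers(max_exponent: int):
    """Numbers 2^(p-1) * (2^p - 1) for exponents p <= max_exponent with 2^p - 1 prime."""
    return [
        2 ** (p - 1) * (2 ** p - 1)
        for p in range(2, max_exponent + 1)
        if is_prime(2 ** p - 1)
    ]


print(even_perfect_numbers(13))
print(is_prime(11), is_prime(2 ** 11 - 1))  # 11 is prime, 2047 = 23 * 89 is not
```

Every exponent that survives the filter is itself prime, as Theorem 2.14 predicts, while the exponent 11 shows that the converse fails.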
3. Congruences
Let $n$ be a positive integer, and let $a$ and $b$ be arbitrary integers. We say that $a$ and $b$ are congruent modulo $n$ if $n$ divides $a - b$. In this case, we write
$$a \equiv b \pmod{n}.$$
For example, we have $37 \equiv 2 \pmod{5}$, $37 \equiv -3 \pmod{5}$, and $24 \equiv 0 \pmod{6}$. Notice that $a \equiv 0 \pmod{n}$ if and only if $n \mid a$ and that $a \equiv b \pmod{n}$ if and only if we can write $a = b + kn$ for some integer $k$.
Lemma 3.1. If $a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$, then $a + c \equiv b + d \pmod{n}$ and $ac \equiv bd \pmod{n}$.
Proof. Suppose that $a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$. Then there exist integers $k$ and $l$ such that $a = b + kn$ and $c = d + ln$. We then have
$$a + c = b + d + (k + l)n \quad\text{and}\quad ac = bd + (bl + dk + kln)n,$$
which shows that $a + c \equiv b + d \pmod{n}$ and $ac \equiv bd \pmod{n}$. □
This lemma allows us to manipulate congruences algebraically as we do with equations.
Example 3.2. For what integers $x$ does the congruence $4x + 1 \equiv 3 \pmod{7}$ hold?
Solution. Subtracting 1 from both sides shows that the congruence is equivalent to $4x \equiv 2 \pmod{7}$. Multiplying both sides by 2 now gives $8x \equiv 4 \pmod{7}$, which is the same as $x \equiv 4 \pmod{7}$, since $8 \equiv 1 \pmod{7}$. Hence the congruence is satisfied by all integers $x$ of the form $x = 4 + 7k$, where $k$ is an integer. □
Lemma 3.3. (Cancellation) If $ab \equiv ac \pmod{n}$ and $(a, n) = 1$, then $b \equiv c \pmod{n}$.
Proof. Suppose that $ab \equiv ac \pmod{n}$ and $(a, n) = 1$. Then $n$ divides $ab - ac = a(b - c)$. Since $(a, n) = 1$, it follows by imitating the proof of Euclid's Lemma that $n$ divides $b - c$ (exercise). Thus we have $b \equiv c \pmod{n}$. □
Note that Lemma 3.3 may fail without the assumption that $(a, n) = 1$. For instance, we have $2 \cdot 5 \equiv 2 \cdot 14 \pmod{6}$, but $5 \not\equiv 14 \pmod{6}$.
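Because there are only finitely many residues modulo n, a linear congruence can always be solved by trying each residue in turn. The sketch below uses a hypothetical helper `solve_linear_congruence`; it recovers the answer to Example 3.2 and shows what goes wrong when the multiplier shares a factor with the modulus.

```python
def solve_linear_congruence(a: int, b: int, c: int, n: int):
    """Residues x in {0, ..., n-1} satisfying a*x + b = c (mod n)."""
    return [x for x in range(n) if (a * x + b - c) % n == 0]


# Example 3.2: 4x + 1 = 3 (mod 7) has a single solution class.
example_3_2 = solve_linear_congruence(4, 1, 3, 7)

# 2x = 2*14 (mod 6): since gcd(2, 6) = 2, two residue classes satisfy it,
# which is why 2*5 = 2*14 (mod 6) does not force 5 = 14 (mod 6).
no_cancellation = solve_linear_congruence(2, 0, 28, 6)

print(example_3_2, no_cancellation)
```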
Example 3.4. For what values of $x$ does the congruence $4x + 1 \equiv 5 \pmod{7}$ hold?
Solution. Here the congruence is equivalent to $4x \equiv 4 \pmod{7}$, and since $(4, 7) = 1$ we may apply Lemma 3.3 to conclude that $x \equiv 1 \pmod{7}$. Hence the congruence holds for all integers $x$ of the form $x = 1 + 7k$, where $k$ is an integer. □
Residue Classes. It is easy to see that congruence modulo $n$ defines an equivalence
relation on the set of integers and therefore partitions the integers into equivalence classes.
Our solutions to Examples 3.2 and 3.4 indicate how these are defined. In Example 3.4, for
instance, the solution was the set of all integers congruent to 1 modulo 7, that is, all integers
$x$ that can be expressed in the form $x = 1 + 7k$ for some integer $k$. We call this set the
residue class of 1 modulo 7. It is sometimes denoted by $[1]$ or $[1]_7$. Thus
$$[1]_7 = \{\dots, -20, -13, -6, 1, 8, 15, 22, \dots\}.$$
Similarly, the solution of Example 3.2 is the set of all integers in the residue class
$$[4]_7 = \{\dots, -17, -10, -3, 4, 11, 18, \dots\}.$$
In general, we let $[a]$ or $[a]_n$ denote the residue class of $a$ modulo $n$, which is defined to be
the set of all integers of the form $a + kn$, where $k \in \mathbb{Z}$.
It is often convenient to view each residue class as a single element in a number system.
Therefore, we let $\mathbb{Z}_n$ denote the set of residue classes modulo $n$. Technically, we have
$$\mathbb{Z}_n = \{[0]_n, [1]_n, [2]_n, \dots, [n-1]_n\},$$
but Lemma 3.1 allows us to work with any set of representatives, such as $\{0, 1, 2, \dots, n-1\}$,
when doing computations. Thus we often dispense with the brackets and just think of $\mathbb{Z}_n$
as the set $\{0, 1, 2, \dots, n-1\}$ under mod $n$ arithmetic. With this viewpoint, we could say
that the congruence in Example 3.4 has the unique solution $x = 1$ in $\mathbb{Z}_7$. Addition and
multiplication in $\mathbb{Z}_7$ can be represented by the following tables:
 + | 0  1  2  3  4  5  6
---+---------------------
 0 | 0  1  2  3  4  5  6
 1 | 1  2  3  4  5  6  0
 2 | 2  3  4  5  6  0  1
 3 | 3  4  5  6  0  1  2
 4 | 4  5  6  0  1  2  3
 5 | 5  6  0  1  2  3  4
 6 | 6  0  1  2  3  4  5

 × | 0  1  2  3  4  5  6
---+---------------------
 0 | 0  0  0  0  0  0  0
 1 | 0  1  2  3  4  5  6
 2 | 0  2  4  6  1  3  5
 3 | 0  3  6  2  5  1  4
 4 | 0  4  1  5  2  6  3
 5 | 0  5  3  1  6  4  2
 6 | 0  6  5  4  3  2  1
A set such as $\{0, 1, 2, \dots, n-1\}$ that contains exactly one representative of each equivalence
class is called a complete residue system modulo $n$. Complete residue systems are not unique;
for instance $\{0, 1, 2, 3, 4, 5, 6\}$ and $\{-3, -2, -1, 0, 1, 2, 3\}$ are equally valid complete residue
systems modulo 7, and either one could be used to represent $\mathbb{Z}_7$.
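The tables for arithmetic in the integers modulo 7 can be regenerated mechanically. The following is a minimal sketch; the function name `operation_tables` is a hypothetical helper, not something from these notes.

```python
def operation_tables(n):
    """Return (addition, multiplication) tables for the integers mod n as lists of rows."""
    residues = range(n)
    add = [[(a + b) % n for b in residues] for a in residues]
    mul = [[(a * b) % n for b in residues] for a in residues]
    return add, mul


add7, mul7 = operation_tables(7)
for row in mul7:
    # Each printed row lists a*0, a*1, ..., a*6 reduced modulo 7.
    print(" ".join(str(entry) for entry in row))

# Two complete residue systems modulo 7 reduce to the same set of classes.
standard = sorted(r % 7 for r in range(0, 7))
balanced = sorted(r % 7 for r in range(-3, 4))
```

Python's `%` operator returns a non-negative result for a positive modulus, so the representatives -3, ..., 3 reduce to 0, ..., 6 as expected.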
Solving Linear Congruences. We want to develop a systematic procedure for finding
the solutions of a congruence of the shape $ax \equiv b \pmod{n}$. The following lemma is an
important starting point.
Lemma 3.5. (Multiplicative Inverses) If $(a, n) = 1$, then there is an integer $b$ such that
$ab \equiv 1 \pmod{n}$. Moreover, the residue class of $b$ modulo $n$ is unique.
Proof. Since $(a, n) = 1$, we know from Corollary 2.5 that there exist integers $s$ and $t$ with
$as + nt = 1$. We then have $as = 1 - nt$, which shows that $as \equiv 1 \pmod{n}$, so we can take
$b = s$. Now suppose that $b'$ is any other integer with $b'a \equiv 1 \pmod{n}$. Then
$$b' \equiv b'(ab) \equiv (b'a)b \equiv b \pmod{n},$$
and the uniqueness claim follows. □
If $ab \equiv 1 \pmod{n}$, then we say that $b$ is the inverse of $a$ modulo $n$, and we sometimes
write $b = a^{-1}$ or $b = a^{-1} \bmod n$. Lemma 3.5 shows that when $(a, n) = 1$, the congruence
$ax \equiv c \pmod{n}$ has a unique solution in $\mathbb{Z}_n$, given by $x = a^{-1}c$.
In view of Corollary 2.5, it is easy to see that Lemma 3.5 can be strengthened to an "if and
only if" statement. That is, $a$ has a multiplicative inverse modulo $n$ if and only if $(a, n) = 1$.
In order to find $a^{-1}$ when $(a, n) = 1$, we apply the Euclidean algorithm to find integers $s$
and $t$ with
$$as + nt = 1.$$
We then have $as \equiv 1 \pmod{n}$, so $a^{-1} \equiv s \pmod{n}$. For small values of $n$, we can often find
inverses by inspection without resorting to the Euclidean algorithm.
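The procedure just described can be sketched in code. This is one common iterative form of the extended Euclidean algorithm, not necessarily the layout used in these notes; the names `extended_gcd` and `mod_inverse` are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g, for a, b >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants a*old_s + b*old_t == old_r
        # and a*s + b*t == r.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def mod_inverse(a, n):
    """Return the inverse of a modulo n as a representative in {0, ..., n-1}."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("no inverse exists because gcd(a, n) != 1")
    return s % n


print(mod_inverse(4, 9))    # inverse of 4 modulo 9
print(mod_inverse(27, 64))  # inverse of 27 modulo 64
```

The final reduction `s % n` is needed because the coefficient produced by the algorithm may be negative.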
Example 3.6. Solve the congruence $4x \equiv 3 \pmod{9}$.
Solution. Since $(4, 9) = 1$ we know that 4 has a multiplicative inverse modulo 9, and we find
by inspection that $4^{-1} = 7$ in $\mathbb{Z}_9$ since $4 \cdot 7 = 28 \equiv 1 \pmod{9}$. Multiplying through by 7
now gives $x \equiv 21 \equiv 3 \pmod{9}$, and hence $x = 3$ is the unique solution in $\mathbb{Z}_9$. □
Example 3.7. Solve the congruence $91x \equiv 5 \pmod{64}$.
Solution. We can start by observing that $91 \equiv 27 \pmod{64}$, so the congruence is equivalent
to $27x \equiv 5 \pmod{64}$. Since $(27, 64) = 1$, we can again find a unique solution modulo 64 by
multiplying through by $27^{-1}$, but finding the inverse by inspection is not quite as easy as it was
in Example 3.6. Thus we apply the Euclidean algorithm:
$$\begin{bmatrix} 1 & 0 & 64 \\ 0 & 1 & 27 \end{bmatrix}
\to \begin{bmatrix} 1 & -2 & 10 \\ 0 & 1 & 27 \end{bmatrix}
\to \begin{bmatrix} 1 & -2 & 10 \\ -2 & 5 & 7 \end{bmatrix}
\to \begin{bmatrix} 3 & -7 & 3 \\ -2 & 5 & 7 \end{bmatrix}
\to \begin{bmatrix} 3 & -7 & 3 \\ -8 & 19 & 1 \end{bmatrix}.$$
This shows that $64 \cdot (-8) + 27 \cdot 19 = 1$ and hence that $27 \cdot 19 \equiv 1 \pmod{64}$. Hence we have
$27^{-1} = 19$ in $\mathbb{Z}_{64}$. Thus $x \equiv 5 \cdot 19 \equiv 31$ is the unique solution modulo 64. □
What, if anything, can we say about the solutions to the congruence $ax \equiv b \pmod{n}$
when $(a, n) > 1$? The following theorem provides the answer.
Theorem 3.8. Write $d = (a, n)$. The congruence $ax \equiv b \pmod{n}$ has a solution if and
only if $d$ divides $b$. In this case, there are exactly $d$ solutions modulo $n$, spaced $n/d$ apart.
Proof. If $x$ is a solution to the congruence, then we have $ax = b + kn$ for some integer $k$,
and thus $b = ax - kn$. Since $d \mid a$ and $d \mid n$, we must have $d \mid b$ by Lemma 2.1. Therefore the
congruence has no solution if $d$ does not divide $b$.
Now suppose that $d \mid b$. Then since $ax - b = kn$ if and only if $\frac{a}{d}x - \frac{b}{d} = k\frac{n}{d}$, we see that the
congruence is equivalent to
$$\frac{a}{d}\,x \equiv \frac{b}{d} \pmod{\frac{n}{d}}.$$
Since $(a/d, n/d) = 1$, Lemma 3.5 shows that there is a unique solution $x_0$ modulo $n/d$ and
hence $d$ distinct solutions modulo $n$, given by $x = x_0 + k(n/d)$ for $0 \le k \le d - 1$. □
Example 3.9. Describe the solutions of the congruence $6x \equiv 5 \pmod{9}$.
Solution. We have $(6, 9) = 3$, which fails to divide 5, so Theorem 3.8 tells us that there is
no solution. □
Example 3.10. Describe the solutions of the congruence $24x \equiv 9 \pmod{33}$.
Solution. We have $(24, 33) = 3$, which divides 9, so the proof of Theorem 3.8 shows that the
congruence is equivalent to $8x \equiv 3 \pmod{11}$. Since $8^{-1} = 7$ in $\mathbb{Z}_{11}$, we find that $x = 10$
is the unique solution modulo 11. It follows that there are exactly 3 solutions modulo 33,
represented by the residue classes $x = 10$, $x = 21$, and $x = 32$. □
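The proof of Theorem 3.8 is constructive, so it translates directly into a short routine. The sketch below is illustrative only; `solve_linear_congruence` is a hypothetical name, and the built-in three-argument `pow` with exponent -1 (Python 3.8 or later) is assumed for the modular inverse.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all x in {0, ..., n-1} with a*x congruent to b modulo n."""
    d = gcd(a, n)
    if b % d != 0:
        return []  # no solutions when gcd(a, n) does not divide b
    # Divide through by d to get an equivalent congruence with an invertible coefficient.
    a_red, b_red, n_red = a // d, b // d, n // d
    x0 = (pow(a_red, -1, n_red) * b_red) % n_red
    # The d solutions modulo n are spaced n/d apart.
    return [x0 + k * n_red for k in range(d)]


print(solve_linear_congruence(24, 9, 33))
print(solve_linear_congruence(6, 5, 9))
```

The two calls reproduce Examples 3.10 and 3.9 respectively.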
Applications to check digit schemes. Congruences can be used to construct a method
for reducing errors in data entry. Suppose we have a list of 9-digit identification numbers
of the form $x_1 x_2 \dots x_9$ to enter into a computer. We can add a 10th digit $x_{10}$ satisfying the
congruence
$$x_{10} \equiv x_1 + \cdots + x_9 \pmod{10};$$
that is, $x_{10}$ is the sum of the previous 9 digits modulo 10. We can now enter our ID numbers
in the form $x_1 x_2 \dots x_{10}$ and program our computer to reject our entry if the above congruence
is not satisfied. For example, the number 129-28-5468 would be entered as 129-28-5468-5.
The number $x_{10}$ (in this case 5) is called a check digit. This scheme will catch any errors
in which only a single digit is mistyped; for instance, the erroneous entry 126-28-5468-5 for
the ID number above would be rejected. Many other errors will be caught as well, and this
scheme can be applied to data strings of any length. One notable disadvantage is that it does
not detect errors in which two digits are interchanged; for example, the entry 129-28-4568-5
would be accepted by our computer as a valid ID even though it may have resulted from
mistyping 54 as 45.
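The simple sum-of-digits scheme can be tried out directly. This is a minimal sketch with hypothetical function names; ID numbers are passed as digit strings with the hyphens removed.

```python
def append_check_digit(digits):
    """Append the check digit: the sum of the given digits modulo 10."""
    return digits + str(sum(int(c) for c in digits) % 10)


def is_valid(entry):
    """Accept an entry when its last digit equals the digit sum of the rest modulo 10."""
    body, check = entry[:-1], entry[-1]
    return sum(int(c) for c in body) % 10 == int(check)


print(append_check_digit("129285468"))
print(is_valid("1262854685"))  # single mistyped digit
print(is_valid("1292845685"))  # 54 mistyped as 45
```

The last call illustrates the weakness noted above: a sum is unchanged when two digits trade places, so the transposed entry is still accepted.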
In order to detect errors resulting from interchanging digits, one can employ a more sophisticated
scheme. We illustrate by examining the International Standard Book Number
(ISBN) system. These numbers are 10 digits long and come in 4 blocks; for instance, the
ISBN for Niven, Zuckerman, and Montgomery, Introduction to the Theory of Numbers, 5th
edition, is 0-471-62546-9. The first digit indicates the country of publication, the second
block encodes the publisher (Wiley), the third block identifies the title and edition, and the
fourth block is a check digit. If the first nine digits are $x_1, \dots, x_9$, then the check digit $x_{10}$ is
determined by the congruence
$$x_{10} \equiv \sum_{i=1}^{9} i x_i \equiv x_1 + 2x_2 + 3x_3 + \cdots + 9x_9 \pmod{11}.$$
Thus in the above case, we would compute
$$x_{10} \equiv 0 + 2 \cdot 4 + 3 \cdot 7 + 4 \cdot 1 + 5 \cdot 6 + 6 \cdot 2 + 7 \cdot 5 + 8 \cdot 4 + 9 \cdot 6 \equiv 196 \equiv 9 \pmod{11}.$$
We find $x_{10}$ by reducing the above expression modulo 11 to obtain one of the standard
representatives $0, 1, 2, \dots, 9, 10$. (In the event that $x_{10} = 10$, the ISBN uses X instead.)
It turns out that this scheme protects both against mistyping a single digit and against
interchanging two unequal digits, as long as only one of these errors occurs in a given entry.
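The ISBN-10 check digit computation can be sketched as follows. The function names are hypothetical, and the validity test uses the equivalent form in which the weighted sum of all ten digits is divisible by 11 (valid because 10 is congruent to -1 modulo 11).

```python
def isbn10_check_digit(first_nine):
    """Return the check character for a string of nine digits."""
    total = sum(i * int(c) for i, c in enumerate(first_nine, start=1))
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def isbn10_is_valid(isbn):
    """Check a 10-character ISBN; hyphens are ignored, X is allowed only as the last character."""
    chars = isbn.replace("-", "")
    if len(chars) != 10 or not chars[:9].isdigit():
        return False
    if not (chars[9].isdigit() or chars[9] in "Xx"):
        return False
    values = [int(c) for c in chars[:9]]
    values.append(10 if chars[9] in "Xx" else int(chars[9]))
    return sum(i * v for i, v in enumerate(values, start=1)) % 11 == 0


print(isbn10_check_digit("047162546"))
print(isbn10_is_valid("0-471-62546-9"))
print(isbn10_is_valid("0-471-62456-9"))  # 54 mistyped as 45
```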
Theorem 3.11. If $A = x_1 x_2 \dots x_{10}$ is a valid ISBN and $B = x'_1 x'_2 \dots x'_{10}$ is obtained from $A$
by altering exactly one digit or interchanging two unequal digits, then $B$ is not a valid ISBN.
Proof. Note that since $10 \equiv -1 \pmod{11}$ our check digit test for a valid ISBN is equivalent
to the congruence
$$\sum_{i=1}^{10} i x_i \equiv 0 \pmod{11}.$$
Suppose that $B$ is obtained from $A$ by replacing some digit $x_j$ by $x'_j$, where $x_j \ne x'_j$. Then
$$\sum_{i=1}^{10} i x'_i = \Bigl(\sum_{i=1}^{10} i x_i\Bigr) - j x_j + j x'_j \equiv j(x'_j - x_j) \not\equiv 0 \pmod{11}$$
by Euclid's Lemma, since 11 does not divide $j$ or $x_j - x'_j$ (both are non-zero and at most 10
in absolute value).
Suppose instead that $B$ is obtained from $A$ by interchanging the $j$th and $k$th digits, where
$j \ne k$ and $x_j \ne x_k$. Then we can write $x'_j = x_k$ and $x'_k = x_j$, and hence
$$\sum_{i=1}^{10} i x'_i \equiv \Bigl(\sum_{i=1}^{10} i x_i\Bigr) + j x_k + k x_j - j x_j - k x_k \equiv (j - k)(x_k - x_j) \not\equiv 0 \pmod{11}$$
by Euclid's Lemma, since 11 does not divide $j - k$ or $x_k - x_j$. □
Example 3.12. The code number 5-382-14572-2 was obtained from a valid ISBN by interchanging
two adjacent digits. What was the original ISBN?
Solution. Adopting the notation from the proof of Theorem 3.11, we have
$$\sum_{i=1}^{10} i x'_i = 5 + 6 + 24 + 8 + 5 + 24 + 35 + 56 + 18 + 20 = 201 \equiv 3 \pmod{11}.$$
Suppose the adjacent digits $x_j$ and $x_{j+1}$ were interchanged in the original ISBN. Then by
applying the last displayed equation in the proof of Theorem 3.11 with $k = j + 1$, we see
that
$$x'_{j+1} - x'_j = x_j - x_{j+1} \equiv 3 \pmod{11}.$$
In the given code, we have $x'_6 - x'_5 = 3$, and there is no other pair of adjacent digits with
this property, so these must be the ones that were interchanged. It follows that the original
ISBN was 5-382-41572-2. □
In the above example, we were able to use the ISBN scheme not only to detect an error
but also to correct it, assuming we were fairly confident that the error involved transposing
adjacent digits. Of course, if there was more than one adjacent pair $(x'_j, x'_{j+1})$ in the erroneous
code with $x'_{j+1} - x'_j = 3$, then we'd be less successful.
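The correction step can also be done by brute force: undo each possible adjacent transposition and keep the results that pass the ISBN-10 test. The sketch below takes that brute-force route rather than the difference argument used in the solution, and the function names are hypothetical.

```python
def weighted_sum_mod_11(values):
    """Return the sum of i * x_i over positions i = 1, ..., 10, reduced modulo 11."""
    return sum(i * v for i, v in enumerate(values, start=1)) % 11


def adjacent_swap_candidates(code):
    """List every valid ISBN-10 obtainable from `code` by swapping two unequal adjacent digits."""
    values = [10 if c in "Xx" else int(c) for c in code.replace("-", "")]
    candidates = []
    for j in range(len(values) - 1):
        if values[j] == values[j + 1]:
            continue  # swapping equal digits changes nothing
        trial = values[:]
        trial[j], trial[j + 1] = trial[j + 1], trial[j]
        if 10 in trial[:-1]:
            continue  # X may only appear as the check digit
        if weighted_sum_mod_11(trial) == 0:
            candidates.append("".join("X" if v == 10 else str(v) for v in trial))
    return candidates


print(adjacent_swap_candidates("5-382-14572-2"))
```

When the returned list has exactly one entry, as it does for the code in Example 3.12, the correction is unambiguous under the assumption that the error was an adjacent transposition.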
Recently, the above system (known as ISBN-10) has been phased out in favor of a 13-digit
code that is compatible with the UPC/EAN scheme. Here the check digit is determined by
the congruence
$$x_1 + 3x_2 + x_3 + 3x_4 + x_5 + \cdots + 3x_{12} + x_{13} \equiv 0 \pmod{10},$$
and a 12-digit UPC is converted to this form by putting an extra 0 at the beginning. Since
the arithmetic now occurs in $\mathbb{Z}_{10}$, there is no need to allow X as a possible check digit. This
scheme (known as ISBN-13) still detects all single-digit errors but unfortunately no longer
detects all transpositions: interchanging two adjacent digits changes the left-hand side by
$\pm 2$ times their difference, which is divisible by 10 whenever the digits differ by 5.
Many recent books contain both the ISBN-10 and ISBN-13 codes.
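A short sketch of the ISBN-13 test makes the missed transpositions concrete. The function name is hypothetical, and the 13-digit string below is used purely as an illustrative input that satisfies the congruence.

```python
def isbn13_is_valid(code):
    """Check the alternating 1, 3, 1, 3, ... weighted sum of 13 digits modulo 10."""
    digits = [int(c) for c in code.replace("-", "")]
    if len(digits) != 13:
        return False
    # Index 0 is the first digit (weight 1), index 1 the second (weight 3), and so on.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


print(isbn13_is_valid("9780306406157"))  # satisfies the congruence
print(isbn13_is_valid("9780306401657"))  # adjacent 6 and 1 interchanged
print(isbn13_is_valid("9780306406175"))  # adjacent 5 and 7 interchanged
```

The second string is still accepted because the interchanged digits 6 and 1 differ by 5, while the third is rejected.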
Fermat's Little Theorem. In many applications of congruences, it is important to be
able to compute powers of an integer efficiently modulo some number $n$. In the case where
$n$ is a prime, we have the following useful result.
Theorem 3.13. (Fermat's Little Theorem) If $p$ is a prime not dividing $a$, then
$a^{p-1} \equiv 1 \pmod{p}$.
Proof. Suppose that $p$ does not divide $a$, and consider the product
$$N = a \cdot 2a \cdot 3a \cdots (p-1)a = a^{p-1}[1 \cdot 2 \cdot 3 \cdots (p-1)] = a^{p-1}(p-1)!.$$
Suppose that $1 \le i, j \le p - 1$ and that $ia \equiv ja \pmod{p}$. Since $(a, p) = 1$, Lemma 3.3 implies
that $i \equiv j \pmod{p}$, and hence that $i = j$. Therefore, the integers $a, 2a, 3a, \dots, (p-1)a$
lie in $p - 1$ distinct non-zero residue classes modulo $p$, so they represent all the non-zero
residue classes, and hence their product, $N$, must be
congruent modulo $p$ to $1 \cdot 2 \cdot 3 \cdots (p-1) = (p-1)!$. That is, we have
$$a^{p-1}(p-1)! \equiv (p-1)! \pmod{p}.$$
Now since all the prime factors of $(p-1)!$ are smaller than $p$, we find that $p$ and $(p-1)!$ are
relatively prime, and thus Lemma 3.3 implies that $a^{p-1} \equiv 1 \pmod{p}$. □
We can use Fermat's Little Theorem to compute powers modulo a prime very efficiently
by applying division with remainder to the exponent. Usually we are interested in the least
non-negative representative for a particular residue class; this is sometimes called the residue
and denoted by the MOD symbol. For instance, the residue of 8 modulo 5 is 8 MOD 5 = 3.
Example 3.14. Compute $2^{2008}$ MOD 13.
Solution. Since 13 is prime and doesn't divide 2, Theorem 3.13 implies that $2^{12} \equiv 1 \pmod{13}$.
Moreover, division with remainder yields $2008 = 12 \cdot 167 + 4$, so
$$2^{2008} = 2^{12 \cdot 167 + 4} = (2^{12})^{167} \cdot 2^4 \equiv 2^4 \equiv 3 \pmod{13}.$$
Thus we have $2^{2008}$ MOD 13 = 3. □
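The exponent reduction in Example 3.14 can be sketched in a few lines. The function name `power_mod_prime` is hypothetical, and the caller is assumed to supply a modulus that really is prime.

```python
def power_mod_prime(a, exponent, p):
    """Compute a**exponent MOD p for a prime p not dividing a.

    Fermat's Little Theorem lets us replace the exponent by its
    remainder on division by p - 1 before doing any multiplication.
    """
    if a % p == 0:
        raise ValueError("p must not divide a")
    reduced = exponent % (p - 1)
    return (a ** reduced) % p


print(power_mod_prime(2, 2008, 13))
```

Here the exponent 2008 is reduced to 4 before any power is formed, so only $2^4$ is ever computed.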
Fermat's Little Theorem also yields a negative test for primality, which is often faster than
trial division. If $a$ is a positive integer not divisible by $n$ and we can show that $a^{n-1} \not\equiv 1
\pmod{n}$, then we may conclude that $n$ is not prime. However, the converse of this is false.
For example, $2^{340} \equiv 1 \pmod{341}$, and yet $341 = 11 \cdot 31$ is not prime. So this does not give
a way to prove that an integer is prime. We'll return to this topic in the next section.
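The negative test is a one-line computation once fast modular powers are available. In this sketch the name `fermat_test` is hypothetical, and Python's built-in three-argument `pow` does the modular exponentiation.

```python
def fermat_test(n, a=2):
    """Return False if base a proves n composite; True only means 'inconclusive'."""
    return pow(a, n - 1, n) == 1


print(fermat_test(15))       # 15 is exposed as composite by the base 2
print(fermat_test(341))      # 341 = 11 * 31 slips through for the base 2
print(fermat_test(341, 3))   # but the base 3 exposes it
```

A return value of `True` should never be read as a proof of primality, as the second call shows.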
Reduced residues and Euler's Theorem. Recall that $a$ has a multiplicative inverse
modulo $n$ if and only if $(a, n) = 1$. When $n$ is prime, the residues with this property are
just $1, 2, 3, \dots, n - 1$. In general, we write $\phi(n)$ for the number of positive integers less than
or equal to $n$ that are relatively prime to $n$. This is known as Euler's phi function. For
instance, we have $\phi(1) = 1$, $\phi(2) = 1$, $\phi(3) = 2$, $\phi(4) = 2$, $\phi(5) = 4$, $\phi(6) = 2$, $\phi(7) = 6$,
$\phi(8) = 4$, $\phi(9) = 6$, and $\phi(10) = 4$. Notice that $\phi(p) = p - 1$ whenever $p$ is prime.
The property of being relatively prime to $n$ depends only on the residue class of an integer,
since $(a, n) = (a + kn, n)$ for any integer $k$ by Lemma 2.6. Therefore, we can view $\phi(n)$ as the
number of residue classes modulo $n$ that are relatively prime to $n$. Any set of representatives
for these classes is called a reduced residue system modulo $n$. For instance, $\{1, 2, 3, 4\}$ is a
reduced residue system modulo 5, while $\{1, 3, 7, 9\}$ and $\{-3, -1, 1, 3\}$ are reduced residue
systems modulo 10. We often use $\mathbb{Z}_n^*$ to denote a reduced residue system modulo $n$. Those
familiar with abstract algebra may wish to note that $\mathbb{Z}_n^*$ forms a group under multiplication.
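For small moduli, Euler's phi function can be computed straight from the definition by counting. The sketch below uses hypothetical names and is far too slow for large $n$; it is meant only to make the definition concrete.

```python
from math import gcd


def reduced_residues(n):
    """Return the integers r with 1 <= r <= n and gcd(r, n) == 1."""
    return [r for r in range(1, n + 1) if gcd(r, n) == 1]


def phi_by_counting(n):
    """Euler's phi function, computed directly from its definition."""
    return len(reduced_residues(n))


print([phi_by_counting(n) for n in range(1, 11)])
print(reduced_residues(10))
```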
The following result generalizes Fermat's Little Theorem to the case of composite moduli.
Theorem 3.15. (Euler's Theorem) If $a$ and $n$ are positive integers with $(a, n) = 1$, then
$a^{\phi(n)} \equiv 1 \pmod{n}$.
Proof. Let $r_1, \dots, r_{\phi(n)}$ denote the positive integers less than or equal to $n$ that are relatively
prime to $n$, and let $s_i = ar_i$ MOD $n$ be the residue of $ar_i$ modulo $n$. Suppose that
$1 \le i, j \le \phi(n)$ and $s_i = s_j$. Then $ar_i \equiv ar_j \pmod{n}$, which implies that $r_i \equiv r_j \pmod{n}$ since
$(a, n) = 1$. Since $r_1, \dots, r_{\phi(n)}$ are distinct integers between 1 and $n$, we must have $i = j$.
This shows that $s_1, \dots, s_{\phi(n)}$ are distinct. Moreover, it is clear that $(s_i, n) = 1$ for each $i$, so
$\{s_1, \dots, s_{\phi(n)}\}$ is a reduced residue system modulo $n$. In particular, we have
$$r_1 \cdots r_{\phi(n)} \equiv s_1 \cdots s_{\phi(n)} \equiv ar_1 \cdots ar_{\phi(n)} \equiv a^{\phi(n)}\, r_1 \cdots r_{\phi(n)} \pmod{n}.$$
Since $r_1 \cdots r_{\phi(n)}$ is relatively prime to $n$, we conclude that $a^{\phi(n)} \equiv 1 \pmod{n}$, as desired. □
Example 3.16. Compute $5^{999}$ MOD 12.
Solution. We have $\phi(12) = 4$ and $(5, 12) = 1$, so Theorem 3.15 implies that $5^4 \equiv 1 \pmod{12}$.
Since $999 = 4 \cdot 249 + 3$, we have
$$5^{999} = 5^{4 \cdot 249 + 3} = (5^4)^{249} \cdot 5^3 \equiv 5^3 \equiv 5 \pmod{12}.$$
Thus we have $5^{999}$ MOD 12 = 5. □
It turns out that $\phi(n)$ can be computed easily provided that the prime factorization of $n$ is
known. This follows from the following important theorem about simultaneous congruences.
We say that integers $n_1, \dots, n_k$ are pairwise relatively prime if $(n_i, n_j) = 1$ whenever $i \ne j$.
Theorem 3.17. (Chinese Remainder Theorem) Let $n_1, \dots, n_k$ be pairwise relatively
prime positive integers, and let $a_1, \dots, a_k$ be any integers. There exists an integer $x$ satisfying
the system of congruences
$$x \equiv a_1 \pmod{n_1}, \quad x \equiv a_2 \pmod{n_2}, \quad \dots, \quad x \equiv a_k \pmod{n_k},$$
and $x$ is unique modulo $n_1 \cdots n_k$.
Proof. Let $N = n_1 \cdots n_k$, and for each $i$ write $N_i = N/n_i$. Since the $n_i$ are pairwise
relatively prime, we have $(N_i, n_i) = 1$, and thus Theorem 3.8 shows that there is a unique
integer $b_i$ modulo $n_i$ satisfying the congruence
$$N_i b_i \equiv a_i \pmod{n_i}.$$
It is easy to check that the integer
$$x = N_1 b_1 + N_2 b_2 + \cdots + N_k b_k$$
satisfies our system of congruences, since $n_i$ divides $N_j$ whenever $j \ne i$. If $x'$ is another
solution to the system, then we have
$x \equiv x' \pmod{n_i}$ for each $i$, and hence $x - x'$ is divisible by $n_i$. Since the $n_i$ are pairwise
relatively prime, it follows easily that $x - x'$ is divisible by $N$, which establishes uniqueness
modulo $N$. □
Example 3.18. Solve the system of congruences
$$x \equiv 1 \pmod{5}, \quad 2x \equiv 4 \pmod{6}, \quad 3x \equiv 2 \pmod{7}.$$
Solution. We first rewrite the system in a form to which Theorem 3.17 applies. In view of
Theorem 3.8, we see that the system is equivalent to
$$x \equiv 1 \pmod{5}, \quad x \equiv 2 \pmod{3}, \quad x \equiv 3 \pmod{7},$$
and we may now employ the proof of Theorem 3.17 with $n_1 = 5$, $n_2 = 3$, and $n_3 = 7$ to
produce a unique solution modulo $N = 105$. We must find integers $b_1$, $b_2$, and $b_3$ satisfying
the congruences
$$21b_1 \equiv 1 \pmod{5}, \quad 35b_2 \equiv 2 \pmod{3}, \quad 15b_3 \equiv 3 \pmod{7}.$$
We see easily by inspection that $b_1 = 1$, $b_2 = 1$, and $b_3 = 3$ are solutions, and thus
$$x = 21 \cdot 1 + 35 \cdot 1 + 15 \cdot 3 = 101$$
is the unique solution of the original system modulo 105. Hence the solutions are precisely
the integers of the form $x = 101 + 105k$, where $k \in \mathbb{Z}$. □
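The construction in the proof of Theorem 3.17 can be followed step by step in code. This is a minimal sketch with a hypothetical function name `crt`; it assumes the moduli are pairwise relatively prime and relies on `math.prod` and on `pow` with exponent -1, both available in Python 3.8 or later.

```python
from math import prod


def crt(residues, moduli):
    """Return the unique x in {0, ..., N-1} with x congruent to a_i modulo n_i for every i."""
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        N_i = N // n_i
        # b_i solves N_i * b_i congruent to a_i modulo n_i, as in the proof.
        b_i = (pow(N_i, -1, n_i) * a_i) % n_i
        x += N_i * b_i
    return x % N


print(crt([1, 2, 3], [5, 3, 7]))
```

The call uses the reduced system from Example 3.18, after the coefficients of $x$ have been removed with Theorem 3.8.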
The Chinese Remainder Theorem also allows us to deal with systems of congruences in
which the moduli are not pairwise relatively prime. The technique is to convert the system
to an equivalent one in which all the moduli are distinct prime powers.
Example 3.19. Find all solutions of the system
$$x \equiv 1 \pmod{36} \quad \text{and} \quad x \equiv 5 \pmod{56}.$$
Solution. By the Chinese Remainder Theorem, the first congruence is equivalent to the pair
$$x \equiv 1 \pmod{4} \quad \text{and} \quad x \equiv 1 \pmod{9},$$
and the second congruence is equivalent to the pair
$$x \equiv 5 \pmod{8} \quad \text{and} \quad x \equiv 5 \pmod{7}.$$
The congruences modulo powers of 2 must contain either redundant or contradictory information,
so we examine these more carefully. If $x \equiv 5 \pmod{8}$, then we can write
$$x = 8k + 5 = 4(2k + 1) + 1,$$
for some $k \in \mathbb{Z}$, and it follows that $x \equiv 1 \pmod{4}$. Since $x \equiv 5 \pmod{8}$ implies $x \equiv 1
\pmod{4}$, the latter congruence is redundant and may be eliminated from consideration. We
have therefore reduced to the system
$$x \equiv 5 \pmod{8}, \quad x \equiv 1 \pmod{9}, \quad x \equiv 5 \pmod{7},$$
and here the moduli are pairwise relatively prime, so Theorem 3.17 applies. We know that
the unique solution modulo $N = 504$ is given by
$$x = 63b_1 + 56b_2 + 72b_3,$$
where $b_1$, $b_2$, and $b_3$ are integers satisfying
$$63b_1 \equiv 5 \pmod{8}, \quad 56b_2 \equiv 1 \pmod{9}, \quad 72b_3 \equiv 5 \pmod{7},$$
or equivalently,
$$7b_1 \equiv 5 \pmod{8}, \quad 2b_2 \equiv 1 \pmod{9}, \quad 2b_3 \equiv 5 \pmod{7}.$$
We see that $b_1 = 3$, $b_2 = 5$, and $b_3 = 6$ satisfy these congruences, and thus
$$x = 63 \cdot 3 + 56 \cdot 5 + 72 \cdot 6 = 901 \equiv 397 \pmod{504}$$
is the unique solution modulo 504. □
Example 3.20. Find all solutions of the system
$$x \equiv 1 \pmod{36} \quad \text{and} \quad x \equiv 3 \pmod{56}.$$
Solution. As in the previous example, the Chinese Remainder Theorem implies that the
system is equivalent to
$$x \equiv 1 \pmod{4}, \quad x \equiv 3 \pmod{8}, \quad x \equiv 1 \pmod{9}, \quad x \equiv 3 \pmod{7}.$$
But if $x \equiv 3 \pmod{8}$, then we have $x = 8k + 3 = 4(2k) + 3$ for some integer $k$, which shows
that $x \equiv 3 \pmod{4}$. Hence these two congruences are inconsistent, and we conclude that
the system has no solution. □
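Examples 3.19 and 3.20 can be cross-checked with a small routine. Note that the sketch below does not split the moduli into prime powers as the examples do; it uses a different but equivalent approach, substituting $x = a_1 + n_1 k$ into the second congruence and solving for $k$ with Theorem 3.8. The function name is hypothetical.

```python
from math import gcd


def merge_congruences(a1, n1, a2, n2):
    """Combine x = a1 (mod n1) and x = a2 (mod n2).

    Returns (x, lcm) with 0 <= x < lcm describing all solutions,
    or None when the two congruences are inconsistent.
    """
    d = gcd(n1, n2)
    if (a2 - a1) % d != 0:
        return None  # n1*k = a2 - a1 (mod n2) has no solution k
    lcm = n1 // d * n2
    # Solve (n1/d) * k = (a2 - a1)/d modulo n2/d, where n1/d is invertible.
    k = ((a2 - a1) // d * pow(n1 // d, -1, n2 // d)) % (n2 // d)
    return (a1 + n1 * k) % lcm, lcm


print(merge_congruences(1, 36, 5, 56))
print(merge_congruences(1, 36, 3, 56))
```

The second call returns `None`, matching the conclusion of Example 3.20 that the system is inconsistent.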
One way of viewing the Chinese Remainder Theorem is that it gives a bijection between
the integers $x$ with $0 \le x < N$ and the integral $k$-tuples $(a_1, \dots, a_k)$ with $0 \le a_i < n_i$. The
correspondence is given by
$$x \mapsto (x \text{ MOD } n_1, \dots, x \text{ MOD } n_k).$$
The CRT is what allows us to recover $x$ uniquely modulo $N$ from the numbers $a_i = x$ MOD
$n_i$. In fact, this yields a bijection between the $\phi(N)$ reduced residue classes modulo $N$ and
the $\phi(n_1) \cdots \phi(n_k)$ $k$-tuples of reduced residue classes modulo $n_1, \dots, n_k$. This observation
allows us to prove the following important multiplicative property of Euler's phi function.
Theorem 3.21. If $(m, n) = 1$, then $\phi(mn) = \phi(m)\phi(n)$.
Proof. By the Chinese Remainder Theorem, there is a one-to-one correspondence,
$$x \mapsto (x \text{ MOD } m, x \text{ MOD } n)$$
between the integers $x$ with $0 \le x < mn$ and the pairs $(a, b)$ with $0 \le a < m$ and $0 \le b < n$.
Now suppose that $x$ is one of the $\phi(mn)$ integers with $(x, mn) = 1$. Then one clearly
has $(x, m) = (x, n) = 1$, so Lemma 2.6 implies that $(x \text{ MOD } m, x \text{ MOD } n)$ is one of the
$\phi(m)\phi(n)$ pairs $(a, b)$ with $(a, m) = (b, n) = 1$. On the other hand, if $x \equiv a \pmod{m}$ and
$x \equiv b \pmod{n}$, where $(a, m) = (b, n) = 1$, then Lemma 2.6 shows that $(x, m) = (x, n) = 1$
and hence that $(x, mn) = 1$. It follows that the CRT bijection specializes to a bijection
among reduced residue classes. □
To help visualize the correspondence used in the proof of Theorem 3.21, we illustrate it
explicitly for the case $m = 8$, $n = 9$. In row $a$, column $b$ we write the unique integer $x$ with
$0 \le x < 72$ that satisfies $x \equiv a \pmod{8}$ and $x \equiv b \pmod{9}$. The reduced residues modulo
8, 9, and 72 are indicated by stars, and we see that $\phi(72) = 24 = 4 \cdot 6 = \phi(8)\phi(9)$.

a\b |  0    1*   2*   3    4*   5*   6    7*   8*
----+---------------------------------------------
 0  |  0   64   56   48   40   32   24   16    8
 1* |  9    1*  65*  57   49*  41*  33   25*  17*
 2  | 18   10    2   66   58   50   42   34   26
 3* | 27   19*  11*   3   67*  59*  51   43*  35*
 4  | 36   28   20   12    4   68   60   52   44
 5* | 45   37*  29*  21   13*   5*  69   61*  53*
 6  | 54   46   38   30   22   14    6   70   62
 7* | 63   55*  47*  39   31*  23*  15    7*  71*
Corollary 3.22. If $n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$, where $p_1, \dots, p_k$ are distinct primes, then
$$\phi(n) = (p_1^{\alpha_1} - p_1^{\alpha_1 - 1}) \cdots (p_k^{\alpha_k} - p_k^{\alpha_k - 1}) = n\left(1 - \frac{1}{p_1}\right) \cdots \left(1 - \frac{1}{p_k}\right).$$
Proof. Applying Theorem 3.21 repeatedly gives $\phi(n) = \phi(p_1^{\alpha_1}) \cdots \phi(p_k^{\alpha_k})$. Now if $p$ is prime,
then the only positive integers less than or equal to $p^t$ that are not relatively prime to $p^t$ are
the multiples of $p$, namely $p, 2p, 3p, \dots, p^{t-1}p$. Since there are $p^{t-1}$ such multiples, we have
$$\phi(p^t) = p^t - p^{t-1} = p^t(1 - 1/p),$$
and the result follows. □
Example 3.23. Compute $\phi(21000)$.
Solution. We have $21000 = 2^3 \cdot 3 \cdot 5^3 \cdot 7$, so Corollary 3.22 gives
$$\phi(21000) = (2^3 - 2^2)(3 - 1)(5^3 - 5^2)(7 - 1) = 4 \cdot 2 \cdot 100 \cdot 6 = 4800.$$ □
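Corollary 3.22 turns the computation of Euler's phi function into a factoring problem. The sketch below factors by trial division, which is adequate only for small inputs; the function names are hypothetical.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1, found by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains after removing all small factors is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


def phi(n):
    """Euler's phi function via the product of (p^alpha - p^(alpha-1)) over prime powers."""
    result = 1
    for p, alpha in prime_factorization(n).items():
        result *= p**alpha - p ** (alpha - 1)
    return result


print(prime_factorization(21000))
print(phi(21000))
```

For $n = 1$ the factorization is empty and the empty product gives $\phi(1) = 1$, in agreement with the definition.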
Example 3.24. Find the last two digits of $3^{2008}$.
Solution. The last two digits are determined by the residue class modulo 100. Since
$100 = 2^2 \cdot 5^2$, we have $\phi(100) = (4 - 2)(25 - 5) = 40$ by Corollary 3.22. Moreover, one has
$2008 = 40 \cdot 50 + 8$, so Euler's Theorem gives
$$3^{2008} = (3^{40})^{50} \cdot 3^8 \equiv 3^8 \equiv 61 \pmod{100}.$$
Therefore the last two digits are 61. □
4. Public-key cryptography
We can use Euler's Theorem to devise a scheme for public-key encryption. In such a
system, each individual creates and publishes some unique data (known as a public key)
that allows them to receive encrypted messages from other users. The system we'll describe
was developed at MIT in 1977 by Rivest, Shamir, and Adleman and is commonly known as
RSA. To construct the code, we choose two large primes $p$ and $q$, say around 200 digits each.
We then compute $n = pq$ and use Corollary 3.22 to calculate $\phi(n) = (p - 1)(q - 1)$. Next
we choose an integer $e > 1$ that is relatively prime to $\phi(n)$ and use the Euclidean algorithm
to find
$$d = e^{-1} \text{ MOD } \phi(n).$$
We make the pair $(n, e)$ publicly available but keep $d$ secret. Obviously, we keep $p$ and $q$
secret as well, since knowing them would enable one to find $\phi(n)$, and hence $d$. The security
of the system rests on the fact that no known method can factor such an $n$ in a reasonable
amount of time with current technology.
To encrypt a message to a user whose public key is $(n, e)$, we first create a digital version of
the message, say $M$, using some character-to-integer scheme such as ASCII. For simplicity,
we use the conversions
A = 01, B = 02, C = 03, D = 04, . . . , Y = 25, Z = 26,
so that each letter of the alphabet corresponds to a 2-digit integer, and we use 27 to represent
a space. If desired, we could introduce additional integers to stand for punctuation marks
and other symbols. If $M \ge n$, we break the message into blocks so that each block is smaller
than $n$. We then encrypt the message by computing
$$E = M^e \text{ MOD } n.$$
The recipient then decrypts the message by computing $E^d$ MOD $n$, using Euler's Theorem
and the fact that $de = 1 + k\phi(n)$ for some integer $k$. One has
$$E^d \equiv (M^e)^d \equiv M^{de} \equiv M^{1 + k\phi(n)} \equiv (M^{\phi(n)})^k \cdot M \equiv M \pmod{n},$$
and thus $E^d$ MOD $n = M$. In applying Euler's Theorem, we implicitly assumed that
$(M, n) = 1$, but the probability that this fails is negligible when $n$ is composed of two
200-digit primes. The point of RSA is that, given $n$ and $e$, no feasible way is known to compute
the decryption key $d$ without knowing $\phi(n)$, which is equivalent to knowing the factorization $n = pq$.
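The whole cycle of key generation, encryption, and decryption fits in a few lines when the primes are tiny. The sketch below is a toy illustration only, using the small key that appears in Example 4.1; the helper names are hypothetical, and real RSA implementations add padding and use far larger primes.

```python
def text_to_number(text):
    """Encode capital letters as A = 01, ..., Z = 26 and a space as 27."""
    codes = {chr(ord("A") + i): i + 1 for i in range(26)}
    codes[" "] = 27
    return int("".join(f"{codes[c]:02d}" for c in text))


def number_to_text(number):
    """Invert text_to_number, restoring a leading zero if one was dropped."""
    s = str(number)
    if len(s) % 2:
        s = "0" + s
    letters = {i + 1: chr(ord("A") + i) for i in range(26)}
    letters[27] = " "
    return "".join(letters[int(s[i:i + 2])] for i in range(0, len(s), 2))


# Toy parameters: far too small to be secure.
p, q, e = 59, 83, 19
n = p * q
phi_n = (p - 1) * (q - 1)
d = pow(e, -1, phi_n)      # decryption key, requires Python 3.8+

M = text_to_number("YO")   # must be smaller than n
E = pow(M, e, n)           # encryption with the public key (n, e)
recovered = pow(E, d, n)   # decryption with the private key d
print(n, d, M, E, number_to_text(recovered))
```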
Example 4.1. Decode the encrypted message 0828, which was generated using RSA with
public key (4897, 19).
Solution. Here the integer $n = 4897$ is far too small to create a secure cryptosystem, and
after some trial division we easily obtain the factorization $n = pq$, where $p = 59$ and $q = 83$.
It now follows from Corollary 3.22 that $\phi(n) = 58 \cdot 82 = 4756$. Next we calculate the
decryption key $d = 19^{-1}$ MOD 4756 using the Euclidean Algorithm:
$$\begin{bmatrix} 1 & 0 & 4756 \\ 0 & 1 & 19 \end{bmatrix}
\to \begin{bmatrix} 1 & -250 & 6 \\ 0 & 1 & 19 \end{bmatrix}
\to \begin{bmatrix} 1 & -250 & 6 \\ -3 & 751 & 1 \end{bmatrix}.$$
Hence we have $d = 751$, so we can decrypt the message by computing $M = 828^{751}$ MOD
4897. This computation is doable on a calculator by successive squaring. We write the
exponent 751 in binary as
$$1011101111_2 = 512 + 128 + 64 + 32 + 8 + 4 + 2 + 1$$
and square $E = 828$ repeatedly modulo 4897. This gives $E^2 = 4$, $E^4 = 16$, $E^8 = 256$,
$E^{16} = 1875$, $E^{32} = -421$, $E^{64} = 949$, $E^{128} = -447$, $E^{256} = -968$, and $E^{512} = 1697$, and it
follows that
$$M \equiv E^{512} E^{128} E^{64} E^{32} E^8 E^4 E^2 E \equiv 2515 \pmod{4897},$$
so the message was "YO". □
The method of successive squaring used in the above example gives a good way of performing
fast modular exponentiation, which has been implemented in many software packages.
For instance, Mathematica has the function PowerMod[a,b,n], which quickly computes $a^b$
MOD $n$. A useful tool for finding modular inverses is the function ExtendedGCD[a,b], which
returns $\gcd(a, b)$, together with integers $s$ and $t$ satisfying $\gcd(a, b) = as + bt$. Mathematica
has built-in arbitrary precision, so it is great for handling long integers without the fear of
truncation. Programs that only store, say, the first 16 digits of a number give more than
sufficient accuracy for many applications, but losing even a single digit of an integer is obviously
devastating for number theory.
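Successive squaring is short enough to write out by hand. The following sketch, with the hypothetical name `power_mod`, processes the binary digits of the exponent from least to most significant; Python's built-in `pow(base, exponent, modulus)` does the same job and is used in the usage line only as a cross-check.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent MOD modulus by successive squaring (exponent >= 0)."""
    result = 1
    square = base % modulus   # holds base^(2^i) modulo modulus at step i
    while exponent > 0:
        if exponent & 1:
            # The current binary digit is 1, so this power of the base contributes.
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result % modulus


print(bin(751))
print(power_mod(828, 751, 4897), pow(828, 751, 4897))
```

Python integers have arbitrary precision, so no digits are lost in the intermediate products.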
Digital Signatures. We can also apply the RSA encryption principle to authenticate
digital signatures. If your public key is $(n, e)$ and you send me your signature $S$ in the form
$$D = S^d \text{ MOD } n,$$
where $d$ is your personal decryption key, then I can recover $S$ by computing
$$D^e \text{ MOD } n = S^{de} \text{ MOD } n = S^{1 + k\phi(n)} \text{ MOD } n = S.$$
Moreover, I know that the signature is authentic since you're the only one who knows $d$.
If the signature had been encrypted using an incorrect $d$ then I would most likely obtain
gibberish when attempting to recover $S$.
Example 4.2. You receive the digital signature 20496 from a user with initials S. P. and
public key (21311, 41). Does it appear to be authentic?
Solution. We compute $20496^{41}$ MOD 21311 using Mathematica and get
PowerMod[20496, 41, 21311] = 1916.
Since S=19 and P=16, the message must have come from S. P., or at least someone with
access to his decryption key. Of course, the numbers are once again so small that anyone
could have figured out the decryption key and sent a phony signature. □
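Signing and verifying are the same two exponentiations as decryption and encryption, applied in the opposite order. The sketch below reuses the toy key from Example 4.1, because its decryption key is worked out in the text; the function names are hypothetical.

```python
# Toy key from Example 4.1: n = 59 * 83, public exponent e, private exponent d.
n, e, d = 4897, 19, 751


def sign(signature_number):
    """Signer's side: raise the signature to the private exponent d."""
    return pow(signature_number, d, n)


def verify(signed_value):
    """Recipient's side: raise the received value to the public exponent e."""
    return pow(signed_value, e, n)


S = 1916                      # "SP" under the A = 01, ..., Z = 26 encoding
D = sign(S)
print(verify(D) == S)         # a genuine signature is recovered exactly

forged = pow(S, 750, n)       # someone guessing the wrong private exponent
print(verify(forged) == S)
```

A value produced with the wrong exponent fails to reproduce the expected initials, which is how the recipient detects a forgery.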
The digital signature process described above presumes that the signer is not concerned
about the possibility of his or her signature being viewed by a third party. The goal is
simply to provide a method for the recipient to verify the signer's identity. One can transmit
sensitive information and verify the identity of the sender by nesting an encryption within
the digital signature process. Suppose that Alice, whose public key is $(n_A, e_A)$, wants to
send a message $M$ to Bob, whose public key is $(n_B, e_B)$, and Bob wants to be sure that the
message is really coming from Alice. Alice first "signs" the message using her own decryption
key, $d_A$, and then encrypts it using Bob's public key. Thus she computes $S = M^{d_A}$ MOD $n_A$
and sends Bob $E = S^{e_B}$ MOD $n_B$. (This requires $S < n_B$; otherwise $S$ must first be broken
into blocks.) When Bob receives $E$, he uses his decryption key $d_B$ to
compute $S = E^{d_B}$ MOD $n_B$ and then recovers the message as $M = S^{e_A}$ MOD $n_A$. At this
point, he knows the contents of the message and can be sure that it was sent by Alice and
not someone pretending to be Alice.
Primality testing. One issue in implementing RSA is that we need to find large integers
that are known to be prime. Fortunately, there are alternatives to trial division for investigating
primality. As mentioned in §3, Fermat's Little Theorem can be used to show that
an integer is not prime. For example, if $n$ is an odd prime, then the theorem tells us that
$2^{n-1} \equiv 1 \pmod{n}$. The converse of this statement is false; that is, there exist odd composite
integers $n$ with the property that $2^{n-1} \equiv 1 \pmod{n}$. However, it turns out that such integers
are fairly rare, so there is a good chance that an integer $n$ satisfying this congruence
will in fact be prime. An odd composite integer $n$ satisfying this congruence is called a
pseudoprime. The only pseudoprimes less than 1000 are
$$341 = 11 \cdot 31, \quad 561 = 3 \cdot 11 \cdot 17, \quad \text{and} \quad 645 = 3 \cdot 5 \cdot 43.$$
More generally, if $n$ is an odd composite integer with $(a, n) = 1$ and $a^{n-1} \equiv 1 \pmod{n}$, then
we say that $n$ is a pseudoprime for the base $a$.
If we want to know whether $n$ is prime, we could first test for divisibility by 2, 3, 5, and 7
and then compute $2^{n-1}$ MOD $n$, $3^{n-1}$ MOD $n$, $5^{n-1}$ MOD $n$, and $7^{n-1}$ MOD $n$, for instance.
If any of these is not equal to 1, then Fermat's Theorem implies that $n$ is not prime. If they
are all equal to 1, then $n$ is very likely (but not certain) to be prime. Interestingly, and
perhaps unfortunately, there are odd composite integers $n$ that are pseudoprimes for every
base $a$ with $(a, n) = 1$. Such numbers are called Carmichael numbers, and the smallest one
is 561. Carmichael numbers are very sparse (561 is the only one less than 1000), but it was
proved in 1994 by Alford, Granville, and Pomerance that there are infinitely many! In fact,
they showed that there are at least $x^{2/7}$ Carmichael numbers not exceeding $x$, for all
sufficiently large $x$.
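The small cases quoted above can be confirmed by exhaustive search. This is a brute-force sketch with hypothetical function names; it tests every base below $n$, so it is only practical for small ranges.

```python
from math import gcd, isqrt


def is_prime(n):
    """Primality by trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_pseudoprime(n, a):
    """True if n is an odd composite with gcd(a, n) = 1 and a^(n-1) = 1 modulo n."""
    if n < 3 or n % 2 == 0 or is_prime(n):
        return False
    return gcd(a, n) == 1 and pow(a, n - 1, n) == 1


def is_carmichael(n):
    """True if n is an odd composite that is a pseudoprime for every base coprime to n."""
    if n < 3 or n % 2 == 0 or is_prime(n):
        return False
    return all(pow(a, n - 1, n) == 1 for a in range(2, n) if gcd(a, n) == 1)


print([n for n in range(3, 1000) if is_pseudoprime(n, 2)])
print([n for n in range(3, 1000) if is_carmichael(n)])
```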
The concept of pseudoprimes can be strengthened by making the following simple observation.
Suppose that $a^{n-1} \equiv 1 \pmod{n}$. If $n$ is odd, we can write $n = 2m + 1$ for some
integer $m$, and we see that $n$ divides $a^{2m} - 1 = (a^m - 1)(a^m + 1)$. If $n$ is prime, it now follows
from Euclid's Lemma that $n$ divides $a^m - 1$ or $a^m + 1$. Thus if $a^m \not\equiv \pm 1 \pmod{n}$ then we
can conclude that $n$ is not prime. On the other hand, if $a^m \equiv 1 \pmod{n}$ and $m$ is even, then
we can apply the same reasoning within the factorization $a^m - 1 = (a^{m/2} - 1)(a^{m/2} + 1)$.
Example 4.3. Show how to deduce that 341 is not prime without using its prime factorization.
Solution. One easily computes that $2^{340} \equiv 1 \pmod{341}$, so 341 divides
$$2^{340} - 1 = (2^{170} - 1)(2^{170} + 1) = (2^{85} - 1)(2^{85} + 1)(2^{170} + 1).$$
We further compute that $2^{170} \equiv 1 \pmod{341}$ and $2^{85} \equiv 32 \pmod{341}$. But if 341 were prime
then it would have to divide $2^{85} - 1$, $2^{85} + 1$, or $2^{170} + 1$ by Euclid's Lemma, which would
mean that $2^{85} \equiv \pm 1 \pmod{341}$ or $2^{170} \equiv -1 \pmod{341}$. Since none of these conclusions
holds, we may conclude that 341 is not prime. □
In general, if $n$ is an odd integer exceeding 1, we can write $n = 2^s t + 1$, where $t$ is odd and
$s \ge 1$. Then one has the factorization
$$a^{n-1} - 1 = (a^t - 1)(a^t + 1)(a^{2t} + 1)(a^{4t} + 1) \cdots (a^{2^{s-1}t} + 1). \tag{4.1}$$
In Example 4.3 we had $s = 2$ and $t = 85$. An odd composite integer $n = 2^s t + 1$ is called
a strong pseudoprime for the base $a$ if $(a, n) = 1$ and $n$ divides one of the factors on the
right-hand side of (4.1). Any integer (prime or composite) with this property is said to have
passed the strong pseudoprime test. Strong pseudoprimes are considerably more scarce than
ordinary pseudoprimes. For the base $a = 2$, for example, there are 5597 pseudoprimes up to
$10^9$ but only 1282 strong pseudoprimes, the smallest of which is 2047.
Example 4.4. Show that 2047 is a strong pseudoprime for the base 2.
Solution. We observe that 2046 is not divisible by 4, so we have $s = 1$ and $t = 1023$ in the
above notation. Moreover, Mathematica shows that $2^{1023} \equiv 1 \pmod{2047}$, so 2047 divides
$2^{1023} - 1$ and hence passes the strong pseudoprime test. Finally, we note that $2047 = 23 \cdot 89$
is composite and is therefore in fact a strong pseudoprime. □
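As an illustration of the strong pseudoprime test, here is a minimal Python sketch. It assumes $n$ is odd, $n > 2$, and $(a, n) = 1$; the function name is chosen for this example only. Rather than forming the factors of (4.1) explicitly, it checks whether $a^t \equiv \pm 1$ or $a^{2^j t} \equiv -1 \pmod{n}$ for some $1 \le j \le s - 1$, which is the same condition.

```python
def passes_strong_test(n: int, a: int) -> bool:
    """True if n divides one of the factors on the right-hand side of (4.1)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 2")
    # Write n - 1 = 2^s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(a, t, n)
    if x == 1 or x == n - 1:          # n divides a^t - 1 or a^t + 1
        return True
    for _ in range(s - 1):            # remaining factors a^(2^j t) + 1, 1 <= j <= s - 1
        x = x * x % n
        if x == n - 1:
            return True
    return False
```

With this sketch, 341 fails for the base 2 (as in Example 4.3), while 2047 passes for the base 2 (as in Example 4.4) but fails for the base 3.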
The results are more striking if we apply the strong pseudoprime test for several different
bases. There is only one integer less than $2.5 \times 10^{10}$, namely
\[ 3{,}215{,}031{,}751 = 151 \times 751 \times 28351, \]
that is a strong pseudoprime for bases 2, 3, 5, and 7. Moreover, there is no "strong" analogue
of the Carmichael numbers. That is, every odd composite number $n$ fails the strong pseudoprime
test for some base $a$ with $(a, n) = 1$. Such an $a$ is called a witness to the compositeness of $n$.
In fact, it can be shown that at least half of the bases $a \le n$ with $(a, n) = 1$ are witnesses
when $n$ is composite, so this procedure can be used to identify primes with near certainty.
In 2002, Agrawal, Kayal, and Saxena announced an algorithm (published in 2004) that proves the primality or
compositeness of $n$ with a running time that is polynomial in $\log n$. For comparison, trial
division requires $O(\sqrt{n})$ steps to prove that an integer is prime. Many software packages
have built-in functions that implement various primality tests. In Mathematica, for example,
PrimeQ[n] returns True or False according to whether or not $n$ is prime, while Prime[k]
returns the $k$th prime number.
Factorization algorithms. Attacks on RSA could be made if efficient factoring algorithms were known. As with primality testing, there are algorithms that are far more
efficient than trial division, but no current algorithm comes close to breaking RSA with
200-digit primes, except in very special cases that can easily be avoided. We briefly explore
some of the ideas involved in these factorization techniques.
Fermat's factoring method was to try to express $n$ as the difference of two squares. If we
can find positive integers $x$ and $y$ such that
\[ n = x^2 - y^2 = (x - y)(x + y), \]
then we've found a factorization of $n$, provided that $x - y \ne 1$. Kraitchik realized that one
could apply the spirit of Fermat's idea more efficiently by instead looking for integers $x$ and
$y$ satisfying the weaker condition
\[ x^2 \equiv y^2 \pmod{n}, \]
so that $n$ divides $(x - y)(x + y)$. This no longer ensures a factorization of $n$, but there is a
reasonable chance that both $x - y$ and $x + y$ contain some of the prime factors of $n$. For
example, if $n$ is the product of two distinct primes $p$ and $q$, one would expect roughly a 50%
chance that $p$ and $q$ split among the two factors $x - y$ and $x + y$. In this case $\gcd(x - y, n)$ will
be a non-trivial factor of $n$, and this can be computed efficiently via the Euclidean algorithm.
If both $p$ and $q$ divide the same factor, then one can simply try different values for $x$ and
$y$. Powerful recent factoring methods like the quadratic sieve are based on finding suitable
integers $x$ and $y$ to carry out this principle.
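Fermat's original idea can be sketched as follows. This is a minimal illustration that assumes $n$ is odd and greater than 1; it is not the quadratic sieve, and the function name is chosen for this example. It searches upward from $\lceil \sqrt{n} \rceil$ for an $x$ such that $x^2 - n$ is a perfect square $y^2$.

```python
from math import isqrt
from typing import Tuple


def fermat_factor(n: int) -> Tuple[int, int]:
    """Return (x - y, x + y) with n = x^2 - y^2, for odd n > 1.

    If n is prime, the only representation found is the trivial one (1, n).
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer greater than 1")
    x = isqrt(n)
    if x * x < n:
        x += 1                      # start at the ceiling of sqrt(n)
    while True:
        y_squared = x * x - n
        y = isqrt(y_squared)
        if y * y == y_squared:      # x^2 - n is a perfect square
            return x - y, x + y
        x += 1
```

The search is fast when $n$ has two factors close to $\sqrt{n}$ and slow otherwise, which is one reason the congruence version due to Kraitchik is preferred in practice.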
Pollard's so-called "rho" method is based on generating a quasi-random sequence of numbers that are distinct modulo the integer $n$ to be factored but not distinct modulo its smallest
prime divisor $p$. Suppose we generate "random" integers $x_1, \ldots, x_k$, where $k$ is large by comparison
with $\sqrt{p}$ but small by comparison with $\sqrt{n}$. For example, we could take $k \approx 10 n^{1/4}$.
Then the probability that the $x_i$ are distinct modulo $p$ is very small, so $\gcd(x_i - x_j, n)$
will most likely produce the factor $p$ for some $i$ and $j$. This leads to a factorization of $n$
with expected running time $O(n^{1/4})$. When the method works, the numbers $x_i$ MOD $p$ are
eventually periodic and can thus be written in a shape resembling the Greek letter $\rho$.
Example 4.5. Use Pollard's rho method to factor the integer $n = 36287$.
Solution. First note that $2^{36286} \equiv 35799 \not\equiv 1 \pmod{36287}$, so Fermat's Little Theorem
implies that $n$ is composite. We construct our quasi-random sequence of integers recursively
by taking $x_0 = 1$ and $x_{i+1} = (x_i^2 + 1)$ MOD $n$. The first few terms of the sequence are
1, 2, 5, 26, 677, 22886, 2439, 33941, 24380, 3341, 22173, 25654, 26685.
In particular, one has $x_5 = 22886$ and $x_{12} = 26685$, which gives $\gcd(x_{12} - x_5, n) = 131$. Thus
we obtain the factorization $36287 = 131 \cdot 277$. □
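The computation in Example 4.5 can be reproduced with the following Python sketch. It compares every pair of terms directly, which mirrors the description above but is not how the method is usually implemented (practical versions use a cycle-detection trick to avoid storing the sequence). The function names and the `max_terms` cutoff are assumptions made for this illustration.

```python
from math import gcd


def rho_sequence(n: int, length: int, x0: int = 1) -> list:
    """First `length` terms of x_0 = x0, x_{i+1} = (x_i^2 + 1) MOD n."""
    xs = [x0]
    while len(xs) < length:
        xs.append((xs[-1] ** 2 + 1) % n)
    return xs


def pollard_rho_pairs(n: int, max_terms: int = 1000):
    """Return (d, i, j) for the first pair i < j with 1 < gcd(x_j - x_i, n) < n, else None."""
    xs = [1]
    for j in range(1, max_terms):
        xs.append((xs[-1] ** 2 + 1) % n)
        for i in range(j):
            d = gcd(xs[j] - xs[i], n)
            if 1 < d < n:
                return d, i, j
    return None
```

For $n = 36287$ the first successful pair is $(i, j) = (5, 12)$, giving the factor 131 as in the example.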
Suppose that $n$ is a large composite integer with no small prime factors but that all the
prime factors of $p - 1$ are small for some prime $p \mid n$. For example, suppose that $p - 1$ divides
$10000!$. Then by Fermat's Little Theorem one has $2^{10000!} \equiv 1 \pmod{p}$, and thus $p$ divides
$\gcd(2^{10000!} - 1, n)$. Thus we can attempt to find $p$ by computing $\gcd(2^{k!} - 1, n)$ for various
values of $k$. This is known as Pollard's $p - 1$ method, and it can be applied with bases other
than 2 as well.
Example 4.6. Use Pollard's $p - 1$ method to factor the integer $n = 69841$.
Solution. We have $2^{n-1} \equiv 37073 \not\equiv 1 \pmod{n}$, so $n$ is composite. With $k = 5$ in the above
notation, we obtain $\gcd(2^{120} - 1, 69841) = 331$, which gives us a nontrivial divisor of $n$. It
is now easily checked that $n = 211 \cdot 331$ is the desired prime factorization. Note that the
method was effective here because $p - 1 = 330 = 2 \cdot 3 \cdot 5 \cdot 11$ divides $11!$. One would
typically expect to test up to $k = 11$ before finding $p$, but we happened to find it sooner. □
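A minimal Python sketch of the $p - 1$ method follows. It builds $2^{k!}$ MOD $n$ incrementally by raising the previous value to the $k$th power; the function name and the `max_k` cutoff are assumptions made for this illustration.

```python
from math import gcd


def pollard_p_minus_1(n: int, base: int = 2, max_k: int = 50):
    """Return (d, k) for the first k with 1 < gcd(base^(k!) - 1, n) < n, else None."""
    a = base % n
    for k in range(2, max_k + 1):
        a = pow(a, k, n)            # a is now base^(k!) MOD n
        d = gcd(a - 1, n)
        if 1 < d < n:
            return d, k
    return None
```

Applied to $n = 69841$ with base 2, the sketch stops at $k = 5$ with the divisor 331, in agreement with Example 4.6.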
One important consequence of the Pollard $p - 1$ method is that RSA can be broken if the
primes $p$ and $q$ are chosen in such a way that $p - 1$ or $q - 1$ has only small prime factors.
Therefore, one must be careful to avoid this situation when constructing a public key. We
also mention that neither of Pollard's algorithms will prove primality if they are applied to
prime integers. Hence they should only be used on integers that are known to be composite,
for example by failing a pseudoprime test. The Mathematica function FactorInteger[n]
implements a variety of advanced algorithms to attempt to determine the prime factorization
of $n$, but it typically becomes extremely slow when the smallest prime factor of $n$ is large.
5. Primitive roots
When $(a, n) = 1$, we know from Euler's Theorem that $a^{\varphi(n)} \equiv 1 \pmod{n}$. However, there
may be smaller powers of $a$ that are congruent to 1 modulo $n$. We define the order of $a$
modulo $n$ (or the order of $a$ in $\mathbb{Z}_n^*$) to be the smallest positive integer $d$ such that
\[ a^d \equiv 1 \pmod{n}. \]
For example, the elements 3, 5, and 7 all have order 2 in $\mathbb{Z}_8^*$. The elements 2 and 3 have
orders 3 and 6, respectively, in $\mathbb{Z}_7^*$. In view of Euler's Theorem, the order of an element in
$\mathbb{Z}_n^*$ is at most $\varphi(n)$.
Note that if $a^k \equiv 1 \pmod{n}$ for some positive integer $k$, then we can write $a^{k-1} a + qn = 1$
for some $q \in \mathbb{Z}$, so Corollary 2.5 implies that $(a, n) = 1$. Thus if $(a, n) > 1$ then there is no
positive power of $a$ that is 1 modulo $n$. Hence order is not defined for the elements of $\mathbb{Z}_n$
that are not relatively prime to $n$.
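The definition of order translates directly into a short computation. The following is a minimal sketch (the function name is chosen for this example) that multiplies by $a$ until the product is 1 and rejects inputs for which the order is not defined.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest positive d with a^d = 1 (mod n); requires n >= 2 and gcd(a, n) = 1."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("order is defined only when n >= 2 and gcd(a, n) = 1")
    d, x = 1, a % n
    while x != 1:
        x = x * a % n
        d += 1
    return d
```

For example, `multiplicative_order(3, 7)` returns 6 and `multiplicative_order(3, 8)` returns 2, matching the values quoted above.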
Theorem 5.1. If $a$ has order $d$ in $\mathbb{Z}_n^*$ and $k$ is a positive integer with $a^k \equiv 1 \pmod{n}$,
then $d$ divides $k$.
Proof. We use division with remainder (Theorem 2.3) to write $k = qd + r$, where $q$ and $r$
are integers with $0 \le r < d$. Then we have
\[ 1 \equiv a^k \equiv a^{qd+r} \equiv (a^d)^q \cdot a^r \equiv a^r \pmod{n}, \]
so the minimality of $d$ implies that $r = 0$, and hence $k = qd$, as required. □
Corollary 5.2. The order of every element of $\mathbb{Z}_n^*$ divides $\varphi(n)$.
Proof. In view of Euler's Theorem, this follows from Theorem 5.1 with $k = \varphi(n)$. □
For example, it is easy to check directly that each element of $\mathbb{Z}_7^*$ has order 1, 2, 3, or 6
and that each element of $\mathbb{Z}_8^*$ has order 1 or 2. If the order of $a$ modulo $n$ happens to be $\varphi(n)$
then we say that $a$ is a primitive root modulo $n$.
Example 5.3. Determine the primitive roots modulo 7 and modulo 8.
Solution. The elements 3 and 5 are primitive roots modulo 7 because they both have order
$6 = \varphi(7)$. It is easily checked that all other elements of $\mathbb{Z}_7^*$ have order less than 6, so there
are no other primitive roots. Finally, there are no primitive roots modulo 8 because there
are no elements of order $4 = \varphi(8)$. □
A primitive root is sometimes called a generator because computing successive powers
of it generates the whole of $\mathbb{Z}_n^*$. For example, 3 is a generator for $\mathbb{Z}_7^*$ because
\[ 3^1 = 3, \quad 3^2 = 2, \quad 3^3 = 6, \quad 3^4 = 4, \quad 3^5 = 5, \quad \text{and} \quad 3^6 = 1. \]
For this reason, we sometimes use the letter $g$ to denote a primitive root. In algebraic
terms, the existence of a primitive root modulo $n$ means that $\mathbb{Z}_n^*$ is a cyclic group under
multiplication. Example 5.3 shows that $\mathbb{Z}_7^*$ is cyclic but that $\mathbb{Z}_8^*$ is not. The following
theorem shows that primitive roots are generators.
Theorem 5.4. If $g$ is a primitive root modulo $n$ and $(a, n) = 1$, then we have $a \equiv g^k \pmod{n}$
for some integer $k$ with $1 \le k \le \varphi(n)$.
Proof. Consider the $\varphi(n)$ integers $g, g^2, g^3, \ldots, g^{\varphi(n)}$. If $g^i \equiv g^j \pmod{n}$ for some $i$ and $j$
with $1 \le i < j \le \varphi(n)$, then we would have $g^{j-i} \equiv 1 \pmod{n}$, which is impossible since
$g$ has order $\varphi(n)$ and $0 < j - i < \varphi(n)$. Therefore, the integers $g, g^2, g^3, \ldots, g^{\varphi(n)}$ all lie in
distinct residue classes modulo $n$. Since each $g^i$ is also relatively prime to $n$, we deduce that
the set $\{g, g^2, g^3, \ldots, g^{\varphi(n)}\}$ forms a reduced residue system modulo $n$. Hence there is some
exponent $k$ for which $a \equiv g^k \pmod{n}$. □
Theorem 5.5. If $a$ has order $d$ modulo $n$, then $a^k$ has order $d/(d, k)$ modulo $n$.
Proof. Let $e$ denote the order of $a^k$ modulo $n$. First of all, we have
\[ (a^k)^{d/(d,k)} \equiv (a^d)^{k/(d,k)} \equiv 1 \pmod{n}, \]
so Theorem 5.1 implies that $e$ divides $d/(d, k)$. Moreover, we have
\[ a^{ke} \equiv (a^k)^e \equiv 1 \pmod{n}, \]
so Theorem 5.1 further implies that $d$ divides $ke$, and hence that $d/(d, k)$ divides $ke/(d, k)$.
Since $d/(d, k)$ and $k/(d, k)$ are relatively prime, it follows from a homework exercise that
$d/(d, k)$ divides $e$. Since $e$ divides $d/(d, k)$ and $d/(d, k)$ divides $e$, and both quantities are
positive, we may conclude that $e = d/(d, k)$, as desired. □
Corollary 5.6. If $\mathbb{Z}_n^*$ contains a primitive root, then the total number of primitive roots in
$\mathbb{Z}_n^*$ is $\varphi(\varphi(n))$. In other words, if $\mathbb{Z}_n^*$ is cyclic, then it has $\varphi(\varphi(n))$ generators.
Proof. If $g$ is a primitive root modulo $n$, then Theorem 5.5 shows that $g^k$ is a primitive root
if and only if $(\varphi(n), k) = 1$. Hence there are $\varphi(\varphi(n))$ choices for $k$. By Theorem 5.4, all
elements of $\mathbb{Z}_n^*$ can be expressed as $g^k$ for some $k$, so this completes the proof. □
The following theorem, due to Gauss, completely characterizes the integers $n$ for which
$\mathbb{Z}_n^*$ has a primitive root.
Theorem 5.7. There exists a primitive root modulo $n$ if and only if $n = 1, 2, 4, p^k$, or $2p^k$,
where $p$ is an odd prime and $k$ is a positive integer.
Example 5.8. What can you say about the existence of primitive roots modulo $n$ when
$9 \le n \le 20$? How many primitive roots are there modulo 18 and 19?
Solution. In view of Theorem 5.7, there are primitive roots modulo 9, 10, 11, 13, 14, 17, 18,
and 19, while there are no primitive roots modulo 12, 15, 16, or 20. By Corollary 5.6, the
number of primitive roots modulo 18 is $\varphi(\varphi(18)) = \varphi(6) = 2$, and the number of primitive
roots modulo 19 is $\varphi(\varphi(19)) = \varphi(18) = 6$. □
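The conclusions of Example 5.8 can be checked by brute force. The sketch below is an illustration only: the function names are assumptions, and the exhaustive search is practical only for small moduli $n \ge 2$.

```python
from math import gcd


def euler_phi(n: int) -> int:
    """Number of integers k with 1 <= k <= n and gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def multiplicative_order(a: int, n: int) -> int:
    """Smallest positive d with a^d = 1 (mod n); assumes n >= 2 and gcd(a, n) = 1."""
    d, x = 1, a % n
    while x != 1:
        x = x * a % n
        d += 1
    return d


def primitive_roots(n: int) -> list:
    """All a in Z_n^* whose order equals phi(n), for n >= 2."""
    phi = euler_phi(n)
    return [a for a in range(1, n)
            if gcd(a, n) == 1 and multiplicative_order(a, n) == phi]
```

For example, `primitive_roots(7)` returns `[3, 5]` and `primitive_roots(8)` returns an empty list, as in Example 5.3.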
The full proof of Theorem 5.7 is somewhat time-consuming, although it is accessible with
elementary techniques. Rather than giving the complete argument, which involves a number
of separate cases, we will be content to prove the existence of primitive roots for prime
moduli. Before doing this, we need some auxiliary results. The following theorem, due to
Lagrange, concerns solutions of polynomial congruences modulo a prime.
Theorem 5.9. Let $f(x)$ be a polynomial of degree $n$ with integer coefficients, and let $p$ be a
prime not dividing the leading coefficient of $f(x)$. Then the congruence $f(x) \equiv 0 \pmod{p}$
has at most $n$ distinct solutions modulo $p$.
Proof. We proceed by induction on $n$. When $n = 0$, the polynomial $f(x)$ is a constant not
divisible by $p$, so the congruence has no solutions. Now suppose that $n > 0$ and that the
result holds for all polynomials of degree less than $n$. Let $f(x)$ be a polynomial of degree
$n$, with $p$ not dividing the leading coefficient, and suppose that $f(a) \equiv 0 \pmod{p}$. Using
division with remainder for polynomials, we can write
\[ f(x) = q(x)(x - a) + r, \]
where $q(x)$ is a polynomial of degree $n - 1$, and where $r$ is an integer. (Since $x - a$ has degree
one, the remainder has degree zero.) Moreover, $p$ does not divide the leading coefficient of
$q(x)$, since this is the same as the leading coefficient of $f(x)$. We have
\[ r = f(a) \equiv 0 \pmod{p}, \]
which means that $p \mid r$, and thus for any integer $x$ we have
\[ f(x) \equiv q(x)(x - a) \pmod{p}. \]
Now if $f(b) \equiv 0 \pmod{p}$, then $p$ divides $q(b)(b - a)$, so Euclid's Lemma implies that $p$
divides $q(b)$ or $p$ divides $b - a$. In the first case, we have $q(b) \equiv 0 \pmod{p}$, so the induction
hypothesis ensures that there are at most $n - 1$ choices for $b$. In the second case, we have
$b \equiv a \pmod{p}$, which gives one additional possibility. Thus $f(x) \equiv 0 \pmod{p}$ has at most
$n$ solutions in total. □
Note that the theorem fails for composite moduli. For example, the congruence $x^2 - 1 \equiv 0 \pmod{8}$
has four solutions, $x = 1, 3, 5, 7$, but the polynomial $f(x) = x^2 - 1$ has degree two.
The next lemma establishes an interesting relationship between an integer and the Euler
phi function of its divisors. To illustrate, notice that the positive divisors of 12 are 1, 2, 3,
4, 6, and 12 and that
\[ \varphi(1) + \varphi(2) + \varphi(3) + \varphi(4) + \varphi(6) + \varphi(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12. \]
The positive divisors of 17 are 1 and 17, and we have $\varphi(1) + \varphi(17) = 1 + 16 = 17$. The
positive divisors of 20 are 1, 2, 4, 5, 10, and 20, and we have
\[ \varphi(1) + \varphi(2) + \varphi(4) + \varphi(5) + \varphi(10) + \varphi(20) = 1 + 1 + 2 + 4 + 4 + 8 = 20. \]
We now show that this phenomenon occurs in general.
Lemma 5.10. Let $n$ be a positive integer, and let $d_1, d_2, \ldots, d_t$ denote the positive divisors
of $n$. Then
\[ \varphi(d_1) + \varphi(d_2) + \cdots + \varphi(d_t) = \sum_{d \mid n} \varphi(d) = n. \]
Proof. Let $S(d)$ denote the number of integers $m$ with $1 \le m \le n$ and $(m, n) = d$. Since
$S(d) = 0$ unless $d$ is a divisor of $n$, we can write
\[ S(d_1) + S(d_2) + \cdots + S(d_t) = \sum_{d \mid n} S(d) = n. \]
Consider an integer $m$ counted by $S(d)$. Then $(m, n) = d$, so in particular we have $d \mid m$ and
$d \mid n$, and furthermore $(m/d, n/d) = 1$. Thus we can write $m = dk$ for a unique integer $k$ with
$1 \le k \le n/d$ and $(k, n/d) = 1$. Hence the number of choices for $k$ is $\varphi(n/d)$. Since this
also gives the number of possibilities for $m$, we deduce that $S(d) = \varphi(n/d)$. Notice that the
numbers $n/d_1, n/d_2, \ldots, n/d_t$ are just the divisors $d_1, d_2, \ldots, d_t$, listed in a different order.
Thus we have
\[ \sum_{d \mid n} \varphi(d) = \sum_{d \mid n} \varphi(n/d) = \sum_{d \mid n} S(d) = n, \]
as desired. □
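Lemma 5.10 is easy to confirm numerically for small $n$. The following Python sketch uses brute-force helpers whose names are chosen for this illustration.

```python
from math import gcd


def euler_phi(n: int) -> int:
    """Number of integers k with 1 <= k <= n and gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def divisors(n: int) -> list:
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def phi_divisor_sum(n: int) -> int:
    """Sum of phi(d) over the positive divisors d of n; Lemma 5.10 says this equals n."""
    return sum(euler_phi(d) for d in divisors(n))
```

For $n = 12$ the individual values $\varphi(d)$ are 1, 1, 2, 2, 2, 4, which sum to 12 as computed above.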
We can now demonstrate the existence of primitive roots modulo a prime $p$. The following
theorem actually makes the stronger assertion that $\mathbb{Z}_p^*$ contains elements of all orders dividing
$p - 1$ (including $p - 1$ itself). Note that Corollary 5.2 implies that no other orders are
permissible.
Theorem 5.11. If $p$ is prime and $d$ is a positive integer dividing $p - 1$, then there are exactly
$\varphi(d)$ elements of order $d$ in $\mathbb{Z}_p^*$.
Proof. Let $d$ be a divisor of $p - 1$, and let $N(d)$ be the number of elements of order $d$ in
$\mathbb{Z}_p^*$. If $N(d) > 0$, then there exists some $a \in \mathbb{Z}_p^*$ of order $d$. The integers $a, a^2, a^3, \ldots, a^d$ are
distinct modulo $p$, since otherwise we would have $a^{j-i} \equiv 1 \pmod{p}$, where $0 < j - i < d$,
violating the definition of order. Moreover, for each $k$ with $1 \le k \le d$ we have
\[ (a^k)^d \equiv (a^d)^k \equiv 1 \pmod{p}, \]
so each $a^k$ is a solution of the congruence $x^d - 1 \equiv 0 \pmod{p}$, and Theorem 5.9 implies that
these are the only solutions. Furthermore, every element of order $d$ satisfies the congruence
and must therefore be a power of $a$. We know from Theorem 5.5 that $a^k$ has order $d$ if and
only if $(k, d) = 1$, so there are exactly $\varphi(d)$ elements of order $d$. Thus we've shown that either
$N(d) = 0$ or $N(d) = \varphi(d)$ whenever $d \mid (p - 1)$. Since there are $p - 1$ elements in $\mathbb{Z}_p^*$, we deduce
from Lemma 5.10 that
\[ \sum_{d \mid (p-1)} N(d) = p - 1 = \sum_{d \mid (p-1)} \varphi(d). \]
Since $N(d) \le \varphi(d)$ for each $d$, we must actually have $N(d) = \varphi(d)$ for each $d$, and this
completes the proof. □
Corollary 5.12. There are exactly $\varphi(p - 1)$ primitive roots in $\mathbb{Z}_p^*$.
Proof. This follows immediately by taking $d = p - 1$ in Theorem 5.11. □
The Lucas primality test. Suppose the integer $n$ has passed a strong pseudoprime test
and is therefore suspected to be prime. It turns out that we can then use primitive roots to
try to prove that $n$ is prime. Suppose that $n$ passes the ordinary pseudoprime test for the
base $a$, so that
\[ a^{n-1} \equiv 1 \pmod{n}, \]
and further that we are able to factor $n - 1$, say $n - 1 = p_1^{e_1} \cdots p_k^{e_k}$. Theorem 5.1 implies
that the order of $a$ modulo $n$ divides $n - 1$, so if we can show that
\[ a^{(n-1)/p_i} \not\equiv 1 \pmod{n} \]
for each $i$ then we may conclude that the order of $a$ is actually $n - 1$. On the other hand, we
know from Euler's Theorem that the order of $a$ cannot exceed $\varphi(n)$, so we have $n - 1 \le \varphi(n)$.
But it follows easily from Corollary 3.22 that this can only happen if $n$ is prime, in which
case $\varphi(n) = n - 1$.
Example 5.13. Use the Lucas test to prove that $n = 631$ is prime.
Solution. We have $3^{630} \equiv 1 \pmod{631}$, so the order of 3 modulo 631 must divide 630.
Moreover we have $630 = 2 \cdot 3^2 \cdot 5 \cdot 7$ and
\[ 3^{(n-1)/2} = 3^{315} \equiv -1 \pmod{631}, \qquad 3^{(n-1)/3} = 3^{210} \equiv -44 \pmod{631}, \]
\[ 3^{(n-1)/5} = 3^{126} \equiv 242 \pmod{631}, \qquad 3^{(n-1)/7} = 3^{90} \equiv 269 \pmod{631}, \]
which shows that the order of 3 modulo 631 is actually equal to 630. We may therefore
conclude that 631 is prime and that 3 is a primitive root modulo 631. □
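The Lucas test can be sketched in Python as follows. The caller is assumed to supply the distinct prime factors of $n - 1$; the sketch checks that they account for all of $n - 1$ but does not verify that they are prime. The function name is chosen for this illustration.

```python
def lucas_proves_prime(n: int, a: int, prime_factors: list) -> bool:
    """True if a has order n - 1 modulo n, which proves that n is prime.

    prime_factors must list every distinct prime dividing n - 1 (their
    primality is assumed, not checked). A False result is inconclusive:
    a different base a might still succeed.
    """
    m = n - 1
    for q in prime_factors:
        while m % q == 0:
            m //= q
    if m != 1:
        raise ValueError("prime_factors does not account for all of n - 1")
    if pow(a, n - 1, n) != 1:
        return False
    return all(pow(a, (n - 1) // q, n) != 1 for q in prime_factors)
```

With $n = 631$, $a = 3$, and prime factors 2, 3, 5, 7, the sketch returns True, in agreement with Example 5.13.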
If we find an element $a$ of order $n - 1$ in $\mathbb{Z}_n^*$, then the above argument shows that $n$ is
prime and hence that $a$ is a primitive root modulo $n$. So the success of the test depends
in part on being able to find primitive roots quickly. However, Corollary 5.12 implies that
there are $\varphi(n - 1)$ primitive roots in $\mathbb{Z}_n^*$ when $n$ is prime, and a bit of elementary analytic
number theory shows that $\varphi(n) \approx \frac{6}{\pi^2} n$ on average. Hence the proportion of numbers $a \le n$
that are primitive roots modulo $n$ averages about $6/\pi^2 \approx 0.608$ for large prime values of $n$.
Thus we have a good chance of finding a suitable $a$ fairly quickly if $n$ is in fact prime.
A more serious issue is that it may not be easy to factor $n - 1$. If we're lucky, it will have
several relatively small prime factors, but there might be a large factor remaining whose
primality needs to be established. In this case, we can iterate the Lucas test until our
numbers $n - 1$ are small enough to be factored by trial division.
The Diffie-Hellman key exchange. The first secure method for public-key cryptography was actually developed about two years before the RSA breakthrough. One of the
fundamental problems with classical cryptography is the difficulty of agreeing on the key for
a particular cipher without having this information intercepted. Diffie and Hellman resolved
this by generating a large prime $p$ and then choosing a primitive root $g$ modulo $p$. Note that
Corollary 5.12 ensures that there are $\varphi(p - 1)$ possible choices for $g$. The pair $(p, g)$ is public information. Now if Alice wants to communicate with Bob, she chooses a number $a$ at
random between 2 and $p - 2$ and sends him $\alpha = g^a$ MOD $p$. Bob then chooses a number $b$
at random between 2 and $p - 2$ and sends Alice $\beta = g^b$ MOD $p$. Since $g$ is a primitive root
modulo $p$, we know that neither $\alpha$ nor $\beta$ will equal 1. We observe that
\[ K = g^{ab} \text{ MOD } p = \beta^a \text{ MOD } p = \alpha^b \text{ MOD } p \]
can now be calculated by both Alice and Bob and can be used as the key for whatever
cryptosystem they employ.
Example 5.14. Using the public prime $p = 197$ and public base $g = 31$, show how Alice and
Bob can agree on a common key for secure communication.
Solution. Suppose that Alice randomly chooses $a = 72$. Then she sends
\[ \alpha = 31^{72} \text{ MOD } 197 = 76 \]
to Bob. If Bob randomly chooses $b = 109$, then he sends
\[ \beta = 31^{109} \text{ MOD } 197 = 147 \]
to Alice. At this point, Alice computes
\[ \beta^{72} \text{ MOD } 197 = 147^{72} \text{ MOD } 197 = 28 \]
and Bob computes
\[ \alpha^{109} \text{ MOD } 197 = 76^{109} \text{ MOD } 197 = 28, \]
so they've agreed on the key $K = 28$. □
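The exchange in Example 5.14 can be reproduced with Python's built-in three-argument `pow`. This is a toy sketch with fixed secret exponents taken from the example; a real implementation would choose them at random and use a far larger prime. The function names are chosen for this illustration.

```python
def dh_public_value(g: int, secret: int, p: int) -> int:
    """The value a party transmits: g^secret MOD p."""
    return pow(g, secret, p)


def dh_shared_key(received: int, secret: int, p: int) -> int:
    """The key a party derives from the other side's transmitted value."""
    return pow(received, secret, p)


p, g = 197, 31                           # public prime and base from Example 5.14
a, b = 72, 109                           # Alice's and Bob's secret exponents
alpha = dh_public_value(g, a, p)         # sent by Alice
beta = dh_public_value(g, b, p)          # sent by Bob
key_alice = dh_shared_key(beta, a, p)
key_bob = dh_shared_key(alpha, b, p)
```

Both parties arrive at the same key because $\beta^a \equiv g^{ab} \equiv \alpha^b \pmod{p}$.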
The natural way for Eve (who is eavesdropping) to obtain the key $K$ would be to solve the
two congruences
\[ g^a \equiv \alpha \pmod{p} \quad \text{and} \quad g^b \equiv \beta \pmod{p} \qquad (5.1) \]
for $a$ and $b$. Solving either one of these is known as the discrete log problem, and there is
no known efficient algorithm for handling it. It is believed that no such algorithm exists,
but this has not been proven. What Eve really needs is an efficient algorithm for finding $g^{ab}$
MOD $p$ from $g^a$ and $g^b$, which is known as the Diffie-Hellman problem. Its solution would
obviously follow from a solution to the discrete log problem, but it's not known whether the
two problems are equivalent. In view of Theorem 5.4, the fact that $g$ is a primitive root
modulo $p$ ensures that the congruences (5.1) have unique solutions $a$ and $b$ between 1 and
$p - 1$ for every choice of $\alpha$ and $\beta$ between 1 and $p - 1$. The uniqueness of the solutions
obviously makes it less likely that Eve will stumble upon one quickly by trial and error.
The ElGamal Cryptosystem. One disadvantage of the Diffie-Hellman method is that
Alice has to wait for a response from Bob before she can calculate the key and initiate secure
communication. However, ElGamal showed that the protocol can be adapted to create a self-contained public-key cryptosystem. In addition to the public prime $p$ and base $g$, suppose
that Alice and Bob publish their numbers $\alpha$ and $\beta$ in a directory. If Alice wants to send a
message $x$ to Bob, she generates a random session key $k$ between 2 and $p - 2$ and sends Bob
\[ t = g^k \text{ MOD } p \quad \text{and} \quad y = \beta^k x \text{ MOD } p. \]
He then recovers the message by computing
\[ y (t^b)^{-1} \text{ MOD } p = (x g^{bk}) \cdot (g^{kb})^{-1} = x, \]
provided that $x \le p - 1$. Longer messages can of course be broken into blocks prior to
encryption.
Example 5.15. Suppose that Alice and Bob use ElGamal with public prime $p = 11881379$
and base $g = 23$, and that Alice has published $\alpha = 10442571$. How can Bob discreetly ask
Alice to tea?
Solution. Bob first needs to pick a random session key, say $k = 101$. He then converts TEA
to digital form, say $x = 200501$, and calculates
\[ t = 23^{101} \text{ MOD } p = 3054634 \quad \text{and} \quad y = 200501 \cdot 10442571^{101} \text{ MOD } p = 3497868. \]
Bob now sends the pair (3054634, 3497868) to Alice, and she recovers the message using her
private key, $a = 8137$:
\[ 3497868 \cdot (3054634^{8137})^{-1} \text{ MOD } p = 3497868 \cdot 7225717 \text{ MOD } p = 200501. \]
□
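A minimal Python sketch of ElGamal encryption and decryption is given below. It assumes Python 3.8 or later, where `pow(v, -1, p)` computes a modular inverse; the function names are chosen for this illustration, and the session key is passed in explicitly rather than generated at random.

```python
def elgamal_encrypt(p: int, g: int, recipient_public: int, x: int, k: int):
    """Return the pair (t, y) = (g^k MOD p, recipient_public^k * x MOD p)."""
    t = pow(g, k, p)
    y = pow(recipient_public, k, p) * x % p
    return t, y


def elgamal_decrypt(p: int, private_key: int, t: int, y: int) -> int:
    """Recover x = y * (t^private_key)^(-1) MOD p."""
    shared = pow(t, private_key, p)
    return y * pow(shared, -1, p) % p


# Round trip with the parameters of Example 5.15; the public value is
# recomputed here from the private key a = 8137 rather than copied.
p, g, a = 11881379, 23, 8137
alpha = pow(g, a, p)
t, y = elgamal_encrypt(p, g, alpha, 200501, 101)
recovered = elgamal_decrypt(p, a, t, y)
```

Decryption works because $t^a \equiv g^{ka} \equiv \alpha^k \pmod{p}$, so multiplying $y$ by the inverse of $t^a$ cancels the mask $\alpha^k$.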
6. Quadratic reciprocity
Quadratic Residues. Having studied linear congruences in §3, it is natural to ask about
solving quadratic congruences. Let $p$ be a prime and let $a \in \mathbb{Z}_p^*$. We say that $a$ is a quadratic
residue modulo $p$ if there exists $x$ such that $x^2 \equiv a \pmod{p}$. If no such $x$ exists, then $a$ is
called a quadratic non-residue modulo $p$. We sometimes denote the sets of quadratic residues
and non-residues in $\mathbb{Z}_p^*$ by $R$ and $N$, respectively.
Example 6.1. Identify the quadratic residues and non-residues in $\mathbb{Z}_5^*$, $\mathbb{Z}_7^*$, and $\mathbb{Z}_{11}^*$.
Solution. In $\mathbb{Z}_5^*$, we have $R = \{1, 4\}$ and $N = \{2, 3\}$. In $\mathbb{Z}_7^*$, we have $R = \{1, 2, 4\}$ and
$N = \{3, 5, 6\}$. In $\mathbb{Z}_{11}^*$, we have $R = \{1, 3, 4, 5, 9\}$ and $N = \{2, 6, 7, 8, 10\}$. □
The next theorem shows that there are always equal numbers of quadratic residues and
non-residues modulo an odd prime.
Theorem 6.2. If $p$ is an odd prime, then $|R| = |N| = \frac{1}{2}(p - 1)$.
Proof. If $x^2 \equiv y^2 \pmod{p}$, then $p$ divides $x^2 - y^2 = (x - y)(x + y)$, so Euclid's Lemma implies
that $p$ divides $x - y$ or $x + y$, and thus $x \equiv \pm y \pmod{p}$. Thus every quadratic residue in $\mathbb{Z}_p^*$
has exactly two distinct square roots, which implies that the set $R = \{x^2 : x \in \mathbb{Z}_p^*\}$ contains
exactly half the elements of $\mathbb{Z}_p^*$. □
How do we determine whether a particular element of $\mathbb{Z}_p^*$ is a quadratic residue? One
answer is given by the following theorem.
Theorem 6.3. (Euler's Criterion) Let $p$ be an odd prime, and let $a \in \mathbb{Z}_p^*$. Then $a$ is a
quadratic residue modulo $p$ if and only if $a^{(p-1)/2} \equiv 1 \pmod{p}$.
Proof. By Theorem 5.11, we know that there exists a primitive root $g$ modulo $p$. If $a$ is a
quadratic residue modulo $p$, then there exists $x \in \mathbb{Z}_p^*$ with $x^2 \equiv a \pmod{p}$. By Theorem
5.4, we have $x \equiv g^k \pmod{p}$ for some integer $k$, and thus $a \equiv g^{2k} \pmod{p}$. It follows that
\[ a^{(p-1)/2} \equiv (g^{2k})^{(p-1)/2} \equiv (g^{p-1})^k \equiv 1 \pmod{p}. \]
Conversely, if $a$ is a quadratic non-residue modulo $p$, then $a$ cannot be an even power of $g$,
so Theorem 5.4 implies that $a \equiv g^{2k+1} \pmod{p}$ for some integer $k$. Thus we have
\[ a^{(p-1)/2} \equiv (g^{2k+1})^{(p-1)/2} \equiv (g^{p-1})^k g^{(p-1)/2} \equiv g^{(p-1)/2} \not\equiv 1 \pmod{p}, \]
since $g$ has order $p - 1$. In fact, we can deduce that $a^{(p-1)/2} \equiv -1 \pmod{p}$, since $a^{(p-1)/2}$ is
a solution of the congruence $x^2 \equiv 1 \pmod{p}$. □
As a result, the congruence $x^2 \equiv a \pmod{p}$ has two solutions if $a^{(p-1)/2} \equiv 1 \pmod{p}$ and
no solutions if $a^{(p-1)/2} \equiv -1 \pmod{p}$.
Example 6.4. Decide whether the congruence $x^2 \equiv 6 \pmod{37}$ has solutions.
Solution. We have
\[ 6^{18} \equiv (6^2)^9 \equiv (-1)^9 \equiv -1 \pmod{37}, \]
so Euler's Criterion implies that the congruence has no solution. □
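Euler's Criterion can be compared against a direct enumeration of squares. The following Python sketch is an illustration only; the function names are chosen for this example and $p$ is assumed to be an odd prime.

```python
def quadratic_residues(p: int) -> set:
    """The set R of nonzero squares modulo the odd prime p, found by enumeration."""
    return {x * x % p for x in range(1, p)}


def is_residue_by_euler(a: int, p: int) -> bool:
    """Euler's Criterion: a in Z_p^* is a quadratic residue iff a^((p-1)/2) = 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1
```

For $p = 37$ and $a = 6$ the criterion returns False, in agreement with Example 6.4, and for $p = 5, 7, 11$ the enumeration reproduces the sets found in Example 6.1.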
When $a$ is an integer and $p$ is an odd prime, we define the Legendre symbol $\left(\frac{a}{p}\right)$ by
\[ \left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } p \mid a \\ 1 & \text{if } a \in R \\ -1 & \text{if } a \in N. \end{cases} \]
Note that this definition only depends on the residue class of $a$ modulo $p$, so replacing $a$ by
$a + kp$ does not change the value. For example, we have $\left(\frac{2}{7}\right) = 1$, $\left(\frac{3}{7}\right) = -1$, and $\left(\frac{7}{7}\right) = 0$.
The Legendre symbol (sometimes read as "$a$ on $p$") has the following useful properties:
Theorem 6.5. Let $a$ and $b$ be integers, and let $p$ be an odd prime. Then
(i) $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$
(ii) $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$
(iii) $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$
(iv) $\left(\frac{a^2}{p}\right) = 1$ if $a$ is not divisible by $p$.
Proof. Fermat's Little Theorem gives $a^{p-1} \equiv 1 \pmod{p}$ when $(a, p) = 1$, so property (i)
follows immediately from Euler's Criterion and Euclid's Lemma. Properties (ii), (iii), and
(iv) follow easily from (i). □
Note that property (ii) implies that $-1$ is a quadratic residue mod $p$ if and only if $p \equiv 1 \pmod{4}$.
For example, the congruence $x^2 \equiv -1$ is solvable modulo 73 but not modulo 71.
The following criterion is a key ingredient in proving the law of quadratic reciprocity,
which provides an efficient method for computing the Legendre symbol.
Theorem 6.6. (Gauss' Criterion) Let $p$ be an odd prime, and let $a$ be a positive integer
not divisible by $p$. For $1 \le j \le \frac{1}{2}(p - 1)$, let $r_j = a(2j - 1)$ MOD $p$, and let $t$ be the number
of $r_j$ that are even. Then we have
\[ \left(\frac{a}{p}\right) = (-1)^t. \]
Example 6.7. Use Gauss' criterion to calculate $\left(\frac{2}{7}\right)$, $\left(\frac{2}{11}\right)$, $\left(\frac{2}{13}\right)$, and $\left(\frac{2}{17}\right)$.
Solution. For $p = 7$, we have $r_1 = 2 \cdot 1 = 2$, $r_2 = 2 \cdot 3 = 6$, and $r_3 = 2 \cdot 5$ MOD $7 = 3$. Hence the
number of even residues is $t = 2$, and Gauss' Criterion gives $\left(\frac{2}{7}\right) = (-1)^2 = 1$. Similarly,
for $p = 11$ we have $r_1 = 2$, $r_2 = 6$, $r_3 = 10$, $r_4 = 3$, and $r_5 = 7$, which yields $t = 3$ and
$\left(\frac{2}{11}\right) = (-1)^3 = -1$. For $p = 13$, we get $r_1 = 2$, $r_2 = 6$, $r_3 = 10$, $r_4 = 1$, $r_5 = 5$, and $r_6 = 9$,
so we again have $t = 3$ and $\left(\frac{2}{13}\right) = -1$. Finally, for $p = 17$ we have $r_1 = 2$, $r_2 = 6$, $r_3 = 10$,
$r_4 = 14$, $r_5 = 1$, $r_6 = 5$, $r_7 = 9$, and $r_8 = 13$, which gives $t = 4$ and $\left(\frac{2}{17}\right) = 1$. □
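The counting procedure in Gauss' Criterion is easy to carry out mechanically. The following Python sketch (function names chosen for this illustration) computes the residues $r_j$, counts the even ones, and can be compared with the value given by Euler's Criterion. It assumes $p$ is an odd prime and $a$ is a positive integer not divisible by $p$.

```python
def gauss_residues(a: int, p: int) -> list:
    """The residues r_j = a(2j - 1) MOD p for 1 <= j <= (p - 1)/2."""
    return [a * (2 * j - 1) % p for j in range(1, (p - 1) // 2 + 1)]


def legendre_by_gauss(a: int, p: int) -> int:
    """(-1)^t, where t is the number of even residues r_j."""
    t = sum(1 for r in gauss_residues(a, p) if r % 2 == 0)
    return (-1) ** t


def legendre_by_euler(a: int, p: int) -> int:
    """Legendre symbol from a^((p-1)/2) MOD p, as in Theorem 6.5 (i)."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r
```

For instance, `gauss_residues(2, 11)` returns `[2, 6, 10, 3, 7]`, which contains three even entries, so the symbol is $-1$ as found in Example 6.7.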
The result of Example 6.7 may be generalized as follows.
Corollary 6.8. If $p$ is an odd prime, then $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$.
Proof. It is an easy exercise to check that $(p^2 - 1)/8$ is even if $p \equiv \pm 1 \pmod{8}$ and odd if
$p \equiv \pm 3 \pmod{8}$. The proof therefore splits into four cases. First of all, suppose that $p \equiv 1 \pmod{8}$,
so that $p = 8k + 1$ for some positive integer $k$. The numbers
\[ 2 \cdot 1, \; 2 \cdot 3, \; 2 \cdot 5, \; \ldots, \; 2(4k - 1) \]
are all less than $p$ (since $8k - 2 < 8k + 1$), so their residues are all clearly even. On the other
hand, the numbers
\[ 2(4k + 1), \; 2(4k + 3), \; 2(4k + 5), \; \ldots, \; 2(8k - 1) \]
all lie between $p$ and $2p$, so their residues are $1, 5, 9, \ldots, 8k - 3$, which are all odd. The
number of even residues in Gauss' criterion is therefore $t = 2k$, since $2j - 1$ ranges from 1
to $4k - 1$ as $j$ ranges from 1 to $2k$, and thus $(2/p) = (-1)^{2k} = 1$. Next suppose that $p \equiv 3 \pmod{8}$,
so that $p = 8k + 3$. Then the numbers
\[ 2 \cdot 1, \; 2 \cdot 3, \; 2 \cdot 5, \; \ldots, \; 2(4k + 1) \]
are all less than $p$ (since $8k + 2 < 8k + 3$), so their residues are even. The numbers
\[ 2(4k + 3), \; 2(4k + 5), \; 2(4k + 7), \; \ldots, \; 2(8k + 1) \]
all lie between $p$ and $2p$, so their residues are $3, 7, 11, \ldots, 8k - 1$, which are all odd. We
therefore have $t = 2k + 1$ and hence $(2/p) = (-1)^{2k+1} = -1$. The remaining two cases are
left as exercises. □
Proof of Gauss' criterion: Write $m = \frac{1}{2}(p - 1)$. We re-index the residues so that
$r_1, r_2, \ldots, r_t$ are even and $r_{t+1}, r_{t+2}, \ldots, r_m$ are odd. Let $c_1, c_2, \ldots, c_m$ be the positive odd
integers less than $p$, re-ordered so that $r_i = a c_i$ MOD $p$. The numbers
\[ p - r_1, \; p - r_2, \; \ldots, \; p - r_t, \; r_{t+1}, \; r_{t+2}, \; \ldots, \; r_m \]
are positive odd integers less than $p$; we claim that they are distinct and hence a re-ordering
of $c_1, \ldots, c_m$. To show this, we consider three cases:
(i) If $r_i = r_j$, where $t + 1 \le i, j \le m$, then $a c_i \equiv a c_j \pmod{p}$, so Lemma 3.3 gives $c_i \equiv c_j \pmod{p}$.
But $c_1, \ldots, c_m$ are distinct positive integers less than $p$, so we deduce that $i = j$.
(ii) If $p - r_i = p - r_j$, where $1 \le i, j \le t$, then $r_i = r_j$, so the above argument gives $i = j$.
(iii) If $p - r_i = r_j$, where $1 \le i \le t$ and $t + 1 \le j \le m$, then $r_i + r_j \equiv 0 \pmod{p}$, so
$a(c_i + c_j) \equiv 0 \pmod{p}$, and thus $c_i + c_j \equiv 0 \pmod{p}$. Since $0 < c_i + c_j < 2p$, it follows that
$c_i + c_j = p$, which is impossible since $c_i + c_j$ is even.
We therefore have
\[ c_1 \cdots c_m \equiv (p - r_1) \cdots (p - r_t) \, r_{t+1} \cdots r_m \equiv (-1)^t r_1 \cdots r_m \equiv (-1)^t a^m (c_1 \cdots c_m) \pmod{p}. \]
Since $p$ does not divide $c_1 \cdots c_m$, we deduce that $a^m \equiv (-1)^t \pmod{p}$, and the result now
follows from part (i) of Theorem 6.5. □
We are now ready to state the main theorem of this section, which is one of the most
important and beautiful results in elementary number theory.
Theorem 6.9. (Quadratic Reciprocity) If $p$ and $q$ are distinct odd primes, then
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{(p-1)(q-1)}{4}} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4} \text{ or } q \equiv 1 \pmod{4} \\ -1 & \text{if } p \equiv q \equiv 3 \pmod{4}. \end{cases} \]
The proof of Theorem 6.9 uses Gauss' criterion but requires a somewhat technical argument to count the even residues $q(2j - 1)$ MOD $p$ and $p(2j - 1)$ MOD $q$. There are actually
many ways of proving quadratic reciprocity; over 200 different proofs have appeared in print
since Gauss' original work in the early 1800s. Before launching into a proof, we illustrate
with some typical applications.
Example 6.10. Use quadratic reciprocity to calculate $\left(\frac{11}{31}\right)$.
Solution. Since 11 and 31 are both primes congruent to 3 mod 4, quadratic reciprocity gives
$\left(\frac{11}{31}\right) = -\left(\frac{31}{11}\right)$. Now since $31 \equiv 9 \pmod{11}$, we have $\left(\frac{31}{11}\right) = \left(\frac{9}{11}\right) = \left(\frac{3^2}{11}\right) = 1$. We therefore
conclude that $\left(\frac{11}{31}\right) = -1$ and hence that 11 is a quadratic non-residue modulo 31. □
Example 6.11. Use quadratic reciprocity to calculate $\left(\frac{42}{61}\right)$.
Solution. We first apply Theorem 6.5 (iii) to write
\[ \left(\frac{42}{61}\right) = \left(\frac{2}{61}\right)\left(\frac{3}{61}\right)\left(\frac{7}{61}\right). \]
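Since $61 \equiv 5 \pmod{8}$, Corollary 6.8 gives $\left(\frac{2}{61}\right) = -1$. Next, since $61 \equiv 1 \pmod{4}$, we may
apply quadratic reciprocity to obtain
\[ \left(\frac{3}{61}\right) = \left(\frac{61}{3}\right) = \left(\frac{1}{3}\right) = 1 \quad \text{and} \quad \left(\frac{7}{61}\right) = \left(\frac{61}{7}\right) = \left(\frac{5}{7}\right) = -1, \]
by the result of Example 6.1. Thus we conclude that $\left(\frac{42}{61}\right) = (-1) \cdot (1) \cdot (-1) = 1$, and hence
that 42 is a quadratic residue modulo 61. □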
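The two examples above, and the reciprocity law itself for small primes, can be checked numerically. The sketch below computes Legendre symbols from Theorem 6.5 (i) rather than by reciprocity, so it serves as an independent check; the function names and the list of test primes are assumptions made for this illustration.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via a^((p-1)/2) MOD p."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def reciprocity_holds(p: int, q: int) -> bool:
    """Check (p/q)(q/p) = (-1)^((p-1)(q-1)/4) for distinct odd primes p and q."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign


ODD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61]
```

For example, `legendre(11, 31)` returns $-1$ and `legendre(42, 61)` returns 1, matching Examples 6.10 and 6.11.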
Quadratic reciprocity can be used to determine a general criterion for 3 to be a quadratic
residue modulo a prime $p > 3$. The result is somewhat reminiscent of the analogous criterion
for $(2/p)$ given in Corollary 6.8, except that here the conclusion depends on the residue class
of $p$ modulo 12 rather than modulo 8.
Corollary 6.12. One has
\[ \left(\frac{3}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{12} \\ -1 & \text{if } p \equiv \pm 5 \pmod{12}. \end{cases} \]
Proof. If π β‘ 1 (mod 12), then π β‘ 1 (mod 3) and π β‘ 1 (mod 4), so quadratic reciprocity
gives
( ) ( ) ( )
3
π
1
=
=
= 1.
π
3
3
Similarly, if π β‘ β1 (mod 12), then π β‘ 2 (mod 3) and π β‘ 3 (mod 4), so quadratic
reciprocity yields
( )
( )
( )
3
π
2
=β
=β
= β(β1) = 1.
π
3
3
We leave the remaining two cases as exercises.
β‘
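Corollary 6.12 is easy to test empirically. The sketch below compares the mod-12 rule with a direct computation of $\left(\frac{3}{p}\right)$ by Euler's criterion for all primes $5 \le p < 200$; the helper names and the bound 200 are arbitrary choices made for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


def rule_mod_12(p: int) -> int:
    """Prediction of Corollary 6.12: +1 if p = +-1 (mod 12), -1 if p = +-5 (mod 12)."""
    return 1 if p % 12 in (1, 11) else -1


mismatches = [p for p in range(5, 200)
              if is_prime(p) and legendre(3, p) != rule_mod_12(p)]
print(mismatches)  # an empty list means the rule held for every prime tested
```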
A proof of quadratic reciprocity. We now describe an argument that leads from Gauss' criterion to the conclusion of Theorem 6.9. Let $p$ and $q$ be odd primes, and define
\[ r_i = q(2i-1) \text{ MOD } p \quad \left(1 \le i \le \tfrac{p-1}{2}\right) \quad\text{and}\quad s_j = p(2j-1) \text{ MOD } q \quad \left(1 \le j \le \tfrac{q-1}{2}\right). \]
By Theorem 6.6, we have $\left(\frac{q}{p}\right) = (-1)^t$, where $t$ is the number of even $r_i$, and $\left(\frac{p}{q}\right) = (-1)^u$, where $u$ is the number of even $s_j$. It follows that
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{t+u}. \tag{6.1} \]
It therefore suffices to show that $t + u$ is odd if and only if $p \equiv q \equiv 3 \pmod{4}$. We now let $S$ denote the set of all integers of the form $x = qm - pn$, where $m$ and $n$ are odd integers with $1 \le m < p$ and $1 \le n < q$. For example, if $p = 7$ and $q = 11$, each element of $S$ has the form $x = 11m - 7n$ where $m \in \{1, 3, 5\}$ and $n \in \{1, 3, 5, 7, 9\}$. Taking $m = 1$ gives $x = 4, -10, -24, -38, -52$, while $m = 3$ gives $x = 26, 12, -2, -16, -30$, and finally $m = 5$ gives $x = 48, 34, 20, 6, -8$, for a total of 15 elements.
Lemma 6.13. The elements of $S$ are nonzero even integers, and one has $|S| = \frac{1}{4}(p-1)(q-1)$.
Proof. Suppose that $x = qm - pn \in S$. Then $qm$ and $pn$ are odd, so $x$ is clearly even. Moreover, if $qm = pn$, then $p \mid qm$, and hence $p \mid m$, which is impossible since $1 \le m < p$. Finally, if $qm - pn = qm' - pn'$, then $q(m - m') = p(n - n')$, which implies that $p \mid (m - m')$ and hence that $m = m'$ and $n = n'$, since $-p < m - m' < p$. Hence these expressions are all distinct, and $|S|$ is just the number of ordered pairs $(m, n)$. □
Next, we let $T = \{k \in S : -q < k < p\}$. For example, when $p = 7$ and $q = 11$, we have $T = \{-10, -8, -2, 4, 6\}$.
Lemma 6.14. One has $|T| = t + u$.
Proof. First suppose that $k \in T$ and that $0 < k < p$. Then $k \equiv qm \pmod{p}$ for some odd integer $m$ with $1 \le m < p$, and we can write $m = 2i - 1$ with $1 \le i \le \frac{1}{2}(p-1)$. But since $k < p$, we must actually have $k = r_i$, and Lemma 6.13 shows that this is one of the even residues counted by $t$. On the other hand, if $r_i = q(2i-1)$ MOD $p$ is even, then $0 < r_i < p$, and $r_i \equiv qm \pmod{p}$, where $m = 2i - 1$ is odd and $1 \le m < p$. It then follows that $qm - r_i = pn$ for some $n \in \mathbb{Z}$, and clearly $n$ must be odd and positive. Moreover, $pn < qm < pq$ and hence $n < q$. This shows that $r_i \in T$. We may therefore conclude that the elements $k \in T$ with $0 < k < p$ are precisely the even residues $r_i$ counted by $t$. A similar argument shows that the elements $k \in T$ with $-q < k < 0$ are precisely the negatives of the even residues $s_j$ counted by $u$, and the lemma follows immediately. □
To determine whether $t + u$ is even, we attempt to pair up the elements of $T$ via the correspondence
\[ qm - pn \mapsto q(p - 1 - m) - p(q - 1 - n). \tag{6.2} \]
For example, when $p = 7$ and $q = 11$, we have $11m - 7n \mapsto 11(6 - m) - 7(10 - n)$, which gives the pairs $(4, -8)$, $(-10, 6)$, and $(-2, -2)$. On the other hand, if $p = 5$ and $q = 7$, then $S = \{-18, -8, -4, 2, 6, 16\}$ and $T = \{-4, 2\}$, so the correspondence $7m - 5n \mapsto 7(4 - m) - 5(6 - n)$ yields the obvious pair $(2, -4)$. We now aim to show that this correspondence gives the desired parity result for $|T|$.
Lemma 6.15. The pairs arising from the correspondence (6.2) consist of distinct elements unless $p \equiv q \equiv 3 \pmod{4}$, in which case a single element is paired with itself.
Proof. We first note that if $qm - pn \in T$ then one has
\[ -q = -q + p - p < -q + p - (qm - pn) < -q + p + q = p, \]
which shows that $q(p - 1 - m) - p(q - 1 - n) = -q + p - (qm - pn) \in T$. Moreover, Lemma 6.13 shows that the expressions $qm - pn$ are distinct, so if an element is paired with itself in (6.2), we must have $m = p - 1 - m$ and $n = q - 1 - n$, which gives $m = \frac{1}{2}(p-1)$ and $n = \frac{1}{2}(q-1)$. But these values are both odd if and only if $p \equiv q \equiv 3 \pmod{4}$, and in that case the corresponding element $\frac{1}{2}(p - q)$ does lie in $T$, since $-q < \frac{1}{2}(p-q) < p$. This completes the proof. □
The proof of quadratic reciprocity is now within our grasp. By Lemmas 6.14 and 6.15, we see that $|T| = t + u$ is odd if and only if $p \equiv q \equiv 3 \pmod{4}$, so the result follows from (6.1).
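The counting argument can be checked on small primes. The following sketch builds the set of integers $qm - pn$ (with $m$, $n$ odd, $1 \le m < p$, $1 \le n < q$) and its subset lying strictly between $-q$ and $p$, then compares the size of that subset with the number of even residues among $q(2i-1)$ MOD $p$ and $p(2j-1)$ MOD $q$. The names `S`, `T`, `build_sets`, and `even_residues` are labels chosen for this illustration.

```python
def even_residues(a: int, p: int) -> list:
    """Even values among a*(2i-1) mod p for 1 <= i <= (p-1)/2."""
    residues = [(a * (2 * i - 1)) % p for i in range(1, (p - 1) // 2 + 1)]
    return [r for r in residues if r % 2 == 0]


def build_sets(p: int, q: int):
    """All q*m - p*n with m, n odd in range, and the subset strictly between -q and p."""
    S = {q * m - p * n for m in range(1, p, 2) for n in range(1, q, 2)}
    T = {k for k in S if -q < k < p}
    return S, T


for p, q in [(7, 11), (5, 7), (3, 5), (13, 17)]:
    S, T = build_sets(p, q)
    t = len(even_residues(q, p))  # even residues among q(2i-1) MOD p
    u = len(even_residues(p, q))  # even residues among p(2j-1) MOD q
    both_3_mod_4 = (p % 4 == 3 and q % 4 == 3)
    print(p, q, len(S), sorted(T), t + u, both_3_mod_4)
```

For $p = 7$ and $q = 11$ this reproduces the 15 elements and the subset $\{-10, -8, -2, 4, 6\}$ from the worked example.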
The Jacobi symbol. There is a generalization of the Legendre symbol, called the Jacobi symbol, that is defined whenever the bottom entry is odd. If $n = p_1 \cdots p_k$, where the $p_i$ are (not necessarily distinct) primes, then we define
\[ \left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right) \cdots \left(\frac{a}{p_k}\right), \]
where the factors on the right are Legendre symbols. It turns out that the Jacobi symbol enjoys many of the same properties as the Legendre symbol, including the law of quadratic reciprocity.
Theorem 6.16. The results of Theorem 6.5 (ii), (iii), Corollary 6.8, and Theorem 6.9 hold with the Legendre symbol replaced by the Jacobi symbol and the odd primes $p$ and $q$ replaced by odd positive integers (relatively prime, in the case of Theorem 6.9).
Proof. It suffices to write out the prime factorizations of the odd integers in question and apply the definition of the Jacobi symbol in combination with the corresponding properties of the Legendre symbol. We leave the details as an exercise. □
Note that part (iv) of Theorem 6.5 does not quite hold for the Jacobi symbol. The correct analogue is that $(a^2/n) = 1$ if $(a, n) = 1$. Theorem 6.16 often allows us to perform computations with Legendre symbols more efficiently than was previously possible. For instance, in Example 6.11, we could apply quadratic reciprocity for Jacobi symbols to obtain
\[ \left(\frac{21}{61}\right) = \left(\frac{61}{21}\right) = \left(\frac{19}{21}\right) = \left(\frac{21}{19}\right) = \left(\frac{2}{19}\right) = -1 \]
rather than dealing with $(3/61)$ and $(7/61)$ separately. Unfortunately, the Jacobi symbol $(a/n)$ does not tell us whether $a$ is a square mod $n$. For example, $(2/9) = (2/3)(2/3) = 1$, but 2 is not a square modulo 9.
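The reciprocity law for Jacobi symbols leads to an evaluation procedure that never requires factoring the bottom entry. The sketch below is the standard loop (remove factors of 2 using the rule for $(2/n)$, flip the symbol, reduce); the function name `jacobi` is ours, and the code is offered as an illustration rather than as part of the notes.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2: (2/n) = -1 exactly when n = 3 or 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: swapping costs a sign only when both entries are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0  # a common factor greater than 1 gives 0


print(jacobi(21, 61), jacobi(42, 61), jacobi(2, 9))
```

The last value illustrates the caveat above: the symbol $(2/9)$ equals 1 even though 2 is not a square modulo 9.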
7. Some diophantine equations
A diophantine equation usually refers to a polynomial equation with integer coefficients to which we seek integer solutions. As a simple example, consider the equation
\[ 9x + 6y = 20. \]
This is a linear diophantine equation in two variables. A moment's thought reveals that this equation has no integer solutions, since $9x + 6y$ is divisible by 3 for any integers $x$ and $y$ while 20 is not divisible by 3. From another point of view, notice that solving the above equation is equivalent to solving the congruence $9x \equiv 20 \pmod{6}$, and we know from Theorem 3.8 that this has no solution since $(9, 6) = 3$ does not divide 20.
On the other hand, the equation $2x + 3y = 7$ has infinitely many integer solutions, given by $x = -1 + 3k$ and $y = 3 - 2k$ for any $k \in \mathbb{Z}$. The following theorem characterizes the solutions of the linear diophantine equation $ax + by = c$.
Theorem 7.1. Let $a$, $b$, and $c$ be integers, and write $d = (a, b)$. The equation $ax + by = c$ has integer solutions if and only if $d \mid c$. Moreover, the set of solutions is given by
\[ x = x_0 + kb/d, \qquad y = y_0 - ka/d \qquad (k \in \mathbb{Z}), \]
where $(x_0, y_0)$ is any particular solution.
Proof. The equation $ax + by = c$ is equivalent to the congruence $ax \equiv c \pmod{b}$, and Theorem 3.8 shows that this is solvable if and only if $(a, b)$ divides $c$. If $x_0$ is any solution of the congruence, then we have $ax_0 = c - by_0$ for some integer $y_0$, so $(x_0, y_0)$ solves the equation. Moreover, any solution $(x, y)$ satisfies the congruences
\[ \frac{a}{d}\, x \equiv \frac{c}{d} \pmod{\tfrac{b}{d}} \quad\text{and}\quad \frac{b}{d}\, y \equiv \frac{c}{d} \pmod{\tfrac{a}{d}}, \]
which have unique solutions modulo $b/d$ and $a/d$, respectively. Therefore we have $x = x_0 + kb/d$ and $y = y_0 + la/d$ for some integers $k$ and $l$. Substituting into the equation $ax + by = c$, we find that $(x, y)$ is a solution if and only if $l = -k$. □
Example 7.2. Describe all integer solutions of the diophantine equations $35x + 49y = 64$ and $35x + 49y = 63$.
Solution. In view of Theorem 7.1, the equation $35x + 49y = 64$ has no integer solutions, since $(35, 49) = 7$ does not divide 64, but the equation $35x + 49y = 63$ has solutions $x = -1 + 7k$ and $y = 2 - 5k$ for every $k \in \mathbb{Z}$. □
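A particular solution of a linear diophantine equation can be produced mechanically. The sketch below assumes the extended Euclidean algorithm as the method (the notes do not prescribe one), and the names `extended_gcd` and `solve_linear` are hypothetical helpers. It returns one solution $(x_0, y_0)$ together with the step sizes $b/d$ and $a/d$ that generate all the others.

```python
def extended_gcd(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and a*x + b*y = d."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    if old_r < 0:  # normalise the sign of the gcd
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def solve_linear(a: int, b: int, c: int):
    """Solve a*x + b*y = c over the integers, assuming a and b are not both zero.

    Returns None if there is no solution; otherwise (x0, y0, step_x, step_y),
    meaning x = x0 + k*step_x and y = y0 - k*step_y for every integer k.
    """
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None
    scale = c // d
    return x * scale, y * scale, b // d, a // d


print(solve_linear(35, 49, 64))
print(solve_linear(35, 49, 63))
```

The particular solution returned need not be the one quoted in Example 7.2, but it differs from it by a multiple of the steps $(7, -5)$.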
Notice that the solubility of our linear diophantine equation was closely connected to the solubility of the underlying congruences. This is a fairly general principle that is useful to keep in mind when studying higher degree equations.
Example 7.3. Determine all integer solutions of the diophantine equation $x^2 + y^2 = 1999$.
Solution. Notice that 0 and 1 are the only perfect squares modulo 4, and no two of these add up to 3, which is congruent to 1999 modulo 4. We therefore conclude that the equation has no integer solutions. □
Example 7.4. Determine all integer solutions of the equation $x^2 + 7y^2 + 35z^2 = 70493$.
Solution. If $(x, y, z)$ were a solution, then $x$ would satisfy the congruence $x^2 \equiv 70493 \equiv 3 \pmod{7}$. But 3 is a quadratic non-residue modulo 7, so we conclude that there are no integer solutions. □
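The congruence obstructions in Examples 7.3 and 7.4 amount to listing the squares modulo a small integer. A short sketch of that check, with `squares_mod` a helper name chosen here:

```python
def squares_mod(m: int) -> set:
    """The set of residues x^2 mod m."""
    return {x * x % m for x in range(m)}


# Example 7.3: sums of two squares modulo 4 never equal 1999 mod 4.
sums_mod_4 = {(a + b) % 4 for a in squares_mod(4) for b in squares_mod(4)}
print(squares_mod(4), sums_mod_4, 1999 % 4)

# Example 7.4: reducing modulo 7 leaves x^2 = 70493 (mod 7).
print(squares_mod(7), 70493 % 7)
```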
Pythagorean triples. A famous quadratic diophantine equation in three variables is the Pythagorean equation
\[ x^2 + y^2 = z^2. \tag{7.1} \]
Notice that this equation has many "trivial" solutions, $(0, y, \pm y)$ and $(x, 0, \pm x)$, obtained by setting one of the variables on the left hand side equal to zero. These solutions are not very interesting. Of course, there are some well-known right triangles with integer side lengths, which give non-trivial solutions such as $(3, 4, 5)$ and $(5, 12, 13)$. A solution to (7.1) is sometimes called a Pythagorean triple.
The equation (7.1) also has a special property called homogeneity, which means that if $(x, y, z)$ is a solution, then so is $(kx, ky, kz)$ for any integer $k$. For this reason, we usually restrict attention to the so-called primitive solutions, in which $x$, $y$, and $z$ have no non-trivial common factors. It turns out that we can express all primitive solutions of this equation as a two-parameter family. It is easy to show that in any primitive Pythagorean triple we must have $z$ odd and either $x$ or $y$ even. By interchanging $x$ and $y$ if necessary, we may suppose without loss of generality that $x$ is even.
Theorem 7.5. If $(x, y, z)$ is a primitive Pythagorean triple, where $x$ is even and $x$, $y$, and $z$ are positive, then
\[ x = 2st, \qquad y = s^2 - t^2, \qquad\text{and}\qquad z = s^2 + t^2, \]
for some relatively prime positive integers $s$ and $t$. Conversely, if $s$ and $t$ are relatively prime, $s > t > 0$, and $s$ or $t$ is even, then $(2st, s^2 - t^2, s^2 + t^2)$ is a primitive Pythagorean triple.
Proof. Let $(x, y, z)$ be a positive primitive Pythagorean triple with $x$ even and $y$ and $z$ odd. Then we have
\[ x^2 = z^2 - y^2 = (z + y)(z - y), \]
and both $z + y$ and $z - y$ are even, so we can write
\[ \left(\frac{x}{2}\right)^2 = \left(\frac{z+y}{2}\right)\left(\frac{z-y}{2}\right), \]
where all three factors are integers. Any common divisor of $(z + y)/2$ and $(z - y)/2$ would have to divide their sum and difference, $z$ and $y$, but we know that $z$ and $y$ are relatively prime and hence so are $(z + y)/2$ and $(z - y)/2$. It follows easily that both $(z + y)/2$ and $(z - y)/2$ must be perfect squares, say
\[ \frac{z+y}{2} = s^2 \quad\text{and}\quad \frac{z-y}{2} = t^2. \]
The equations $x = 2st$, $y = s^2 - t^2$, and $z = s^2 + t^2$ follow immediately. Conversely, it is easy to check that
\[ (2st)^2 + (s^2 - t^2)^2 = (s^2 + t^2)^2. \]
Moreover, any odd prime dividing both $2st$ and $s^2 - t^2$ would have to divide either $s$ or $t$ and either $s + t$ or $s - t$, and in all of these cases the prime would divide both $s$ and $t$. In addition, $s^2 - t^2$ is odd when exactly one of $s$ and $t$ is even, so 2 is not a common factor either. Thus if $(s, t) = 1$ and $s$ or $t$ is even, then the above triple is primitive. □
Example 7.6. Find all positive primitive Pythagorean triples with one of the variables equal to 15.
Solution. Since 15 is odd and is not the sum of two squares, Theorem 7.5 implies that $y = s^2 - t^2$ is the only variable that could take the value 15. So we seek positive integers $s > t$ such that $s^2 - t^2 = (s + t)(s - t) = 15$. Clearly, the only possibilities are
\[ s + t = 15, \quad s - t = 1 \qquad\text{and}\qquad s + t = 5, \quad s - t = 3, \]
which yield $s = 8$, $t = 7$ and $s = 4$, $t = 1$. Hence the only Pythagorean triples of this type are $(112, 15, 113)$ and $(8, 15, 17)$. □
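The parametrization in Theorem 7.5 translates directly into a generator of primitive triples. The sketch below enumerates $(2st, s^2 - t^2, s^2 + t^2)$ for relatively prime $s > t > 0$ of opposite parity and then filters for the value 15; the function name and the bound on $s$ are choices made for this illustration.

```python
from math import gcd


def primitive_triples(max_s: int) -> list:
    """Triples (2st, s^2 - t^2, s^2 + t^2) with gcd(s, t) = 1, s > t > 0, s - t odd."""
    triples = []
    for s in range(2, max_s + 1):
        for t in range(1, s):
            if gcd(s, t) == 1 and (s - t) % 2 == 1:
                triples.append((2 * s * t, s * s - t * t, s * s + t * t))
    return triples


# Any triple containing 15 has s^2 - t^2 = 15, which forces s <= 8.
with_15 = [triple for triple in primitive_triples(8) if 15 in triple]
print(with_15)
```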
Theorem 7.7. The equation $x^4 + y^4 = z^2$ has no integer solutions with $xyz \ne 0$.
Proof. If $(x, y, z)$ is a solution with $\gcd(x, y) = d$, then $z^2$ is divisible by $d^4$ and hence $z$ is divisible by $d^2$, so we obtain a new solution $(x/d, y/d, z/d^2)$ with the first two variables relatively prime. Therefore we may suppose that $(x, y, z)$ is a solution with $x$, $y$, and $z$ positive, $\gcd(x, y) = 1$ and $z$ as small as possible. We will show how to construct a solution with a smaller value of $z$, thereby producing a contradiction.
Since $(x^2, y^2, z)$ is a positive primitive Pythagorean triple, we may apply Theorem 7.5 (after possibly interchanging $x$ and $y$) to write
\[ x^2 = 2st, \qquad y^2 = s^2 - t^2, \qquad\text{and}\qquad z = s^2 + t^2, \]
where $s > t > 0$ and $\gcd(s, t) = 1$. Since $y$ is odd, it follows that $s$ is odd and $t$ is even, so we in fact have $\gcd(s, 2t) = 1$, and thus $s$ and $2t$ are both perfect squares, say
\[ s = u^2 \quad\text{and}\quad 2t = v^2. \]
Furthermore, we have $t^2 + y^2 = s^2$, so $(t, y, s)$ is another primitive Pythagorean triple, and we can apply Theorem 7.5 again to write
\[ t = 2ab, \qquad y = a^2 - b^2, \qquad\text{and}\qquad s = a^2 + b^2, \]
where $a > b > 0$ and $\gcd(a, b) = 1$. We now have $ab = t/2 = (v/2)^2$, which implies that $a$ and $b$ are both perfect squares, say $a = m^2$ and $b = n^2$. But now
\[ m^4 + n^4 = a^2 + b^2 = s = u^2 \quad\text{and}\quad u^2 = s < (s^2 + t^2)^2 = z^2, \]
so $u < z$, and taking $k = u$ gives a new solution $(m, n, k)$ with $\gcd(m, n) = 1$ and $k < z$. □
Corollary 7.8. The equation $x^4 + y^4 = z^4$ has no integer solutions with $xyz \ne 0$.
Proof. If $(x, y, z)$ were a solution with $xyz \ne 0$, then we would have $x^4 + y^4 = (z^2)^2$, contradicting Theorem 7.7. □
Theorem 7.9. (Fermat's Last Theorem) If $n$ is an integer with $n \ge 3$, then the equation $x^n + y^n = z^n$ has no integer solutions with $xyz \ne 0$.
Note that this follows easily from Corollary 7.8 when $n$ is a multiple of 4. The proof for arbitrary $n$ is extremely hard and was completed by Wiles only in 1995. The following symmetric generalization of Fermat's Last Theorem is still unsolved.
Conjecture 7.10. If $n$ is an integer with $n \ge 5$, then the equation $x^n + y^n = z^n + w^n$ has no non-trivial integer solutions.
Notice that there are non-trivial solutions to this equation when $n = 2$, 3, and 4. For instance, one has
\[ 1^2 + 7^2 = 5^2 + 5^2, \qquad 1^3 + 12^3 = 9^3 + 10^3, \qquad\text{and}\qquad 133^4 + 134^4 = 158^4 + 59^4. \]
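The three identities quoted for $n = 2$, 3, and 4 are quick to confirm with exact integer arithmetic. A minimal check (the dictionary layout is just a convenience for this illustration):

```python
# Exponent n mapped to (x, y, z, w) with x^n + y^n = z^n + w^n.
examples = {
    2: (1, 7, 5, 5),
    3: (1, 12, 9, 10),
    4: (133, 134, 158, 59),
}

for n, (x, y, z, w) in examples.items():
    left = x**n + y**n
    right = z**n + w**n
    print(n, left, right, left == right)
```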
Equations in "many" variables. A general theme illustrated above is that diophantine equations in few variables (relative to the degree) tend to have few, if any, non-trivial solutions. Conversely, equations in sufficiently many variables (relative to the degree) tend to have many non-trivial solutions. One of the most interesting problems here is to try to quantify the phrase "sufficiently many." As an example, we look at the problem of representing integers as sums of $k$th powers.
Theorem 7.11. (Lagrange's four squares theorem) Every positive integer can be written as the sum of four squares.
For example, we have $31 = 5^2 + 2^2 + 1^2 + 1^2$ and $120 = 10^2 + 4^2 + 2^2 + 0^2$. We leave it as an exercise to show that there are infinitely many positive integers that cannot be represented as sums of three squares.
Lemma 7.12. If $m$ and $n$ are sums of four squares, then so is $mn$.
Proof. Suppose that $m = x^2 + y^2 + z^2 + w^2$ and $n = a^2 + b^2 + c^2 + d^2$. Then it is easy (but somewhat tedious) to verify that
\[ mn = (xa + yb + zc + wd)^2 + (xb - ya + zd - wc)^2 + (xc - za + wb - yd)^2 + (xd - wa + yc - zb)^2. \]
We leave this algebra as an exercise. □
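The algebra behind Lemma 7.12 can at least be spot-checked numerically before it is worked out by hand. The sketch below evaluates the four bilinear expressions on random integers and compares the sum of their squares with the product; the function name `four_square_product` is ours, and random testing is of course no substitute for the proof.

```python
import random


def four_square_product(x, y, z, w, a, b, c, d):
    """The four expressions whose squares sum to (x^2+y^2+z^2+w^2)(a^2+b^2+c^2+d^2)."""
    return (
        x * a + y * b + z * c + w * d,
        x * b - y * a + z * d - w * c,
        x * c - z * a + w * b - y * d,
        x * d - w * a + y * c - z * b,
    )


random.seed(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    values = [random.randint(-50, 50) for _ in range(8)]
    m = sum(v * v for v in values[:4])
    n = sum(v * v for v in values[4:])
    assert sum(term * term for term in four_square_product(*values)) == m * n
print("identity held on 1000 random inputs")
```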
Lemma 7.13. If $p$ is an odd prime, then there exist integers $x$, $y$, and $m$, with $0 < m < p$, such that
\[ x^2 + y^2 + 1 = mp. \]
Proof. It suffices to find integers $x$ and $y$ with
\[ x^2 + y^2 + 1 \equiv 0 \pmod{p} \quad\text{and}\quad x^2 + y^2 + 1 < p^2. \tag{7.2} \]
We divide the proof into two cases.
If $p \equiv 1 \pmod{4}$, then Theorem 6.5 (ii) implies that $\left(\frac{-1}{p}\right) = 1$, so $-1$ is a quadratic residue modulo $p$. Therefore we can find $x$ with $0 < x < p/2$ such that $x^2 \equiv -1 \pmod{p}$, and (7.2) is satisfied with $y = 0$.
If $p \equiv 3 \pmod{4}$, then Theorem 6.5 (ii) implies that $\left(\frac{-1}{p}\right) = -1$. Now let $a$ be the smallest quadratic non-residue modulo $p$. Then we have
\[ \left(\frac{-a}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{a}{p}\right) = (-1)(-1) = 1 \]
by Theorem 6.5 (iii), so $-a$ is a quadratic residue modulo $p$. Therefore we can find $x$ with $0 < x < p/2$ such that $x^2 \equiv -a \pmod{p}$. Furthermore, the minimality of $a$ ensures that $a - 1$ is a quadratic residue modulo $p$, so we can find $y$ with $0 < y < p/2$ such that $y^2 \equiv a - 1 \pmod{p}$. It is easy to check that $x$ and $y$ satisfy (7.2), so this completes the proof. □
Proof of Lagrange's Theorem: In view of Lemma 7.12 and the fact that $2 = 1^2 + 1^2 + 0^2 + 0^2$, it suffices to prove that every odd prime $p$ is the sum of four squares. By Lemma 7.13, we can find integers $x$, $y$, $z$, and $w$ such that
\[ x^2 + y^2 + z^2 + w^2 = mp \tag{7.3} \]
for some positive integer $m < p$. For instance, take $x$ and $y$ as in the lemma, $z = 1$, and $w = 0$. We employ a descent argument to show that we can find a solution to (7.3) with $m = 1$. To do this, we suppose that we have a solution with $m > 1$ and demonstrate how to construct a solution with a smaller value of $m$. First of all, if $m$ is even, then an even number of the variables on the left-hand side are odd, so by relabeling if necessary we may suppose that $x \pm y$ and $z \pm w$ are even, and
\[ \left(\frac{x+y}{2}\right)^2 + \left(\frac{x-y}{2}\right)^2 + \left(\frac{z+w}{2}\right)^2 + \left(\frac{z-w}{2}\right)^2 = (m/2)p. \]
If $m/2$ is even, then we can repeat the argument until we obtain a solution to (7.3) with $m$ odd, so we may suppose from now on that $m$ is odd. Now let $a$, $b$, $c$, and $d$ denote the least absolute value residues of $x$, $y$, $z$, and $w$ modulo $m$. That is,
\[ a \equiv x \pmod{m}, \quad b \equiv y \pmod{m}, \quad c \equiv z \pmod{m}, \quad d \equiv w \pmod{m}, \]
where $|a|, |b|, |c|, |d| < m/2$ since $m$ is odd. Then we have
\[ a^2 + b^2 + c^2 + d^2 \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \pmod{m}, \]
so we can write $a^2 + b^2 + c^2 + d^2 = mn$ for some integer $n$, and $a^2 + b^2 + c^2 + d^2 < m^2$, so we have $n < m$. If $n = 0$ then we would have $a = b = c = d = 0$, which would imply that $mp = x^2 + y^2 + z^2 + w^2$ is divisible by $m^2$. This cannot occur when $1 < m < p$ since $p$ is prime, so we conclude that $n > 0$. Now by the proof of Lemma 7.12 we can write
\[ (mp)(mn) = A^2 + B^2 + C^2 + D^2, \]
where
\[ A = xa + yb + zc + wd, \quad B = xb - ya + zd - wc, \quad C = xc - za + wb - yd, \quad D = xd - wa + yc - zb, \]
and it is easy to check that $A$, $B$, $C$, and $D$ are each divisible by $m$. It follows that
\[ (A/m)^2 + (B/m)^2 + (C/m)^2 + (D/m)^2 = np, \]
which gives a solution of (7.3) with $0 < n < m$. This completes the descent and shows that there is in fact a solution with $m = 1$. □
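The statement of the four squares theorem (though not the descent itself) can be confirmed for small integers by exhaustive search. The following sketch looks for a representation with $x \ge y \ge z \ge w \ge 0$; it is a brute-force illustration with a function name of our choosing, not an implementation of the proof above.

```python
from math import isqrt


def four_squares(n: int):
    """Return (x, y, z, w) with x >= y >= z >= w >= 0 and x^2+y^2+z^2+w^2 = n, or None."""
    for x in range(isqrt(n), -1, -1):
        for y in range(min(x, isqrt(n - x * x)), -1, -1):
            remaining = n - x * x - y * y
            for z in range(min(y, isqrt(remaining)), -1, -1):
                rest = remaining - z * z
                w = isqrt(rest)
                if w <= z and w * w == rest:
                    return (x, y, z, w)
    return None


print(four_squares(31), four_squares(120))
```

Searching from the largest candidate downward tends to find a representation quickly, since most integers have many.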
Waring's problem. One might ask whether similar results exist for higher powers. That is, given a positive integer $k$, can we find a positive integer $s$ such that all positive integers $n$ can be written in the form
\[ n = x_1^k + x_2^k + \cdots + x_s^k \tag{7.4} \]
for some non-negative integers $x_1, x_2, \ldots, x_s$? This question was posed by Waring in 1770 (around the same time as Lagrange's Theorem was proved) and has received considerable attention over the past century. The original version of the problem seeks to determine $g(k)$, which is defined to be the smallest integer $s$ such that the above equation can be solved for every positive integer $n$. For example, one has $g(2) = 4$. It is also known that $g(3) = 9$, $g(4) = 19$, and $g(5) = 37$ and that
\[ g(k) \ge 2^k + \left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor - 2 \tag{7.5} \]
for all $k$, where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$. Notice that the integer 23 really does require 9 cubes in order to achieve a representation. Since $23 < 3^3$ and $23 < 2^3 + 2^3 + 2^3$, the most efficient decomposition is
\[ 23 = 2^3 + 2^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3. \]
Amazingly, it turns out that 23 and 239 are the only two integers that actually require 9 cubes. In fact, there are only finitely many integers that require 8 cubes, and it follows that every sufficiently large integer can be expressed as the sum of 7 cubes. In general, we define $G(k)$ to be the smallest integer $s$ such that every sufficiently large integer $n$ can be represented in the form (7.4). For example, it is known that
\[ G(2) = 4, \quad 4 \le G(3) \le 7, \quad G(4) = 16, \quad 6 \le G(5) \le 17, \quad\text{and}\quad 9 \le G(6) \le 24. \]
It turns out that $G(k)$ grows much slower than $g(k)$ as $k \to \infty$, reflecting the fact that the representation of small integers poses some unusual difficulties that do not persist in the long run. In fact, it was shown by Wooley in 1992 that $G(k)$ grows no faster than $k \log k$ asymptotically, whereas (7.5) shows that the growth of $g(k)$ is exponential in $k$. In the 1920s, Hardy and Littlewood devised a method for counting the number of representations of $n$ in the form (7.4) by using a definite integral. Refinements of this strategy due to Vinogradov, Davenport, Vaughan, Wooley, and others have led to the sharpest available upper bounds for $G(k)$ when $k \ge 3$. Notice that even in the cubic case, the existing technology still leaves fairly large gaps between what is conjectured and what can be proved!
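Two of the numerical claims above lend themselves to a direct check: the lower bound (7.5) for small $k$, and the fact that 23 and 239 need nine cubes. The sketch below uses a simple dynamic program for the least number of positive $k$th powers summing to each integer up to a limit; the function names and the limit 500 are choices made for illustration, and the search says nothing about integers beyond that limit.

```python
def waring_lower_bound(k: int) -> int:
    """Right-hand side of (7.5): 2^k + floor((3/2)^k) - 2, in exact integer arithmetic."""
    return 2**k + (3**k // 2**k) - 2


def min_powers(limit: int, k: int) -> list:
    """best[n] = least number of positive k-th powers summing to n, for 0 <= n <= limit."""
    powers = [x**k for x in range(1, limit + 1) if x**k <= limit]
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best


print([waring_lower_bound(k) for k in (2, 3, 4, 5)])
cubes = min_powers(500, 3)
print([n for n in range(1, 501) if cubes[n] == 9])
```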
We give a very brief outline of the Hardy-Littlewood method. When $\alpha$ is a real number and $i = \sqrt{-1}$, write
\[ e(\alpha) = e^{2\pi i \alpha} = \cos(2\pi\alpha) + i \sin(2\pi\alpha). \]
If $m$ is an integer, then it is easy to verify the orthogonality relations
\[ \int_0^1 e(\alpha m)\, d\alpha = \int_0^1 \cos(2\pi\alpha m)\, d\alpha + i \int_0^1 \sin(2\pi\alpha m)\, d\alpha = \begin{cases} 1 & \text{if } m = 0, \\ 0 & \text{if } m \ne 0. \end{cases} \]
If we let $N = \lfloor n^{1/k} \rfloor$ and introduce the exponential sum
\[ f(\alpha) = \sum_{x=1}^{N} e(\alpha x^k), \]
then the fact that $e(a)e(b) = e(a + b)$ gives
\[ \int_0^1 f(\alpha)^s e(-\alpha n)\, d\alpha = \sum_{x_1=1}^{N} \cdots \sum_{x_s=1}^{N} \int_0^1 e\bigl(\alpha(x_1^k + \cdots + x_s^k - n)\bigr)\, d\alpha, \]
and the orthogonality relations show that each term in the sum is 1 or 0 according to whether or not $x_1^k + \cdots + x_s^k = n$. The integral on the left therefore counts the representations of $n$ in this form, and demonstrating the existence of representations amounts to showing that the integral is positive. This is a non-trivial task that involves dissecting the interval $[0, 1]$ into two subsets according to the nature of the rational approximations to $\alpha$ and applying several types of estimates for the exponential sum $f(\alpha)$. Notice that as the real variable $\alpha$ runs from 0 to 1, the complex variable $z = e^{2\pi i \alpha}$ traces out the unit circle $|z| = 1$. The original set-up devised by Hardy and Littlewood actually takes the latter perspective, using integrals over circles in the complex plane. For this reason, the technique is often referred to as the circle method, and the two subsets mentioned above are called major and minor arcs.
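The counting identity can be illustrated numerically with a discrete stand-in for the integral. The sketch below replaces $\int_0^1 \cdots\, d\alpha$ by an average over the $M$ points $\alpha = a/M$, relying on the fact that $\frac{1}{M}\sum_{a=0}^{M-1} e(am/M)$ equals 1 when $M \mid m$ and 0 otherwise; taking $M$ larger than every possible value of $|x_1^k + \cdots + x_s^k - n|$ makes this exact. That discretization, the choice of $M$, and all function names are assumptions of this illustration and not part of the Hardy-Littlewood method itself. Like the sum above, it counts ordered representations by positive integers.

```python
import cmath
from itertools import product


def e(alpha: float) -> complex:
    """e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def largest_base(n: int, k: int) -> int:
    """N = floor(n^(1/k)), computed with integers only (n >= 1)."""
    N = 1
    while (N + 1) ** k <= n:
        N += 1
    return N


def count_brute_force(n: int, k: int, s: int) -> int:
    """Number of ordered s-tuples of positive integers with x_1^k + ... + x_s^k = n."""
    N = largest_base(n, k)
    return sum(1 for xs in product(range(1, N + 1), repeat=s)
               if sum(x**k for x in xs) == n)


def count_by_orthogonality(n: int, k: int, s: int) -> int:
    """Same count, obtained by averaging f(a/M)^s * e(-a*n/M) over a = 0, ..., M-1."""
    N = largest_base(n, k)
    M = max(n, s * N**k) + 1  # exceeds |x_1^k + ... + x_s^k - n| for all tuples
    total = 0
    for a in range(M):
        f = sum(e(((a * x**k) % M) / M) for x in range(1, N + 1))
        total += f**s * e(((-a * n) % M) / M)
    return round((total / M).real)


print(count_by_orthogonality(50, 2, 2), count_brute_force(50, 2, 2))
```

In the real method the difficulty lies in estimating the integral for large $n$, which no finite computation of this kind addresses.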
8. Irrationality and transcendence
We have already seen in §2 that irrational numbers exist; for instance, $\sqrt{2} \notin \mathbb{Q}$. In fact, almost all real numbers are irrational, since the rationals form a countable set while the reals are uncountable. On the other hand, given any two real numbers $\alpha < \beta$, we can find a rational number lying between them. To see this, let $n$ be an integer with $n > 1/(\beta - \alpha)$, so that $n\beta - n\alpha > 1$. Clearly there must be an integer $m$ between $n\alpha$ and $n\beta$, and it follows that the rational number $m/n$ lies between $\alpha$ and $\beta$. In particular, by choosing $\beta$ sufficiently close to $\alpha$, we can find a rational number that approximates $\alpha$ to any desired degree of accuracy. This property is often expressed by saying that the rationals are dense in the reals. In number theory, we often desire more quantitative information about rational approximations. For instance, how does the quality of the approximation improve as we allow the denominator to increase? This is the type of information that determines how we dissect into major and minor arcs in the Hardy-Littlewood method. One simple answer is given by the following theorem.
Theorem 8.1. (Dirichlet's theorem on diophantine approximation) Given a real number $\alpha$ and an integer $N \ge 2$, there exist integers $a$ and $q$ with $(a, q) = 1$ and $1 \le q \le N - 1$ such that
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{qN}. \]
Proof. It suffices to prove the result for $\alpha \in [0, 1]$, since the general case can then be obtained by replacing $a/q$ by $\lfloor \alpha \rfloor + a/q$. We divide the interval $[0, 1]$ into $N$ subintervals, each of length $1/N$, and consider the values of $q\alpha - \lfloor q\alpha \rfloor$ as $q$ runs over the integers $1, 2, 3, \ldots, N - 1$. First of all, if $q\alpha - \lfloor q\alpha \rfloor$ lies in the interval $[0, 1/N]$ for some $q$, then taking $a = \lfloor q\alpha \rfloor$ gives $|q\alpha - a| \le 1/N$. Similarly, if $q\alpha - \lfloor q\alpha \rfloor$ lies in the interval $[1 - 1/N, 1]$ for some $q$, then taking $a = \lfloor q\alpha \rfloor + 1$ gives $|q\alpha - a| \le 1/N$. If none of these $N - 1$ values lies in the first or last subinterval, then the pigeonhole principle ensures that two of them must lie in one of the remaining $N - 2$ subintervals. That is, we have
\[ \bigl| (q_2\alpha - \lfloor q_2\alpha \rfloor) - (q_1\alpha - \lfloor q_1\alpha \rfloor) \bigr| \le 1/N \]
for some integers $q_1$ and $q_2$ with $1 \le q_1 < q_2 \le N - 1$. Taking $q = q_2 - q_1$ and $a = \lfloor q_2\alpha \rfloor - \lfloor q_1\alpha \rfloor$ again gives $|q\alpha - a| \le 1/N$. Finally, if $(a, q) = d$ then setting $a' = a/d$ and $q' = q/d$ gives $(q', a') = 1$ and $|q'\alpha - a'| \le 1/(dN) \le 1/N$, which completes the proof. □
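For a concrete $\alpha$, an approximation of the quality promised by Theorem 8.1 can simply be searched for. The sketch below tries every denominator $q < N$, keeps the one for which $q\alpha$ is closest to an integer, and reduces the fraction; it uses floating-point arithmetic, so it is only a heuristic illustration for moderate $N$, and the function name is ours.

```python
from math import gcd, sqrt


def dirichlet_approximation(alpha: float, N: int):
    """Search 1 <= q < N for the q making q*alpha closest to an integer; return reduced (a, q)."""
    best = None
    for q in range(1, N):
        a = round(q * alpha)
        error = abs(q * alpha - a)
        if best is None or error < best[0]:
            best = (error, a, q)
    _, a, q = best
    d = gcd(a, q)
    return a // d, q // d


for N in (2, 10, 100, 1000):
    a, q = dirichlet_approximation(sqrt(2), N)
    print(N, a, q, abs(sqrt(2) - a / q) <= 1 / (q * N))
```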
Corollary 8.2. If $\alpha$ is an irrational number, then there are infinitely many rational numbers $a/q$ for which
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{q^2}. \]
Proof. If there were only finitely many such rational approximations to $\alpha$, then we could find one, say $a/q$, with $\delta = |\alpha - a/q|$ minimal. Since $\alpha \notin \mathbb{Q}$, we have $\delta > 0$, so we may let $N = \lfloor 1/\delta \rfloor + 1 > 1/\delta$. By Theorem 8.1, we can find a rational number $b/r$ with $1 \le r < N$ and
\[ \left| \alpha - \frac{b}{r} \right| \le \frac{1}{rN} < \frac{1}{r^2}. \]
Since $1/(rN) < \delta$, this contradicts the minimality of $\delta$. □
Note that if $\alpha$ is rational then the inequality in Corollary 8.2 has only finitely many solutions. To see this, write $\alpha = a/q$ and note that if $b/r \ne a/q$ then we have
\[ \left| \frac{a}{q} - \frac{b}{r} \right| = \frac{|ar - bq|}{qr} \ge \frac{1}{qr} \ge \frac{1}{r^2} \]
whenever $r \ge q$, so the only possible solutions come from $1 \le r < q$. A theorem of Hurwitz shows that there are in fact infinitely many solutions of
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{\sqrt{5}\, q^2} \]
when $\alpha$ is irrational. This turns out to be best possible in the sense that the result fails if the constant $1/\sqrt{5}$ is replaced by anything smaller. However, the golden ratio $\alpha = \frac{1}{2}(1 + \sqrt{5})$ provides essentially the only counterexample!
Continued fractions. One way of generating good rational approximations to an irrational number $\alpha$ is to construct the continued fraction expansion
\[ \alpha = x_0 + \cfrac{1}{x_1 + \cfrac{1}{x_2 + \cfrac{1}{x_3 + \cfrac{1}{x_4 + \cdots}}}}. \]
To save space, this is sometimes denoted by $\alpha = [x_0; x_1, x_2, x_3, x_4, \ldots]$. We can construct continued fractions for rational numbers as well, but in this case the expansion is finite.
Example 8.3. Express the rational number $\frac{125}{54}$ as a finite continued fraction.
Solution. We first split off the integer part by writing
\[ \frac{125}{54} = 2 + \frac{17}{54}. \]
Next we take the reciprocal of the fractional part and repeat the process. We have
\[ \frac{54}{17} = 3 + \frac{3}{17}, \qquad \frac{17}{3} = 5 + \frac{2}{3}, \qquad\text{and}\qquad \frac{3}{2} = 1 + \frac{1}{2}. \]
Thus we have
\[ \frac{125}{54} = 2 + \cfrac{1}{3 + \cfrac{1}{5 + \cfrac{1}{1 + \cfrac{1}{2}}}} = [2; 3, 5, 1, 2]. \]
□
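For a rational number, the procedure of Example 8.3 is the Euclidean algorithm in disguise: each partial quotient is the quotient of a division with remainder. A minimal sketch, with function names chosen for this illustration, that computes the expansion and then rebuilds the fraction as a check:

```python
from fractions import Fraction


def continued_fraction(numerator: int, denominator: int) -> list:
    """Partial quotients [x0, x1, ...] of numerator/denominator (denominator > 0)."""
    quotients = []
    while denominator != 0:
        quotient, remainder = divmod(numerator, denominator)
        quotients.append(quotient)
        numerator, denominator = denominator, remainder
    return quotients


def evaluate(quotients: list) -> Fraction:
    """Collapse [x0; x1, ..., xn] back into a single fraction, working from the inside out."""
    value = Fraction(quotients[-1])
    for x in reversed(quotients[:-1]):
        value = x + 1 / value
    return value


expansion = continued_fraction(125, 54)
print(expansion, evaluate(expansion))
```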
Example 8.4. What real number is represented by the continued fraction $[1; 1, 1, 1, 1, \ldots]$?
Solution. If $\alpha = [1; 1, 1, 1, 1, \ldots]$ then we have
\[ \alpha = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}} = 1 + \frac{1}{\alpha}. \]
It follows that $\alpha^2 - \alpha - 1 = 0$, and since $\alpha$ is clearly positive we may conclude that
\[ \alpha = \frac{1 + \sqrt{5}}{2}. \]
□
To generate the continued fraction for $\alpha$, we first take $x_0 = \lfloor \alpha \rfloor$ and then write
\[ \alpha_1 = \frac{1}{\alpha - x_0} \quad\text{and}\quad x_1 = \lfloor \alpha_1 \rfloor. \]
In general, if $\alpha_n$ and $x_n$ have been defined, we take
\[ \alpha_{n+1} = \frac{1}{\alpha_n - x_n} \quad\text{and}\quad x_{n+1} = \lfloor \alpha_{n+1} \rfloor. \]
Example 8.5. Compute the continued fraction for $\sqrt{2}$.
Solution. First of all, we have $x_0 = \lfloor \sqrt{2} \rfloor = 1$. Next, we have
\[ \alpha_1 = \frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1, \]
and hence $x_1 = 2$. Furthermore,
\[ \alpha_2 = \frac{1}{\alpha_1 - 2} = \frac{1}{\sqrt{2} - 1} = \alpha_1, \]
and hence $x_2 = 2$. Since $\alpha_{n+1}$ depends only on $\alpha_n$ and $x_n$, we can conclude that $x_n = 2$ for all $n \ge 1$. Therefore, $\sqrt{2} = [1; 2, 2, 2, 2, \ldots] = [1; \overline{2}]$. □
By truncating the continued fraction obtained above, we can obtain rational approximations to $\sqrt{2}$, for instance
\[ \frac{p_1}{q_1} = 1 + \frac{1}{2} = \frac{3}{2}, \qquad \frac{p_2}{q_2} = 1 + \cfrac{1}{2 + \cfrac{1}{2}} = \frac{7}{5}, \qquad\text{and}\qquad \frac{p_3}{q_3} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}} = \frac{17}{12}. \]
The rational number $p_n/q_n$ is called the $n$th convergent to $\alpha$, and the integer $x_n$ is called the $n$th partial quotient of $\alpha$. It turns out that the convergents satisfy some simple recurrence relations, which make them easy to compute once the partial quotients are known.
Theorem 8.6. If $\alpha$ has the continued fraction expansion $[x_0; x_1, x_2, x_3, \ldots]$, then the $n$th convergent to $\alpha$ is the rational number $p_n/q_n$ defined by the recurrence relations
\[ p_n = x_n p_{n-1} + p_{n-2} \quad\text{and}\quad q_n = x_n q_{n-1} + q_{n-2} \qquad (n \ge 0), \]
where we take $p_{-1} = 1$, $q_{-1} = 0$, $p_{-2} = 0$, and $q_{-2} = 1$.
Proof. We regard the convergents $p_n/q_n$ as functions of the partial quotients. That is, $p_n = p_n(x_0, x_1, \ldots, x_n)$ and $q_n = q_n(x_0, x_1, \ldots, x_n)$. The result is clear for $n = 0$, since the recursions give $p_0 = x_0$ and $q_0 = 1$. Now suppose that $[x_0; x_1, \ldots, x_{n-1}] = p_{n-1}/q_{n-1}$. Then we can write
\[ [x_0; x_1, \ldots, x_{n-1}, x_n] = \left[x_0; x_1, \ldots, x_{n-1} + \tfrac{1}{x_n}\right] = \frac{p_{n-1}\left(x_0, x_1, \ldots, x_{n-1} + \frac{1}{x_n}\right)}{q_{n-1}\left(x_0, x_1, \ldots, x_{n-1} + \frac{1}{x_n}\right)}. \]
Applying the above recurrence relations, we obtain
\[ [x_0; x_1, \ldots, x_{n-1}, x_n] = \frac{\left(x_{n-1} + \frac{1}{x_n}\right) p_{n-2} + p_{n-3}}{\left(x_{n-1} + \frac{1}{x_n}\right) q_{n-2} + q_{n-3}} = \frac{x_n x_{n-1} p_{n-2} + p_{n-2} + x_n p_{n-3}}{x_n x_{n-1} q_{n-2} + q_{n-2} + x_n q_{n-3}} \]
\[ = \frac{x_n (x_{n-1} p_{n-2} + p_{n-3}) + p_{n-2}}{x_n (x_{n-1} q_{n-2} + q_{n-3}) + q_{n-2}} = \frac{x_n p_{n-1} + p_{n-2}}{x_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}. \]
The result follows by induction. □
Example 8.7. Find the continued fraction expansion for $\sqrt{29}$, and compute the first 6 convergents.
Solution. We have $x_0 = \lfloor \sqrt{29} \rfloor = 5$, and thus
\[ \alpha_1 = \frac{1}{\sqrt{29} - 5} = \frac{\sqrt{29} + 5}{4} = 2 + \frac{\sqrt{29} - 3}{4}. \]
It follows that $x_1 = 2$ and
\[ \alpha_2 = \frac{4}{\sqrt{29} - 3} = \frac{\sqrt{29} + 3}{5} = 1 + \frac{\sqrt{29} - 2}{5}. \]
This in turn gives $x_2 = 1$ and
\[ \alpha_3 = \frac{5}{\sqrt{29} - 2} = \frac{\sqrt{29} + 2}{5} = 1 + \frac{\sqrt{29} - 3}{5}, \]
which yields $x_3 = 1$ and
\[ \alpha_4 = \frac{5}{\sqrt{29} - 3} = \frac{\sqrt{29} + 3}{4} = 2 + \frac{\sqrt{29} - 5}{4}. \]
Now we have $x_4 = 2$ and
\[ \alpha_5 = \frac{4}{\sqrt{29} - 5} = \sqrt{29} + 5 = 10 + (\sqrt{29} - 5), \]
and from this we see that $x_5 = 10$ and $\alpha_6 = \alpha_1$, which means that the continued fraction becomes periodic. We therefore conclude that $\sqrt{29} = [5; \overline{2, 1, 1, 2, 10}]$, and we can use Theorem 8.6 to compute the convergents. We have $p_0 = 5$, $p_1 = 2 \cdot 5 + 1 = 11$, $p_2 = 1 \cdot 11 + 5 = 16$, $p_3 = 1 \cdot 16 + 11 = 27$, $p_4 = 2 \cdot 27 + 16 = 70$, and $p_5 = 10 \cdot 70 + 27 = 727$. Similarly, we get $q_0 = 1$, $q_1 = 2$, $q_2 = 1 \cdot 2 + 1 = 3$, $q_3 = 1 \cdot 3 + 2 = 5$, $q_4 = 2 \cdot 5 + 3 = 13$, and $q_5 = 10 \cdot 13 + 5 = 135$. Hence the first 6 convergents are
\[ 5, \quad \frac{11}{2}, \quad \frac{16}{3}, \quad \frac{27}{5}, \quad \frac{70}{13}, \quad\text{and}\quad \frac{727}{135}. \]
□
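Both halves of Example 8.7 can be automated. The sketch below computes the periodic part of $\sqrt{n}$ with integer arithmetic only, by tracking each complete quotient in the form $\alpha_j = (\sqrt{n} + m)/d$ (for $n = 29$ the first one is $(\sqrt{29} + 5)/4$, as in the example), and it stops when a partial quotient equals $2\lfloor\sqrt{n}\rfloor$, which is the standard signal that the period is complete. It then applies the recurrences of Theorem 8.6. The function names are ours, and the integer recurrence is a well-known substitute for the by-hand manipulations, not something taken from the notes.

```python
from math import isqrt


def sqrt_continued_fraction(n: int):
    """Return (x0, period) with sqrt(n) = [x0; period repeated], for non-square n."""
    x0 = isqrt(n)
    if x0 * x0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, x = 0, 1, x0  # current complete quotient is (sqrt(n) + m) / d
    period = []
    while x != 2 * x0:
        m = d * x - m
        d = (n - m * m) // d
        x = (x0 + m) // d
        period.append(x)
    return x0, period


def convergents(quotients: list) -> list:
    """Pairs (p_n, q_n) from p_n = x_n p_{n-1} + p_{n-2}, q_n = x_n q_{n-1} + q_{n-2}."""
    p_prev2, q_prev2 = 0, 1  # p_{-2}, q_{-2}
    p_prev1, q_prev1 = 1, 0  # p_{-1}, q_{-1}
    result = []
    for x in quotients:
        p = x * p_prev1 + p_prev2
        q = x * q_prev1 + q_prev2
        result.append((p, q))
        p_prev2, q_prev2, p_prev1, q_prev1 = p_prev1, q_prev1, p, q
    return result


x0, period = sqrt_continued_fraction(29)
print(x0, period)
print(convergents([x0] + period))
```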
Algebraic and transcendental numbers. A real number that is a root of a non-trivial
polynomial with integer coeο¬cients is said to be algebraic. More precisely, if πΌ is a root of a
polynomial of degree π with integer coeο¬cients that is irreducible over β, then we say that πΌ
is algebraic of degree π. Note that any rational number π/π is algebraic of degree one, since
MA 311
NUMBER THEORY
FALL 2008
49
β
it is a root of the polynomial π (π₯) = ππ₯ β π. Any real number of the form π ± π π, where
π, π, and
β π are rational and π is not a perfect square, is algebraic 2of degree two. For instance,
1
(1+
5) is algebraic of degree two, since it is a root of π (π₯) = π₯ βπ₯β1. Algebraic numbers
2
of degree two are sometimes called quadratic irrationals. It turns out that a number is a
quadratic irrational if and only if it has an eventually periodic continued fraction expansion.
The set of algebraic numbers is closed under addition
and
β
β multiplication, but the set of
algebraic
numbers
of
degree
π
is
not.
For
instance,
2
and
3 are algebraic of degree 2, but
β
β
β β
2 + 3 is algebraic of degree 4 and 2 β
2 = 2 is algebraic of degree 1.
Example 8.8. Prove that $\alpha = \sqrt{2} + \sqrt{3}$ is algebraic.
Solution. First of all, we have $\alpha^2 = 2 + 2\sqrt{6} + 3$, and hence $\alpha^2 - 5 = 2\sqrt{6}$. Squaring both sides gives $\alpha^4 - 10\alpha^2 + 25 = 24$, or $\alpha^4 - 10\alpha^2 + 1 = 0$. Thus $\alpha$ is a root of the polynomial $f(x) = x^4 - 10x^2 + 1$ and hence is algebraic of degree at most 4. One can in fact show that $f$ is irreducible over $\mathbb{Q}$ and hence that $\alpha$ is algebraic of degree 4.
□
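The identity $\alpha^4 - 10\alpha^2 + 1 = 0$ can also be confirmed exactly, without floating point, by computing in the basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$. The sketch below stores $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ as the tuple `(a, b, c, d)`. This representation and the helper `mul` are assumptions made for illustration only.

```python
def mul(u, v):
    """Multiply two elements a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6), stored as (a, b, c, d)."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,  # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),        # sqrt(2): sqrt(3)*sqrt(6) = 3*sqrt(2)
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),        # sqrt(3): sqrt(2)*sqrt(6) = 2*sqrt(3)
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,              # sqrt(6): sqrt(2)*sqrt(3) = sqrt(6)
    )

alpha = (0, 1, 1, 0)          # sqrt(2) + sqrt(3)
alpha2 = mul(alpha, alpha)    # 5 + 2*sqrt(6)
alpha4 = mul(alpha2, alpha2)  # 49 + 20*sqrt(6)

# f(alpha) = alpha^4 - 10*alpha^2 + 1, computed coordinate by coordinate
one = (1, 0, 0, 0)
f_alpha = tuple(x4 - 10 * x2 + x0 for x4, x2, x0 in zip(alpha4, alpha2, one))
print(alpha2, alpha4, f_alpha)
```

This only verifies that $\alpha$ is a root of $f$. It says nothing about irreducibility, which is what pins the degree at exactly 4.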
Real numbers that are not algebraic are called transcendental. Probably the two most famous transcendental numbers are $\pi$ and $e$. Proving the transcendence of $\pi$ and $e$ is beyond the scope of the course; however, it is not too difficult to show that $e$ is irrational.
Theorem 8.9. The number $e$ is irrational.
Proof. Suppose to the contrary that $e$ is rational, say $e = a/b$, where $a$ and $b$ are integers with $b \ge 1$. We recall that $e$ can be expressed as the infinite series
\[ e = \sum_{k=0}^{\infty} \frac{1}{k!}. \]
Let $n \ge 2b$ be an integer, and let $s_n$ denote the $n$th partial sum of this series; that is,
\[ s_n = \sum_{k=0}^{n} \frac{1}{k!} = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots + \frac{1}{n!}. \]
Clearly $s_n$ is rational, and we can write $s_n = c/n!$ for some integer $c$. Moreover, we have $e > s_n$, and thus
\[ e - s_n = \frac{a}{b} - \frac{c}{n!} = \frac{a\,n! - bc}{b\,n!} \ge \frac{1}{b\,n!}. \]
On the other hand, we have
\[ e - s_n = \sum_{k=n+1}^{\infty} \frac{1}{k!} = \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \frac{1}{(n+3)!} + \cdots < \frac{1}{(n+1)!} \left( 1 + \frac{1}{n} + \frac{1}{n^2} + \cdots \right) = \frac{1}{(n+1)!} \cdot \frac{1}{1 - 1/n} \le \frac{2}{(n+1)!}, \]
since $n \ge 2$. Combining our two inequalities, we obtain
\[ \frac{1}{b\,n!} \le e - s_n \le \frac{2}{(n+1)!}, \]
which implies that $n \le 2b - 1$, a contradiction.
□
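The two facts used in the proof, namely that $n!\,s_n$ is an integer and that the tail of the series is smaller than $2/(n+1)!$, are easy to observe with exact rational arithmetic. In the sketch below the helper names are illustrative. Since $e$ itself cannot be represented exactly, the tail $e - s_n$ is replaced by a finite block of 40 further terms, which is only a lower estimate.

```python
from fractions import Fraction
from math import factorial

def partial_sum(n):
    """Exact value of s_n = sum_{k=0}^{n} 1/k!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def tail_lower_estimate(n, extra=40):
    """Sum of the `extra` terms following s_n: a lower estimate for e - s_n."""
    return partial_sum(n + extra) - partial_sum(n)

for n in (2, 5, 10):
    s_n = partial_sum(n)
    c = s_n * factorial(n)                 # s_n = c / n! with c an integer
    gap = tail_lower_estimate(n)
    upper = Fraction(2, factorial(n + 1))  # the bound 2/(n+1)! from the proof
    print(n, c, float(gap), float(upper), gap < upper)
```

For a hypothetical denominator $b$, the lower bound $1/(b\,n!)$ exceeds $2/(n+1)!$ as soon as $n + 1 > 2b$, which is exactly the contradiction reached above.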
The idea of the preceding proof may be summarized by saying that $e$ has rational approximations (namely $s_n$) that are "too good" to allow $e$ to be rational, since two distinct rationals differ by at least the reciprocal of the product of the denominators. The following theorem may be viewed as a generalization of this idea. It states that algebraic numbers cannot have fantastically good rational approximations.
Theorem 8.10. (Liouville's Theorem) Suppose that $\alpha$ is an algebraic number of degree $n \ge 2$. Then there exists a positive constant $c_\alpha$ such that
\[ \left| \alpha - \frac{p}{q} \right| > \frac{c_\alpha}{q^n} \]
for all integers $p$ and $q$ with $q \ge 1$.
Proof. Suppose that $\alpha$ is a root of the irreducible polynomial
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \]
where $n \ge 2$, and let $p$ and $q$ be integers with $q \ge 1$. First of all, we note that $f(p/q) \neq 0$, since $f$ is irreducible of degree at least two. Furthermore, it is clear that $q^n f(p/q)$ is an integer and hence that $q^n |f(p/q)| \ge 1$. Since $\alpha$ is a root of $f$, we may write $f(x) = (x - \alpha) g(x)$, where $g$ is a polynomial of degree $n - 1$, not necessarily with integer coefficients. Since $g$ is a continuous function, we know that it attains maximum and minimum values on any closed, bounded interval. Therefore, there exists $M_\alpha > 0$ such that $|g(x)| \le M_\alpha$ for all $x \in [\alpha - 1, \alpha + 1]$. We set $c_\alpha = (1 + M_\alpha)^{-1}$ and consider two cases. If $|\alpha - p/q| \le 1$, then we have
\[ q^{-n} \le |f(p/q)| = |\alpha - p/q| \, |g(p/q)| \le |\alpha - p/q| \, M_\alpha < |\alpha - p/q| \, c_\alpha^{-1}, \]
which gives $|\alpha - p/q| > c_\alpha q^{-n}$, as required. If $|\alpha - p/q| > 1$, then the desired inequality follows from the observation that $c_\alpha \le 1$.
□
Example 8.11. Find an admissible value for $c_\alpha$ in Liouville's Theorem when $\alpha = \sqrt[3]{2}$.
Solution. In the notation of the above proof, we have
\[ f(x) = x^3 - 2 = (x - \alpha)(x^2 + \alpha x + \alpha^2) = (x - \alpha) g(x). \]
Since $g'(x) = 2x + \alpha$, we find that $g$ is increasing on the interval $[\alpha - 1, \alpha + 1]$ and hence that $g(\alpha - 1) \le g(x) \le g(\alpha + 1)$ for all $x$ in the interval $[\alpha - 1, \alpha + 1]$. Since $g(\alpha - 1) > 0$ and $g(\alpha + 1) = 3\alpha^2 + 3\alpha + 1 < 9.542$, we can take $M_\alpha = 9.542$ and thus any $c_\alpha < (10.542)^{-1}$ is admissible. For example, one has
\[ \left| \sqrt[3]{2} - \frac{p}{q} \right| > \frac{1}{11 q^3} \]
for all integers $p$ and all positive integers $q$.
□
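As a sanity check on the constant, one can scan small denominators numerically. The sketch below is a floating-point experiment, not a proof. It computes the smallest value of $q^3 |\sqrt[3]{2} - p/q|$ over $1 \le q \le 500$, taking for each $q$ the nearest numerator $p$. The function name and the cutoff 500 are arbitrary choices for illustration.

```python
def min_scaled_gap(max_q):
    """Smallest value of q^3 * |2^(1/3) - p/q| over 1 <= q <= max_q (floating point)."""
    alpha = 2 ** (1 / 3)
    best = float("inf")
    for q in range(1, max_q + 1):
        p = round(alpha * q)  # nearest numerator for this denominator
        best = min(best, q ** 3 * abs(alpha - p / q))
    return best

# The theorem guarantees that the first number exceeds the second.
print(min_scaled_gap(500), 1 / 11)
```

In this range the minimum stays well above $1/11$, which suggests that the constant produced by the proof is far from sharp.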
One might hope that the proof of Theorem 8.9 could be modified to show that $e$ is transcendental using the contrapositive of Liouville's Theorem. However, the quality of the rational approximations $s_n$ is not sufficient to make this argument work. We note that $s_n$ has denominator $q = n!$, but $2/(n+1)! > 1/(n!)^2 = 1/q^2$, so the inequality $|e - s_n| < 2/(n+1)!$ doesn't even rule out the possibility that $e$ is a quadratic irrational! Therefore a more sophisticated argument is required to prove that $e$ is transcendental. However, we can establish the existence of transcendental numbers by working with a series that converges much faster.
Theorem 8.12. The number
\[ \alpha = \sum_{k=1}^{\infty} 10^{-k!} = 0.11000100000000000000000100000000\ldots \]
is transcendental.
Proof. We write
\[ \frac{p_n}{q_n} = \sum_{k=1}^{n} 10^{-k!}, \quad \text{where } q_n = 10^{n!}. \]
We then have
\[ \left| \alpha - \frac{p_n}{q_n} \right| = \sum_{k=n+1}^{\infty} 10^{-k!} = 10^{-(n+1)!} + 10^{-(n+2)!} + 10^{-(n+3)!} + \cdots < 10^{-(n+1)!} \left( 1 + 10^{-1} + 10^{-2} + \cdots \right) = \frac{10}{9} \cdot 10^{-(n+1)!} = \frac{10}{9} \, q_n^{-(n+1)}. \]
If $\alpha$ is algebraic of degree $d \ge 2$, then Liouville's Theorem implies that there is a constant $c > 0$ such that $|\alpha - p_n/q_n| > c\,q_n^{-d}$ for all $n$. This statement holds for $d = 1$ as well, since $\alpha \neq p_n/q_n$ and hence $\alpha = a/b$ implies $|\alpha - p_n/q_n| \ge (b q_n)^{-1}$, whence we can take $c = 1/(b+1)$. Thus if $\alpha$ is algebraic of degree $d$ we have
\[ c\,q_n^{-d} < \left| \alpha - \frac{p_n}{q_n} \right| < \frac{10}{9} \, q_n^{-(n+1)}, \]
and thus $q_n^{n+1-d} < 10/(9c)$. But $q_n \to \infty$ as $n \to \infty$, so we obtain a contradiction by taking $n$ sufficiently large in terms of $c$ and $d$.
□
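The speed of convergence that drives this proof can be made concrete with exact fractions. The sketch below fixes a hypothetical degree $d = 3$ purely for illustration and tabulates the upper bound $\frac{10}{9} q_n^{d-n-1}$ for $q_n^d \, |\alpha - p_n/q_n|$. It also checks the tail estimate against the first two omitted terms, which give only a lower estimate of the true tail. The function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def partial(n):
    """Exact partial sum p_n / q_n = sum_{k=1}^{n} 10^(-k!)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))

def scaled_bound(n, d):
    """Upper bound (10/9) * q_n^(d - n - 1) for q_n^d * |alpha - p_n/q_n|."""
    q_n = 10 ** factorial(n)
    return Fraction(10, 9) * Fraction(q_n) ** (d - n - 1)

d = 3  # hypothetical degree, chosen only for the demonstration
for n in range(1, 5):
    q_n = 10 ** factorial(n)
    two_terms = partial(n + 2) - partial(n)  # first two tail terms only
    assert two_terms < Fraction(10, 9) / q_n ** (n + 1)
    print(n, float(scaled_bound(n, d)))
```

Once $n \ge d$ the bound collapses toward zero, so no positive constant $c$ can satisfy Liouville's inequality for every $n$.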
Some open questions. A real number $\alpha$ is said to be badly approximable if there is a positive constant $c_\alpha$ such that $|\alpha - p/q| > c_\alpha q^{-2}$ for all integers $p$ and $q$ with $q \ge 1$. Liouville's Theorem shows that all algebraic numbers of degree two (i.e., all quadratic irrationals) are badly approximable. It is conjectured that no algebraic numbers of degree greater than two are badly approximable, but this has not been proven. It turns out that a number is badly approximable if and only if the partial quotients in its continued fraction expansion are bounded. For instance, $e$ is not badly approximable, for it can be shown that
\[ e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, 1, 1, 12, \ldots]. \]
By contrast, it is unknown whether $\pi$ is badly approximable (the conjecture is that it's not). We do know that most real numbers are not badly approximable, in the sense that the badly approximable numbers have measure zero in the real line. However, the set of badly approximable numbers is uncountable (like the reals), whereas the set of algebraic numbers is countable (like the integers and the rationals). Therefore, there are uncountably many badly approximable transcendental numbers, but producing a single specific example seems to be non-trivial.
On a more basic level, much is still unknown about the irrationality and transcendence of familiar numbers. For instance, it is not known whether the numbers $\pi \pm e$, $\pi/e$, $\pi^e$, $\pi^{\sqrt{2}}$, $2^e$, and $\log \pi$ are irrational. As another example, consider the Riemann zeta function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]
for $s > 1$. When $s$ is an even integer, it is known that $\zeta(s)$ is a rational multiple of $\pi^s$ (and hence transcendental); for example, $\zeta(2) = \pi^2/6$. Much less is known when $s$ is odd. It was proved by Apéry in 1979 that $\zeta(3)$ is irrational, but it is unknown whether $\zeta(3)$ is transcendental. It is unknown whether $\zeta(5)$ is irrational, although it has been shown that at least one of $\zeta(5)$, $\zeta(7)$, $\zeta(9)$, and $\zeta(11)$ must be irrational (Zudilin, 2001). In fact, it is known that there are infinitely many odd integers $s$ for which $\zeta(s)$ is irrational (Rivoal, 2000), but the irrationality is not known for any particular odd $s > 3$.
9. The distribution of primes
Suppose that $p_n$ denotes the $n$th prime, so that $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, and so on. One of the central problems in analytic number theory is to obtain precise information about the behavior of $p_n$ as $n \to \infty$. A simple way to get an idea of how the sequence $(p_n)$ is distributed is to look at the sum of the reciprocals of the terms:
\[ \sum_{n=1}^{\infty} \frac{1}{p_n} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \cdots. \qquad (9.1) \]
As a starting point, we recall from calculus that the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$ diverges. The following lemma provides quantitative information about the growth rate of the partial sums. Specifically, it says that the sum of the first $N$ terms is roughly $\log N$.
Lemma 9.1. For all positive integers $N$, one has
\[ 0 < 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{N} - \log N \le 1. \]
Proof. By using a left-hand Riemann sum to over-estimate the area under the graph of $y = 1/t$, we find that
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} > \int_1^{N+1} \frac{dt}{t} = \log(N+1) > \log N. \]
By considering a right-hand Riemann sum we similarly obtain
\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} \le 1 + \int_1^{N} \frac{dt}{t} = 1 + \log N, \]
and the result follows by subtracting $\log N$ from each side of the above inequalities.
□
It turns out that the quantity considered above actually approaches a limit as $N \to \infty$, known as Euler's constant:
\[ \gamma = \lim_{N \to \infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} - \log N \right) = 0.57721566490153286060651209008240243\ldots \]
Although Euler's constant is known to over 1,000,000 decimal places, it is still unknown whether it is irrational. It is conjectured to be transcendental.
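Both Lemma 9.1 and the slow approach to Euler's constant are easy to observe numerically. The following sketch uses ordinary floating point, so the last few digits are not reliable. The function name is illustrative.

```python
from math import log

def harmonic_minus_log(N):
    """Return 1 + 1/2 + ... + 1/N - log N in floating point."""
    return sum(1.0 / k for k in range(1, N + 1)) - log(N)

# Each value lies in (0, 1], as Lemma 9.1 predicts, and the values decrease toward gamma.
for N in (1, 10, 100, 1000, 10 ** 5):
    print(N, harmonic_minus_log(N))
```

The printed values decrease toward 0.5772..., with an error of roughly $1/(2N)$, which is why naive summation is a poor way to compute many digits of the constant.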
We now return to the prime harmonic series (9.1). It turns out that the $N$th partial sum of this series is on the order of $\log \log N$ rather than $\log N$. In what follows, it is useful to call an integer square-free if it is not divisible by the square of any prime. In other words, $n$ is square-free if we can write $n = p_1 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct primes. For instance, the integer $42 = 2 \cdot 3 \cdot 7$ is square-free but $45 = 3^2 \cdot 5$ is not. From now on, the letter $p$ is reserved to denote a prime unless otherwise indicated.
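Theorem 9.2. For every integer $N > 1$, one has
\[ \sum_{p \le N} \frac{1}{p} > \log \log N - \log 2. \]
Proof. Every positive integer $n$ can be written uniquely in the form $n = km^2$, where $k$ and $m$ are positive integers with $k$ square-free. Using Lemma 9.1, we obtain
\[ \log N < \sum_{n \le N} \frac{1}{n} = \sum_{\substack{k \le N \\ k \text{ squarefree}}} \; \sum_{m \le \sqrt{N/k}} \frac{1}{km^2} \le \sum_{\substack{k \le N \\ k \text{ squarefree}}} \frac{1}{k} \sum_{m=1}^{\infty} \frac{1}{m^2}. \]
Furthermore, we have
\[ \sum_{m=1}^{\infty} \frac{1}{m^2} < 1 + \int_1^{\infty} \frac{dt}{t^2} = 2, \]
and the inequality $1 + u \le e^u$ yields
\[ \sum_{\substack{k \le N \\ k \text{ squarefree}}} \frac{1}{k} \le \prod_{p \le N} \left( 1 + \frac{1}{p} \right) \le \prod_{p \le N} e^{1/p} = \exp\left( \sum_{p \le N} \frac{1}{p} \right). \]
We therefore deduce that
\[ \log N < 2 \exp\left( \sum_{p \le N} \frac{1}{p} \right), \]
and taking logarithms gives
\[ \log \log N < \log 2 + \sum_{p \le N} \frac{1}{p}, \]
as required.
□
Corollary 9.3. The prime harmonic series $\sum_{n=1}^{\infty} \frac{1}{p_n}$ diverges.
Proof. This follows immediately from Theorem 9.2, since $\lim_{N \to \infty} (\log \log N) = \infty$.
□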
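To get a feeling for how slowly this series diverges, one can compare the partial sums with the lower bound of Theorem 9.2. The sketch below uses a basic sieve of Eratosthenes. The function names and the sample values of $N$ are illustrative choices.

```python
from math import isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (for n >= 1)."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            # cross out i*i, i*i + i, ... in one slice assignment
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def prime_reciprocal_sum(N):
    """Sum of 1/p over primes p <= N, in floating point."""
    return sum(1.0 / p for p in primes_up_to(N))

for N in (2, 10, 1000, 10 ** 5):
    print(N, prime_reciprocal_sum(N), log(log(N)) - log(2))
```

Even at $N = 10^5$ the sum is still below 3, which illustrates the $\log \log N$ growth rate.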
We may interpret the divergence of $\sum p_n^{-1}$ to mean that the primes are not all that sparsely distributed. For instance, if the primes were as sparse as the sequence of perfect squares then the series would converge by comparison with $\sum n^{-2}$. On the other hand, comparing the orders of growth of the partial sums in Lemma 9.1 and Theorem 9.2 indicates that the primes are, at least in some sense, significantly sparser than the integers themselves.
In order to obtain more precise information about the growth of $p_n$, it is useful to define $\pi(n)$ to be the number of primes $p$ with $p \le n$. We aim to derive some elementary bounds for $\pi(n)$ due to Chebyshev and then use our results to obtain bounds on $p_n$. We begin with two simple combinatorial lemmas.
Lemma 9.4. One has $2^n \le \binom{2n}{n} < 4^n$ for all positive integers $n$.
Proof. By the binomial theorem, we have
\[ 4^n = (1 + 1)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} > \binom{2n}{n}. \]
The other inequality may be established by a simple induction argument and is left as an exercise.
□
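Before attempting the induction, it can be reassuring to test the lemma on a range of values. The following sketch uses exact integer arithmetic via `math.comb` (available in Python 3.8 and later). The helper name and the range of $n$ are illustrative.

```python
from math import comb

def central_binomial_bounds_hold(n):
    """Check 2^n <= C(2n, n) < 4^n for a single n, using exact integers."""
    return 2 ** n <= comb(2 * n, n) < 4 ** n

print(all(central_binomial_bounds_hold(n) for n in range(1, 201)))
```

Equality in the lower bound occurs only at $n = 1$, where both sides equal 2.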
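Lemma 9.5. One has
\[ \log n! = \sum_{p \le n} \sum_{k=1}^{\lfloor \log_p n \rfloor} \left\lfloor \frac{n}{p^k} \right\rfloor \log p. \]
Proof. Since $n! = 1 \cdot 2 \cdot 3 \cdots (n-1) \cdot n$, it is clear that there are no primes $p > n$ dividing $n!$, so we may write $n! = \prod_{p \le n} p^{\alpha_p}$, where $\alpha_p$ is a non-negative integer representing the exact power of $p$ that divides $n!$. Taking logarithms gives
\[ \log n! = \sum_{p \le n} \alpha_p \log p, \]
so it remains to find a formula for $\alpha_p$. Among the integers $1, 2, 3, \ldots, n-1, n$, there are $\lfloor n/p \rfloor$ multiples of $p$. Of these, $\lfloor n/p^2 \rfloor$ are also multiples of $p^2$, and in general $\lfloor n/p^k \rfloor$ of them are multiples of $p^k$. Since $p^k > n$ when $k > \log_p n$, we see that
\[ \alpha_p = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \cdots = \sum_{k=1}^{\lfloor \log_p n \rfloor} \left\lfloor \frac{n}{p^k} \right\rfloor, \]
and this completes the proof.
□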
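The formula for $\alpha_p$ translates directly into a short loop. The sketch below computes the exponent of each prime in $n!$ and confirms that the resulting product reproduces $n!$ for $n = 30$. The function names and the choice $n = 30$ are illustrative.

```python
from math import factorial

def legendre_exponent(n, p):
    """Exponent of the prime p in n!: the sum of floor(n / p^k) over k >= 1."""
    total, power = 0, p
    while power <= n:  # terms with p^k > n contribute nothing
        total += n // power
        power *= p
    return total

def primes_up_to(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

n = 30
product = 1
for p in primes_up_to(n):
    product *= p ** legendre_exponent(n, p)
print(product == factorial(n))
```

For example, `legendre_exponent(10, 2)` adds $5 + 2 + 1$, matching the factorization $10! = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$.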
Theorem 9.6. For every integer $n \ge 2$ one has
\[ \frac{n}{6 \log n} < \pi(n) < \frac{6n}{\log n}. \]
Proof. By taking logarithms in the result of Lemma 9.4, we obtain
\[ n \log 2 \le \log (2n)! - 2 \log n! < n \log 4, \]
since $\binom{2n}{n} = (2n)!/(n!)^2$. Lemma 9.5 therefore gives
\[ n \log 2 \le \sum_{p \le 2n} \sum_{k=1}^{\lfloor \log_p 2n \rfloor} \left( \left\lfloor \frac{2n}{p^k} \right\rfloor - 2 \left\lfloor \frac{n}{p^k} \right\rfloor \right) \log p < n \log 4. \qquad (9.2) \]
Now since $\lfloor 2x \rfloor - 2\lfloor x \rfloor$ is 0 if $0 \le x - \lfloor x \rfloor < 1/2$ and 1 otherwise, we find that
\[ \frac{n}{2} < n \log 2 \le \sum_{p \le 2n} (\log_p 2n)(\log p) = \pi(2n) \log 2n, \]
and hence $\pi(2n) > 2n/(4 \log 2n)$, which establishes the lower bound for even integers. Since $2n \ge \frac{2}{3}(2n + 1)$ for $n \ge 1$, we also have
\[ \pi(2n + 1) \ge \pi(2n) > \frac{2n}{4 \log 2n} > \frac{2n + 1}{6 \log(2n + 1)}, \]
which proves the lower bound for odd integers. For the upper bound, we delete all but the $k = 1$ term from (9.2) to obtain
\[ \sum_{p \le 2n} \left( \left\lfloor \frac{2n}{p} \right\rfloor - 2 \left\lfloor \frac{n}{p} \right\rfloor \right) \log p < n \log 4. \]
Let $\theta(n) = \sum_{p \le n} \log p$. Since $\lfloor 2n/p \rfloor - 2\lfloor n/p \rfloor = 1$ when $n < p \le 2n$, we deduce that
\[ \theta(2n) - \theta(n) = \sum_{n < p \le 2n} \log p < n \log 4. \]
Now if $n$ is a particular integer with $n \ge 2$, there is a positive integer $r$ such that $2^r \le n < 2^{r+1}$. We then have
\[ \theta(n) \le \theta(2^{r+1}) = \sum_{j=0}^{r} \left( \theta(2^{j+1}) - \theta(2^j) \right) < \sum_{j=0}^{r} 2^j \log 4 = (2^{r+1} - 1) \log 4 < 4n \log 2, \]
since the first summation telescopes and $\theta(1) = 0$. On the other hand, we have
\[ \theta(n) \ge \sum_{n^{2/3} < p \le n} \log p \ge \left( \pi(n) - \pi(n^{2/3}) \right) \log(n^{2/3}) \ge \tfrac{2}{3} \left( \pi(n) - n^{2/3} \right) \log n. \]
Combining the previous two inequalities yields $(\pi(n) - n^{2/3}) \log n < 6n \log 2$, and hence
\[ \pi(n) < \frac{6n \log 2}{\log n} + n^{2/3} = \frac{n}{\log n} \left( 6 \log 2 + \frac{\log n}{n^{1/3}} \right). \]
It is a simple calculus exercise to show that the function $(\log x)/x^{1/3}$ takes its maximum value at $x = e^3$ and hence that $(\log n)/n^{1/3} \le 3/e$ for all $n \ge 1$. We therefore have
\[ \pi(n) < \frac{n}{\log n} \left( 6 \log 2 + 3/e \right) < \frac{6n}{\log n}, \]
as required.
□
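The constants 1/6 and 6 in Theorem 9.6 are quite generous, as a direct count shows. The following sketch tabulates $\pi(n)$ with a sieve and tests both inequalities for every $n$ up to $10^5$. The limit and the function name are arbitrary choices for illustration.

```python
from math import isqrt, log

def prime_counts(limit):
    """Return a list pi with pi[n] = number of primes <= n, for 0 <= n <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    pi, count = [], 0
    for flag in is_prime:  # running total of the prime indicator
        count += flag
        pi.append(count)
    return pi

limit = 10 ** 5
pi = prime_counts(limit)
bounds_hold = all(n / (6 * log(n)) < pi[n] < 6 * n / log(n)
                  for n in range(2, limit + 1))
print(bounds_hold, pi[limit], limit / log(limit))
```

In this range the ratio $\pi(n) \log n / n$ stays close to 1, far inside the interval $(1/6, 6)$ allowed by the theorem.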
We now deduce upper and lower bounds on the size of the $n$th prime.
Theorem 9.7. For every integer $n \ge 2$, one has
\[ \tfrac{1}{6} n \log n < p_n < 18 n \log n. \]
Proof. Suppose that $p_n = m$. By Theorem 9.6, we have
\[ n = \pi(m) < \frac{6m}{\log m} = \frac{6 p_n}{\log p_n}, \]
and thus $p_n > \frac{1}{6} n \log p_n > \frac{1}{6} n \log n$, which gives the lower bound. Similarly, Theorem 9.6 gives
\[ n = \pi(m) > \frac{m}{6 \log m} = \frac{p_n}{6 \log p_n}, \]
and thus $p_n < 6n \log p_n$. We recall from the proof of Theorem 9.6 that $\log x \le (3/e) x^{1/3}$, which gives $p_n < (18/e) \, n \, p_n^{1/3}$ and thus $p_n^{2/3} < 18n/e$. Taking logarithms gives
\[ \tfrac{2}{3} \log p_n < \log n + \log(18/e) < 2 \log n, \]
provided that $n > 6$. We therefore obtain $p_n < 18 n \log n$ when $n > 6$, and it is easy to check that this holds for $2 \le n \le 6$ as well.
□
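The bounds on $p_n$ can be tested in the same spirit. The sketch below lists the primes up to $10^5$ and checks both inequalities for every index $n \ge 2$ in that range. It also prints the ratio $p_n / (n \log n)$ at a few sample indices. The function name and the limit are illustrative choices.

```python
from math import isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n (for n >= 1)."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

primes = primes_up_to(10 ** 5)  # primes[n - 1] is the n-th prime
bounds_hold = all(n * log(n) / 6 < p < 18 * n * log(n)
                  for n, p in enumerate(primes, start=1) if n >= 2)
ratios = [primes[n - 1] / (n * log(n)) for n in (10, 100, 1000, len(primes))]
print(bounds_hold, ratios)
```

The ratios lie between 1 and 1.3 at these indices, so the factor 18 in the upper bound has a great deal of slack.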
Even more precise information is known about $\pi(n)$ and $p_n$ asymptotically as $n \to \infty$. Before mentioning some of these results, we discuss some of the common asymptotic notation. We say that $f(x) \sim g(x)$ as $x \to \infty$ if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]
Furthermore, we write $f(x) = o(g(x))$ if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0. \]
Finally, we write $f(x) = O(g(x))$ if there is a constant $C$ such that $|f(x)| \le C |g(x)|$ for all sufficiently large $x$. Notice that $f = o(g)$ implies that $f = O(g)$.
Theorem 9.8. (The Prime Number Theorem) As $n \to \infty$ one has
\[ \pi(n) \sim \frac{n}{\log n} \quad \text{and} \quad p_n \sim n \log n. \]
The proof of the prime number theorem is beyond the scope of the course, as the most direct method requires the theory of complex variables. If $\pi(n; q, a)$ denotes the number of primes $p \le n$ with $p \equiv a \pmod{q}$, then it is also known that
\[ \pi(n; q, a) \sim \frac{1}{\phi(q)} \pi(n) \sim \frac{n}{\phi(q) \log n} \]
whenever $(a, q) = 1$. This is called the prime number theorem for arithmetic progressions. In particular, it shows that there are infinitely many primes in each reduced residue class modulo $q$ and that the primes are equally distributed among the residue classes asymptotically. For example, roughly half of the odd primes are congruent to 1 mod 4 and roughly half are congruent to 3 mod 4.
The prime number theorem may be interpreted by saying that the probability that the integer $n$ is prime is roughly $1/\log n$. In fact, this interpretation leads to an approximation for $\pi(n)$ that is more accurate than $n/\log n$. It is known that $\pi(n) \sim \mathrm{li}(n)$, where
\[ \mathrm{li}(x) = \int_2^x \frac{dt}{\log t}. \]
We may think of $\mathrm{li}(x)$ as a sort of cumulative distribution function for the density function $f(t) = 1/\log t$. It is known that $|\pi(x) - \mathrm{li}(x)| = o(x)$ as $x \to \infty$, and in fact one can make the error term more explicit. A classical quantitative version of the prime number theorem states that
\[ |\pi(x) - \mathrm{li}(x)| = O\left( x \exp(-c \sqrt{\log x}) \right) \]
for some constant $c > 0$, and the sharpest known results improve on this only modestly. However, it is easy to show that this error term grows more rapidly than $x^{1-\delta}$ for every $\delta > 0$, so this is actually a fairly weak result in some sense. It is conjectured that the true error term is just slightly larger than $\sqrt{x}$.
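The claim that $\mathrm{li}(x)$ approximates $\pi(x)$ better than $x/\log x$ can be illustrated numerically. The sketch below evaluates the integral from 2 to $x$ by Simpson's rule after the substitution $t = e^u$, which makes the integrand $e^u/u$ smooth on a short interval. The step count, the function names, and the test point $x = 10^5$ are illustrative choices.

```python
from math import exp, isqrt, log

def prime_count(x):
    """Number of primes p <= x, by a sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(x) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(is_prime)

def li_from_2(x, steps=2000):
    """Simpson approximation to the integral of dt/log t over [2, x], via t = e^u."""
    a, b = log(2), log(x)
    h = (b - a) / steps  # steps must be even for Simpson's rule

    def integrand(u):
        return exp(u) / u

    total = integrand(a) + integrand(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(a + i * h)
    return total * h / 3

x = 10 ** 5
print(prime_count(x), x / log(x), li_from_2(x))
```

At this value of $x$ the logarithmic integral is off by a few dozen, while $x/\log x$ is off by roughly nine hundred.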
Conjecture 9.9. (The Riemann Hypothesis) One has
\[ |\pi(x) - \mathrm{li}(x)| = O(\sqrt{x} \log x). \]
This is one of the most notorious unsolved problems in mathematics, and even establishing an error term of $O(x^{1-\delta})$ for some positive $\delta$ would be considered a major breakthrough. The usual statement of the Riemann hypothesis concerns the zeta function mentioned at the end of §8. This is a function of a complex variable, which is defined by the infinite series
\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} \]
when $\mathrm{Re}(s) > 1$. The above series fails to converge when $\mathrm{Re}(s) \le 1$, but it turns out that the zeta function has a unique extension (called an analytic continuation) to the whole complex plane, apart from a simple pole at $s = 1$. This extension of $\zeta(s)$ has so-called "trivial" zeros at the negative even integers, and the Riemann hypothesis is equivalent to the assertion that all the remaining zeros of $\zeta(s)$ lie on the line $\mathrm{Re}(s) = 1/2$.
Twin Primes and Mersenne Primes. It is conjectured that $\pi_2(n)$, the number of twin prime pairs $(p, p+2)$ with $p + 2 \le n$, is asymptotic to $Cn/(\log n)^2$ for some constant $C > 0$, but we don't even know that $\pi_2(n) \to \infty$. This latter statement is known as the Twin Prime Conjecture. In some sense, the twin primes are very sparse, as it can be shown that the sum of their reciprocals,
\[ \left( \frac{1}{3} + \frac{1}{5} \right) + \left( \frac{1}{5} + \frac{1}{7} \right) + \left( \frac{1}{11} + \frac{1}{13} \right) + \left( \frac{1}{17} + \frac{1}{19} \right) + \cdots, \]
converges, in contrast to the conclusion of Corollary 9.3. The value of the above sum, known as Brun's constant, is quite difficult to estimate precisely because of the slow convergence; however, its value appears to be around 1.902160583. In 1994, Nicely discovered inconsistencies in his computations of Brun's constant, which turned out to result from a subtle flaw in Intel's new Pentium processor. This led to an embarrassing recall and provided one of the more surprising applications of number theory. It is not known whether Brun's constant is rational; of course, its irrationality would imply the Twin Prime Conjecture since a finite sum of rational numbers is rational.
Recall that the Mersenne numbers are integers of the form $2^p - 1$ where $p$ is prime. It is conjectured that the number of Mersenne primes up to $n$ is asymptotic to $e^{\gamma} \log_2(\log n)$, where $\gamma$ is Euler's constant. However, only 46 Mersenne primes have been discovered as of November 2008, and proving that there are infinitely many seems completely out of reach. The computational evidence certainly suggests that the Mersenne primes are sparsely distributed among the Mersenne numbers; that is, for most primes $p$ the number $2^p - 1$ turns out to be composite. However, it also remains an open problem to establish that there are infinitely many composite Mersenne numbers. It seems inconceivable that this would fail, since then all sufficiently large Mersenne numbers would be prime! Nevertheless, the existing technology does not seem to be capable of generating a proof.
References
[1] G. E. Andrews, Number Theory, Dover, 1994.
[2] T. M. Apostol, Introduction to analytic number theory, Undergraduate Texts in Mathematics, Springer-Verlag, 1976.
[3] T. H. Barr, Invitation to cryptology, Prentice Hall, 2002.
[4] D. M. Bressoud, Factorization and primality testing, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1989.
[5] E. B. Burger, Exploring the number theory jungle: A journey into diophantine analysis,
AMS Student Mathematical Library, Volume 8, 2000.
[6] M. Erickson and A. Vazzana, Introduction to number theory, Discrete Mathematics and
its Applications, Chapman & Hall/CRC, Boca Raton, 2008.
[7] J. A. Gallian, Contemporary abstract algebra, 6th ed, Houghton Mifflin, 2006.
[8] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 6th ed,
Oxford University Press, 2008.
[9] J. F. Humphreys and M. Y. Prest, Numbers, groups, and codes, Cambridge University
Press, 1989.
[10] K. Ireland and M. Rosen, A classical introduction to modern number theory, 2nd ed,
Graduate Texts in Mathematics, 84, Springer-Verlag, 1990.
[11] N. Koblitz, A course in number theory and cryptography, 2nd ed, Graduate Texts in
Mathematics, 114, Springer-Verlag, 1994.
[12] H. L. Montgomery and R. C. Vaughan, Multiplicative number theory I. Classical
theory, Cambridge University Press, 2007.
[13] M. B. Nathanson, Additive number theory I: The classical bases, Graduate Texts in
Mathematics, 164, Springer-Verlag, 1996.
[14] I. Niven, H. S. Zuckerman, and H. L. Montgomery, An introduction to the theory of
numbers, 5th ed, Wiley, 1991.
[15] K. H. Rosen, Elementary number theory and its applications, 5th ed, Pearson Addison
Wesley, 2005.
[16] J. H. Silverman, A friendly introduction to number theory, 3rd ed, Pearson Prentice
Hall, 2006.
[17] G. Tenenbaum and M. Mendès France, The prime numbers and their distribution,
AMS Student Mathematical Library, Volume 6, 2000.
[18] R. C. Vaughan, The Hardy-Littlewood method, 2nd ed, Cambridge University Press,
1997.