Elementary Number Theory - Chen PDF
W W L CHEN
© W W L Chen, 1981, 2003.
This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990.
It is available free to all individuals, on the understanding that it is not to be used for financial gains,
and may be downloaded and/or photocopied, with or without permission from the author.
However, this document may not be kept on any information storage and retrieval system without permission
from the author, unless such system is not accessible to any individuals other than its owners.
Chapter 1
DIVISION AND FACTORIZATION
1.1. Division
Suppose that a, b ∈ Z and a ≠ 0. Then we say that a divides b, denoted by a | b, if there exists c ∈ Z
such that b = ac. In this case, we also say that a is a divisor of b, or that b is a multiple of a.
THEOREM 1A. Suppose that a ∈ N and b ∈ Z. Then there exist unique q, r ∈ Z such that b = aq + r
and 0 ≤ r < a.
Proof. We shall first of all show the existence of such numbers q, r ∈ Z. Consider the set
S = {b − as ≥ 0 : s ∈ Z}.
Then it is easy to see that S is a non-empty subset of N ∪ {0}. It follows from the Principle of induction
that S has a smallest element. Let r be the smallest element of S, and let q ∈ Z such that b − aq = r.
Clearly r ≥ 0, so it remains to show that r < a. Suppose on the contrary that r ≥ a. Then b − a(q + 1) =
(b − aq) − a = r − a ≥ 0, so that b − a(q + 1) ∈ S. Clearly b − a(q + 1) < r, contradicting that r is the
smallest element of S.
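Next, we shall show that q and r are unique. Suppose that q_1, q_2, r_1, r_2 ∈ Z satisfy 0 ≤ r_1 < a, 0 ≤ r_2 < a and
b = aq_1 + r_1 = aq_2 + r_2.
Then
|r_1 − r_2| = a|q_2 − q_1|.
If q_1 ≠ q_2, then a|q_2 − q_1| ≥ a, while |r_1 − r_2| < a, a contradiction. It follows that
q_1 = q_2, and so r_1 = r_2 also.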
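The quotient and remainder of Theorem 1A can be computed directly. The following is a minimal Python sketch; the function name `division` is our own, and it relies on Python's floor division, which rounds towards minus infinity, so the remainder is never negative when a is a natural number.

```python
def division(b: int, a: int) -> tuple[int, int]:
    """Return (q, r) with b = a*q + r and 0 <= r < a, for a natural number a."""
    if a <= 0:
        raise ValueError("a must be a natural number")
    q = b // a       # floor division, also correct for negative b
    r = b - a * q    # the smallest non-negative element of {b - a*s : s in Z}
    return q, r


if __name__ == "__main__":
    for b in (17, -17, 0, 5111):
        q, r = division(b, 5)
        print(f"{b} = 5 * {q} + {r}")
```

Note that negative b is handled as in the theorem: for b = −17 and a = 5 we get q = −4 and r = 3.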
THEOREM 1B. Suppose that a, b ∈ N. Then there exists a unique d ∈ N such that
(i) there exist x, y ∈ Z such that d = ax + by;
(ii) d | a and d | b; and
(iii) for every k ∈ N such that k | a and k | b, we have k | d.
Proof. Consider the set
I = {au + bv : u, v ∈ Z}.
Then it is easy to see that I is a non-empty subset of Z which contains some positive integers. It follows
from the Principle of induction that I has a least positive element. Let d be the least positive element
of I, and let x, y ∈ Z such that d = ax + by. The conclusion (i) follows trivially. Also, d is uniquely
defined.
Next, we shall show that d divides every integer in I. Suppose that z = au + bv is any given integer
in I. By Theorem 1A, there exist q, r ∈ Z such that z = dq + r, where 0 ≤ r < d. Then
r = z − dq = a(u − xq) + b(v − yq) ∈ I.
If r ≠ 0, then 0 < r < d, contradicting the minimality of d. Hence r = 0, so that z = dq,
whence d divides z.
Taking u = 1 and v = 0 gives d | a. Taking u = 0 and v = 1 gives d | b. Also, the conclusion (iii) is
a simple consequence of (i).
The number d in Theorem 1B is called the greatest common divisor of a and b, and denoted by
d = (a, b). Two numbers a, b ∈ N are said to be relatively prime, or coprime, if (a, b) = 1.
A practical way of finding the greatest common divisor of two natural numbers is given by the
following result.
THEOREM 1C. (EUCLID’S ALGORITHM) Suppose that a, b ∈ N, and a < b. Suppose further
that q_1, …, q_{n+1} ∈ Z and r_1, …, r_n ∈ N satisfy 0 < r_n < r_{n−1} < … < r_1 < a and
\[
\begin{aligned}
b &= a q_1 + r_1, \\
a &= r_1 q_2 + r_2, \\
r_1 &= r_2 q_3 + r_3, \\
&\;\;\vdots \\
r_{n-2} &= r_{n-1} q_n + r_n, \\
r_{n-1} &= r_n q_{n+1}.
\end{aligned}
\]
Then (a, b) = r_n.
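Proof. Since (a, b) divides both a and b, it divides r_1 = b − aq_1, so that
(a, b) | (a, r_1).
Conversely, (a, r_1) divides both a and b = aq_1 + r_1, so that (a, r_1) | (a, b). Hence (a, b) = (a, r_1). Applying the same argument to each subsequent equation gives (a, b) = (a, r_1) = (r_1, r_2) = … = (r_{n−1}, r_n), and (r_{n−1}, r_n) = r_n since r_n | r_{n−1}.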
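Example. Consider (589, 5111). In our notation, we let a = 589 and b = 5111. Then we have
5111 = 589 · 8 + 399,
589 = 399 · 1 + 190,
399 = 190 · 2 + 19,
190 = 19 · 10.
It follows that (589, 5111) = 19. On the other hand,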
19 = 399 − 190 · 2
= 399 − (589 − 399 · 1) · 2
= 589 · (−2) + 399 · 3
= 589 · (−2) + (5111 − 589 · 8) · 3
= 5111 · 3 + 589 · (−26).
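The back-substitution in this example can be automated. The sketch below is one common way to do so (the iterative extended Euclidean algorithm), not necessarily the procedure intended by the author; the name `extended_gcd` is our own. It carries the coefficients x and y along with each remainder, so no separate back-substitution pass is needed.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and d = a*x + b*y, for natural numbers a, b."""
    # Invariants: r0 = a*x0 + b*y0 and r1 = a*x1 + b*y1.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1                      # quotient in the current division step
        r0, r1 = r1, r0 - q * r1          # next remainder
        x0, x1 = x1, x0 - q * x1          # update coefficients of a
        y0, y1 = y1, y0 - q * y1          # update coefficients of b
    return r0, x0, y0


if __name__ == "__main__":
    d, x, y = extended_gcd(589, 5111)
    print(d, x, y)   # d = 19, with 589*x + 5111*y = 19
```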
THEOREM 1D. Suppose that a, b ∈ N and (a, b) = 1. Suppose further that w ∈ N satisfies w | ab.
Then there exist unique u, v ∈ N such that u | a, v | b and w = uv.
Proof. We shall first of all show that u = (w, a) and v = (w, b) satisfy the requirements. Consider
the number (w, a)(w, b). By Theorem 1B, there exist x_1, y_1, x_2, y_2 ∈ Z such that (w, a) = wx_1 + ay_1 and
(w, b) = wx_2 + by_2, so that
(w, a)(w, b) = (wx_1 + ay_1)(wx_2 + by_2) = w(wx_1x_2 + ay_1x_2 + bx_1y_2) + aby_1y_2.
Since w | ab, it follows that
(4) w | (w, a)(w, b).
On the other hand, since (a, b) = 1, it follows from Theorem 1B that there exist x, y ∈ Z such that
ax + by = 1, so that w = wax + wby. Note now that (w, a) | a and (w, b) | w, so that (w, a)(w, b) | wax.
Note also that (w, a) | w and (w, b) | b, so that (w, a)(w, b) | wby. It follows that
(5) (w, a)(w, b) | w.
Combining (4) and (5), and noting that w, (w, a), (w, b) ∈ N, we conclude that w = (w, a)(w, b).
1.2. Factorization
Suppose that a ∈ N and a > 1. Then we say that a is prime if it has exactly two positive divisors,
namely 1 and a. We also say that a is composite if it is not prime. It is convenient to treat the integer
1 as neither prime nor composite. To find a good reason for not including 1 as a prime, see the Remark
following Theorem 1G.
Throughout these notes, the symbol p, with or without suffixes, denotes a (positive) prime.
THEOREM 1E. Suppose that a, b ∈ Z, and that p ∈ N is a prime. If p | ab, then p | a or p | b.
Proof. Suppose that p ∤ a. Since p is prime, the only positive divisors of p are 1 and p. Hence we
must have (a, p) = 1. It follows from Theorem 1B that there exist x, y ∈ Z such that 1 = ax + py, so
that b = abx + pby. Clearly p | b.
Remark. If the integer 1 were included as a prime, then we would have to rephrase the statement of
the Fundamental theorem of arithmetic to allow for different representations like 6 = 2 · 3 = 1 · 2 · 3.
Proof of Theorem 1G. We shall first of all show by induction that every integer n ≥ 2 is representable
as a product of primes. Clearly 2 is a product of primes. Assume now that n > 2 and that
every m ∈ N satisfying 2 ≤ m < n is representable as a product of primes. If n is a prime, then it is
obviously representable as a product of primes. If n is not a prime, then there exist n_1, n_2 ∈ N satisfying
2 ≤ n_1 < n and 2 ≤ n_2 < n such that n = n_1 n_2. By our induction hypothesis, both n_1 and n_2 are
representable as products of primes, so that n must also be representable as a product of primes.
Next, we shall show uniqueness. Suppose that
(6) n = p_1 … p_r = p′_1 … p′_s,
where p_1 ≤ … ≤ p_r and p′_1 ≤ … ≤ p′_s are primes. Now p_1 | p′_1 … p′_s, so it follows from Theorem 1F
that p_1 | p′_j for some j = 1, …, s. Since p_1 and p′_j are both primes, we must then have p_1 = p′_j. On the
other hand, p′_1 | p_1 … p_r, so again it follows from Theorem 1F that p′_1 | p_i for some i = 1, …, r, so again
we must have p′_1 = p_i. It now follows that p_1 = p′_j ≥ p′_1 = p_i ≥ p_1, so that p_1 = p′_1. Removing the factor
p_1 = p′_1 from (6), we obtain
p_2 … p_r = p′_2 … p′_s.
Repeating this argument a finite number of times, we conclude that r = s and p_i = p′_i for every
i = 1, …, r.
THEOREM 1H. Every natural number n > 1 is representable uniquely in the form
(7) n = p_1^{m_1} … p_r^{m_r},
where p_1 < … < p_r are primes, and where m_j ∈ N for every j = 1, …, r.
The representation (7) is called the canonical decomposition of the natural number n.
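For small n, the canonical decomposition can be found by trial division. The following Python sketch is illustrative only; the function name `canonical_decomposition` is our own, and trial division is chosen for simplicity rather than speed.

```python
def canonical_decomposition(n: int) -> list[tuple[int, int]]:
    """Return [(p_1, m_1), ..., (p_r, m_r)] with p_1 < ... < p_r and n = prod p_j**m_j."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            m = 0
            while n % p == 0:   # strip every factor p and count the exponent
                n //= p
                m += 1
            factors.append((p, m))
        p += 1
    if n > 1:                   # whatever remains is a prime larger than sqrt(n)
        factors.append((n, 1))
    return factors


if __name__ == "__main__":
    print(canonical_decomposition(360))    # [(2, 3), (3, 2), (5, 1)]
    print(canonical_decomposition(5111))   # [(19, 1), (269, 1)]
```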
There are many consequences of the Fundamental theorem of arithmetic. The following is one which
concerns primes.
THEOREM 1J. (EUCLID) There are infinitely many primes.
Proof. Suppose on the contrary that p_1 < … < p_r are all the primes. Let
n = p_1 … p_r + 1.
Then n ∈ N and n > 1. It follows from the Fundamental theorem of arithmetic that p_j | n for some
j = 1, …, r, so that p_j | (n − p_1 … p_r) = 1, a contradiction.
Let p be a prime. For any given n ∈ N, it is an interesting problem to find the largest integer k
such that p^k | n!. In order to describe the answer to this question, we need to define one of the most
useful functions in number theory.
Suppose that α ∈ R. The number [α] ∈ Z is defined to be the unique integer m ∈ Z satisfying
m ≤ α < m + 1. We call [α] the integer part of α.
The integer part function has many interesting properties. The proof of the following results is left
as an exercise.
(ii) If α ≥ 0, then [α] counts the number of natural numbers not exceeding α. In other words,
\[ [\alpha] = \sum_{1 \le n \le \alpha} 1. \]
(vi) The number −[−α] is the smallest integer not less than α.
(viii) The number [α+1/2] is one of the two nearest integers to α. Furthermore, if these two integers
both differ from α by the same value, then [α + 1/2] is the larger of these two integers.
(ix) If α > 0 and n ∈ N, then [α/n] is the number of positive integers not exceeding α and which
are multiples of n.
THEOREM 1K. Suppose that n ∈ N and p is a prime. Then the largest integer k such that p^k | n!
is given by
\[ k = \sum_{j=1}^{\infty} \left[ \frac{n}{p^j} \right]. \]
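Proof. Each m = 1, …, n contributes to k the number of j ∈ N such that p^j | m. Interchanging the order of summation, and noting that the number of multiples of p^j not exceeding n is [n/p^j], we obtain
\[ k = \sum_{m=1}^{n} \sum_{\substack{j=1 \\ p^j \mid m}}^{\infty} 1 = \sum_{j=1}^{\infty} \sum_{\substack{m=1 \\ p^j \mid m}}^{n} 1 = \sum_{j=1}^{\infty} \left[ \frac{n}{p^j} \right]. \]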
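The sum in Theorem 1K has only finitely many non-zero terms, since [n/p^j] = 0 as soon as p^j > n. A minimal Python sketch follows; the function names are our own, and the brute-force routine is included only to cross-check the formula on small inputs.

```python
from math import factorial


def exponent_in_factorial(n: int, p: int) -> int:
    """Largest k with p**k dividing n!, computed as the sum of [n / p**j] over j >= 1."""
    k = 0
    power = p
    while power <= n:        # terms with p**j > n contribute nothing
        k += n // power
        power *= p
    return k


def exponent_bruteforce(n: int, p: int) -> int:
    """Same quantity, found by dividing n! by p repeatedly (for checking only)."""
    f, k = factorial(n), 0
    while f % p == 0:
        f //= p
        k += 1
    return k


if __name__ == "__main__":
    print(exponent_in_factorial(10, 2))    # 5 + 2 + 1 = 8
    print(exponent_in_factorial(100, 5))   # 20 + 4 = 24
```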
Given that there are infinitely many primes, a natural question that arises is to determine the number
π(X) of primes that do not exceed a given real number X. This was the subject of much investigation
in the 1800’s. For example, Legendre proposed in 1808 that there is a constant A such that for large
values of X, the number π(X) can be approximated by
(8) \[ \frac{X}{\log X - A}. \]
Gauss proposed the function
\[ \frac{1}{\log x} \]
as an approximation to the average density of distribution of primes near any large real number x, and
thus formulated the function
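(9) \[ \int_2^X \frac{dx}{\log x} \]
as an approximation to π(X). A simpler function of the same kind is
(10) \[ \frac{X}{\log X}, \]
and the corresponding conjecture is that
(11) \[ \lim_{X \to \infty} \frac{\pi(X) \log X}{X} = 1. \]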
Indeed, Tchebycheff showed in 1848 that if the limit in (11) exists at all, then it must be equal to 1.
Unfortunately, he and others were unable to show that the limit exists. Then in 1850, he showed that
there exist positive constants c1 and c2 such that for every real number X ≥ 2, we have
\[ c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}. \]
This confirms that the function (10) at least represents the correct order of magnitude of π(X). We
shall prove Tchebycheff’s theorem in Chapter 6.
The crucial idea that finally led to the proof of (11) was introduced by Riemann in a monumental
contribution in 1860. Riemann observed that the series
(12) \[ \sum_{n=1}^{\infty} \frac{1}{n^s} \]
plays a crucial role in the study of the distribution of primes if one treats s as a complex variable. It
follows that the distribution of primes can be studied by the use of methods in the theory of analytic
functions. Riemann denoted the series (12) by ζ(s), and the function has since been known as the
Riemann zeta function. Indeed, Riemann’s work has also influenced greatly the development of the
general theory of functions.
Riemann’s ideas were studied in great depth in the late 1800’s by von Mangoldt and Hadamard.
This culminated in the proof of (11) in 1896 by Hadamard and de la Vallée Poussin, independently and
almost simultaneously. In particular, the work of de la Vallée Poussin showed that the integral (9) is a
better approximation to π(X) than the function (8) for any value of the constant A.
4. Prove that the three natural numbers n, n + 2, n + 4 cannot be simultaneously prime unless n = 3.
7. Prove that for every natural number n > 2, at least one of 2^n − 1 and 2^n + 1 is composite.
8. Suppose that a, b, c ∈ N.
(i) Prove that if 3 | (a^2 + b^2), then 3 | ab.
(ii) Prove that if 9 | (a^3 + b^3 + c^3), then 3 | abc.
9. Suppose that m, n ∈ N.
(i) Prove that n! | (m + 1)(m + 2) … (m + n).
(ii) Prove that 6 | (n^3 − n) and 120 | (n^5 − 5n^3 + 4n).
(iii) Prove that \[ \frac{(3m)!\,(4n)!}{(m!)^3 (n!)^4} \in \mathbb{N}. \]
14. A rational number a/b with (a, b) = 1 is called a reduced fraction. If a sum of two reduced fractions
is an integer, say (a/b) + (c/d) ∈ Z, prove that |b| = |d|.
15. Suppose that a, b, c, d ∈ N. Prove each of the following without using prime factorizations:
(i) If a | bc and (a, b) = 1, then a | c.
(ii) (a, b) = d if and only if (a/d, b/d) = 1.
(iii) (ac, bc) = c(a, b).
(iv) (a, bc) = (a, (a, b)c).
(v) (a^2, b^2) = (a, b)^2.
20. Find (589, 5111). Find also integers x and y such that (589, 5111) = 589x + 5111y. Hence give the
general solution of this equation in integers x and y.
21. Prove that there are infinitely many primes of the form 4n − 1.
22. Prove that \[ \left[ \frac{m+n}{2} \right] + \left[ \frac{n-m+1}{2} \right] = n \] for every m, n ∈ Z.
by considering the open rectangle with vertices at (0, 0), (a, 0), (0, b) and (a, b), split into two halves
by the line segment ay = bx where 0 < x < a. Show first that there are no lattice points (m, n) ∈ N^2
on the line segment between the endpoints (0, 0) and (a, b). Then count lattice points (m, n) ∈ N^2
in one of the two open triangular regions.
24. Suppose that n ∈ N, and that α ∈ R is non-negative. Prove Hermite’s identity, that
\[ \sum_{k=0}^{n-1} \left[ \alpha + \frac{k}{n} \right] = [n\alpha]. \]
27. With how many zeros does the decimal digit expansion of 2003! end?
28. Find the largest two-digit prime factor of the binomial coefficient \binom{200}{100}.
29. Suppose n ∈ N, and that p ≥ \sqrt{2n} is a prime such that p divides \binom{2n}{n}. Prove that p^2 does not
divide \binom{2n}{n}.
Chapter 2
ARITHMETIC FUNCTIONS
2.1. Introduction
THEOREM 2A. Suppose that the function f : N → C is multiplicative. Then the function g : N → C,
defined by
\[ g(n) = \sum_{m \mid n} f(m) \]
for every n ∈ N, is multiplicative.
Proof of Theorem 2A. Suppose that a, b ∈ N and (a, b) = 1. If u is a positive divisor of a and v
is a positive divisor of b, then clearly uv is a positive divisor of ab. On the other hand, by Theorem
1D, every positive divisor m of ab can be expressed uniquely in the form m = uv, where u is a positive
divisor of a and v is a positive divisor of b. Clearly (u, v) = 1. It follows that
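\[ g(ab) = \sum_{m \mid ab} f(m) = \sum_{u \mid a} \sum_{v \mid b} f(uv) = \sum_{u \mid a} \sum_{v \mid b} f(u) f(v) = \Big( \sum_{u \mid a} f(u) \Big) \Big( \sum_{v \mid b} f(v) \Big) = g(a) g(b). \]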
We define the divisor function d : N → C by writing
(1) \[ d(n) = \sum_{m \mid n} 1 \]
for every n ∈ N. Here the sum is taken over all positive divisors m of n. In other words, the value
d(n) denotes the number of positive divisors of the natural number n. On the other hand, we define the
function σ : N → C by writing
(2) \[ \sigma(n) = \sum_{m \mid n} m \]
for every n ∈ N. Clearly, the value σ(n) denotes the sum of all the positive divisors of the natural
number n.
THEOREM 2B. Suppose that n ∈ N and that n = p_1^{u_1} … p_r^{u_r} is the canonical decomposition of n.
Then
\[ d(n) = (1 + u_1) \cdots (1 + u_r) \quad\text{and}\quad \sigma(n) = \frac{p_1^{u_1+1} - 1}{p_1 - 1} \cdots \frac{p_r^{u_r+1} - 1}{p_r - 1}. \]
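Proof. Every positive divisor m of n is of the form m = p_1^{v_1} … p_r^{v_r}, where for every j = 1, …, r, the
integer v_j satisfies 0 ≤ v_j ≤ u_j. It follows from (1) that d(n) is the number of choices for the r-tuple
(v_1, …, v_r). Hence
\[ d(n) = \sum_{v_1=0}^{u_1} \cdots \sum_{v_r=0}^{u_r} 1 = (1 + u_1) \cdots (1 + u_r). \]
On the other hand, it follows from (2) that
\[ \sigma(n) = \sum_{v_1=0}^{u_1} \cdots \sum_{v_r=0}^{u_r} p_1^{v_1} \cdots p_r^{v_r} = \Big( \sum_{v_1=0}^{u_1} p_1^{v_1} \Big) \cdots \Big( \sum_{v_r=0}^{u_r} p_r^{v_r} \Big), \]
and for every j = 1, …, r, we have
\[ \sum_{v_j=0}^{u_j} p_j^{v_j} = 1 + p_j + p_j^2 + \ldots + p_j^{u_j} = \frac{p_j^{u_j+1} - 1}{p_j - 1}. \]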
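The two formulas of Theorem 2B translate directly into code. The sketch below is illustrative; the helper `factorize` and the names `d` and `sigma` are our own, and factorization is done by trial division purely for simplicity.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {p: u} with n equal to the product of p**u (empty dict for n = 1)."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d(n: int) -> int:
    """Number of positive divisors: product of (1 + u_j)."""
    result = 1
    for u in factors_exponents(n):
        result *= 1 + u
    return result


def factors_exponents(n: int) -> list[int]:
    """Exponents u_1, ..., u_r in the canonical decomposition of n."""
    return list(factorize(n).values())


def sigma(n: int) -> int:
    """Sum of positive divisors: product of (p**(u+1) - 1) / (p - 1)."""
    result = 1
    for p, u in factorize(n).items():
        result *= (p ** (u + 1) - 1) // (p - 1)   # exact: geometric series sum
    return result


if __name__ == "__main__":
    print(d(12), sigma(12))   # 6 28
```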
Natural numbers n ∈ N where σ(n) = 2n are of particular interest, and are known as perfect
numbers. A perfect number is therefore a natural number which is equal to the sum of its own proper
divisors; in other words, the sum of all its positive divisors other than itself.
It is not known whether any odd perfect number exists. However, we can classify the even perfect
numbers.
(2^{m−1}, 2^m − 1) = 1.
Suppose now that n ∈ N is an even perfect number. Then we can write n = 2^{m−1}u, where m ∈ N
and m > 1, and where u ∈ N is odd. By Theorem 2B, we have
2^m u = 2n = σ(n) = σ(2^{m−1})σ(u) = (2^m − 1)σ(u),
so that
(3) \[ \sigma(u) = \frac{2^m u}{2^m - 1} = u + \frac{u}{2^m - 1}. \]
Note that σ(u) and u are integers and σ(u) > u. Hence u/(2^m − 1) ∈ N and is a divisor of u. Since
m > 1, we have 2^m − 1 > 1, and so u/(2^m − 1) ≠ u. It now follows from (3) that σ(u) is equal to the
sum of two distinct positive divisors of u. But σ(u) is equal to the sum of all the positive divisors of u. Hence u
must have exactly two positive divisors, so that u is prime. Furthermore, we must have u/(2^m − 1) = 1,
so that u = 2^m − 1.
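The argument above shows that an even perfect number has the form 2^{m−1}(2^m − 1) with 2^m − 1 prime. The following Python sketch lists the numbers of this form for small m and checks σ(n) = 2n by brute force; the function names are our own, and the primality test is naive trial division.

```python
def is_prime(n: int) -> bool:
    """Naive primality test by trial division."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True


def sigma_bruteforce(n: int) -> int:
    """Sum of all positive divisors of n, by direct enumeration."""
    return sum(m for m in range(1, n + 1) if n % m == 0)


def even_perfect_candidates(max_m: int) -> list[int]:
    """Numbers 2**(m-1) * (2**m - 1) with 2 <= m <= max_m and 2**m - 1 prime."""
    return [2 ** (m - 1) * (2 ** m - 1)
            for m in range(2, max_m + 1)
            if is_prime(2 ** m - 1)]


if __name__ == "__main__":
    for n in even_perfect_candidates(7):
        print(n, sigma_bruteforce(n) == 2 * n)
```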
We are interested in the behaviour of d(n) and σ(n) as n → ∞. If n ∈ N is a prime, then clearly
d(n) = 2. Also, the magnitude of d(n) is sometimes greater than that of any power of log n. More
precisely, we have the following result.
THEOREM 2E. For any fixed real number c > 0, the inequality d(n) ≪ (log n)^c as n → ∞ does not
hold.
Proof. The idea of the proof is to consider integers which are divisible by many different primes.
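Suppose that c > 0 is given and fixed. Let ℓ ∈ N ∪ {0} satisfy ℓ ≤ c < ℓ + 1. For every j = 1, 2, 3, …,
let p_j denote the j-th positive prime in increasing order of magnitude, and consider the integer
n = (p_1 … p_{ℓ+1})^m.
Then log n = m log(p_1 … p_{ℓ+1}), and so by Theorem 2B,
(4) d(n) = (m + 1)^{ℓ+1} > m^{ℓ+1} = K(c)(log n)^{ℓ+1},
where the positive constant K(c) = (log(p_1 … p_{ℓ+1}))^{−ℓ−1}
depends only on c. The result follows on noting that ℓ + 1 > c and that the inequality (4) holds for every m ∈ N.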
On the other hand, the order of magnitude of d(n) cannot be too large either.
THEOREM 2F. For any fixed real number ε > 0, we have d(n) ≪_ε n^ε as n → ∞.
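Proof. For every natural number n > 1, let n = p_1^{u_1} … p_r^{u_r} be its canonical decomposition. It follows
from Theorem 2B that
\[ \frac{d(n)}{n^{\varepsilon}} = \frac{(1 + u_1)}{p_1^{\varepsilon u_1}} \cdots \frac{(1 + u_r)}{p_r^{\varepsilon u_r}}. \]
We may assume without loss of generality that ε < 1. If 2 ≤ p_j < 2^{1/ε}, then
\[ p_j^{\varepsilon u_j} \ge 2^{\varepsilon u_j} = e^{\varepsilon u_j \log 2} > 1 + \varepsilon u_j \log 2 > (1 + u_j)\, \varepsilon \log 2, \]
so that
\[ \frac{(1 + u_j)}{p_j^{\varepsilon u_j}} < \frac{1}{\varepsilon \log 2}. \]
On the other hand, if p_j ≥ 2^{1/ε}, then p_j^ε ≥ 2, and so
\[ \frac{(1 + u_j)}{p_j^{\varepsilon u_j}} \le \frac{1 + u_j}{2^{u_j}} \le 1. \]
It follows that
\[ \frac{d(n)}{n^{\varepsilon}} < \prod_{p < 2^{1/\varepsilon}} \frac{1}{\varepsilon \log 2}, \]
a positive constant depending only on ε.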
We see from Theorems 2E and 2F and the fact that d(n) = 2 infinitely often that the magnitude of
d(n) fluctuates a great deal as n → ∞. It may then be more fruitful to average the function d(n) over a
range of values of n, and consider, for positive real numbers X ∈ R, the value of the average
\[ \frac{1}{X} \sum_{n \le X} d(n). \]
\[ \sum_{n \le Y} \frac{1}{n} = \log Y + \gamma + O\!\left( \frac{1}{Y} \right). \]
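Proof. As Y → ∞, we have
\[
\begin{aligned}
\sum_{n \le Y} \frac{1}{n}
&= \sum_{n \le Y} \left( \frac{1}{Y} + \int_n^Y \frac{du}{u^2} \right)
 = \frac{[Y]}{Y} + \sum_{n \le Y} \int_n^Y \frac{du}{u^2}
 = \frac{[Y]}{Y} + \int_1^Y \Big( \sum_{n \le u} 1 \Big) \frac{du}{u^2} \\
&= \frac{[Y]}{Y} + \int_1^Y \frac{[u]}{u^2}\,du
 = \frac{[Y]}{Y} + \int_1^Y \frac{du}{u} - \int_1^Y \frac{u - [u]}{u^2}\,du \\
&= \log Y + 1 + O\!\left( \frac{1}{Y} \right) - \int_1^{\infty} \frac{u - [u]}{u^2}\,du + \int_Y^{\infty} \frac{u - [u]}{u^2}\,du \\
&= \log Y + \left( 1 - \int_1^{\infty} \frac{u - [u]}{u^2}\,du \right) + O\!\left( \frac{1}{Y} \right).
\end{aligned}
\]
The constant in brackets in the last line is equal to Euler's constant γ.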
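\[
\begin{aligned}
\sum_{n \le X} d(n)
&= 2 \sum_{x \le X^{1/2}} \left[ \frac{X}{x} \right] - [X^{1/2}]^2
 = 2 \sum_{x \le X^{1/2}} \frac{X}{x} + O(X^{1/2}) - (X^{1/2} + O(1))^2 \\
&= 2X \left( \log X^{1/2} + \gamma + O\!\left( \frac{1}{X^{1/2}} \right) \right) + O(X^{1/2}) - X \\
&= X \log X + (2\gamma - 1) X + O(X^{1/2}).
\end{aligned}
\]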
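The first step of the calculation above can be checked numerically. The sketch below, for integer X only, compares the direct count Σ_{m≤X} [X/m] of pairs (x, y) with xy ≤ X against the expression 2 Σ_{x≤X^{1/2}} [X/x] − [X^{1/2}]^2, and also evaluates the main term X log X + (2γ − 1)X. The function names are our own, and the numerical value of Euler's constant γ is hard-coded.

```python
from math import isqrt, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def divisor_sum_direct(X: int) -> int:
    """Sum of d(n) over n <= X, counted as pairs (m, k) with m*k <= X."""
    return sum(X // m for m in range(1, X + 1))


def divisor_sum_hyperbola(X: int) -> int:
    """The same sum, using only x <= sqrt(X)."""
    s = isqrt(X)  # integer part of X**(1/2)
    return 2 * sum(X // x for x in range(1, s + 1)) - s * s


def main_term(X: int) -> float:
    """X log X + (2*gamma - 1) X, the main term in the asymptotic formula."""
    return X * log(X) + (2 * EULER_GAMMA - 1) * X


if __name__ == "__main__":
    for X in (10, 100, 1000, 10000):
        exact = divisor_sum_hyperbola(X)
        print(X, exact, round(exact - main_term(X), 3))
```

The printed differences give a feel for the size of the error term, although a finite table is of course no substitute for the proof.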
We next turn our attention to the study of the behaviour of σ(n) as n → ∞. Every number n ∈ N
has divisors 1 and n, so we must have σ(1) = 1 and σ(n) > n if n > 1. On the other hand, it follows
from Theorem 2F that for any fixed real number ε > 0, we have σ(n) ≤ n d(n) ≪_ε n^{1+ε} as n → ∞.
Proof. As n → ∞, we have
\[ \sigma(n) = \sum_{m \mid n} \frac{n}{m} \le n \sum_{m \le n} \frac{1}{m} \ll n \log n. \]
As in the case of d(n), the magnitude of σ(n) fluctuates a great deal as n → ∞. As before, we shall
average the function σ(n) over a range of values of n, and consider some average value of the function.
Corresponding to Theorem 2G, we have the following result.
\[ \sum_{n \le X} \sigma(n) = \frac{\pi^2}{12} X^2 + O(X \log X). \]
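Proof. As X → ∞, we have
\[
\begin{aligned}
\sum_{n \le X} \sigma(n)
&= \sum_{n \le X} \sum_{m \mid n} \frac{n}{m}
 = \sum_{m \le X} \sum_{\substack{n \le X \\ m \mid n}} \frac{n}{m}
 = \sum_{m \le X} \sum_{r \le X/m} r
 = \frac{1}{2} \sum_{m \le X} \left[ \frac{X}{m} \right] \left( 1 + \left[ \frac{X}{m} \right] \right) \\
&= \frac{1}{2} \sum_{m \le X} \left( \frac{X}{m} + O(1) \right)^2
 = \frac{X^2}{2} \sum_{m \le X} \frac{1}{m^2} + O\Big( X \sum_{m \le X} \frac{1}{m} \Big) + O\Big( \sum_{m \le X} 1 \Big) \\
&= \frac{X^2}{2} \sum_{m=1}^{\infty} \frac{1}{m^2} + O\Big( X^2 \sum_{m > X} \frac{1}{m^2} \Big) + O(X \log X)
 = \frac{\pi^2}{12} X^2 + O(X \log X).
\end{aligned}
\]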
Remarks. (i) A natural number which is not divisible by the square of any prime is called a squarefree
number. Note that 1 is both a square and a squarefree number. Furthermore, a number n ∈ N is
squarefree if and only if µ(n) = ±1.
(ii) The motivation for the definition of the Möbius function lies rather deep. To understand the
definition, one needs to study the Riemann zeta function, an important function in the study of the
distribution of primes. At this point, it suffices to remark that the Möbius function is defined so that if
we formally multiply the two series
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}, \]
where s ∈ C denotes a complex variable, then the product is identically equal to 1. Heuristically, note
that
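\[ \Big( \sum_{k=1}^{\infty} \frac{1}{k^s} \Big) \Big( \sum_{m=1}^{\infty} \frac{\mu(m)}{m^s} \Big) = \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \frac{\mu(m)}{k^s m^s} = \sum_{n=1}^{\infty} \Big( \sum_{\substack{k, m \\ km = n}} \mu(m) \Big) \frac{1}{n^s} = \sum_{n=1}^{\infty} \Big( \sum_{m \mid n} \mu(m) \Big) \frac{1}{n^s}, \]
so that the product is identically equal to 1 if the inner sum over m | n is equal to 1 when n = 1 and to 0 when n > 1.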
We shall establish this last fact and study some of its consequences over the next four theorems.
Proof. Suppose that a, b ∈ N and (a, b) = 1. If a or b is not squarefree, then neither is ab, and so
µ(ab) = 0 = µ(a)µ(b). On the other hand, if both a and b are squarefree, then since (a, b) = 1, ab must
also be squarefree. Furthermore, the number of prime factors of ab must be the sum of the numbers of
prime factors of a and of b.
Proof. Consider the function f : N → C defined by writing
\[ f(n) = \sum_{m \mid n} \mu(m) \]
for every n ∈ N. It follows from Theorems 2A and 2L that f is multiplicative. For n = 1, the result is
trivial. To complete the proof, it therefore suffices to show that f(p^k) = 0 for every prime p and every
k ∈ N. Indeed,
\[ f(p^k) = \sum_{m \mid p^k} \mu(m) = \mu(1) + \mu(p) + \mu(p^2) + \ldots + \mu(p^k) = 1 - 1 + 0 + \ldots + 0 = 0. \]
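The identity just proved is easy to test numerically. The sketch below assumes the standard definition of the Möbius function (µ(1) = 1, µ(n) = (−1)^r if n is a product of r distinct primes, and µ(n) = 0 otherwise); the function names are our own.

```python
def mobius(n: int) -> int:
    """Moebius function under the standard definition (an assumption of this sketch)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p**2 divides the original n, so n is not squarefree
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if n > 1:                   # a remaining prime factor larger than sqrt(n)
        result = -result
    return result


def mobius_divisor_sum(n: int) -> int:
    """Sum of mobius(m) over all positive divisors m of n."""
    return sum(mobius(m) for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    print([mobius(n) for n in range(1, 13)])
    print([mobius_divisor_sum(n) for n in range(1, 13)])   # 1 followed by zeros
```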
Theorem 2M plays the central role in the proof of the following two results which are similar in
nature.
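THEOREM 2N. (MÖBIUS INVERSION FORMULA) For any function f : N → C, if the function
g : N → C is defined by writing
\[ g(n) = \sum_{m \mid n} f(m) \]
for every n ∈ N, then for every n ∈ N, we have
\[ f(n) = \sum_{m \mid n} \mu(m)\, g\!\left( \frac{n}{m} \right) = \sum_{m \mid n} \mu\!\left( \frac{n}{m} \right) g(m). \]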
Remark. In number theory, it occurs quite often that in the proof of a theorem, a change of order of
summation of the variables is required, as illustrated in the proofs of Theorems 2N and 2P. This process
of changing the order of summation does not depend on the summand in question. In both instances,
we are concerned with a sum of the form
\[ \sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m). \]
This means that for every positive divisor m of n, we first sum the function A over all positive divisors
k of n/m to obtain the sum
\[ \sum_{k \mid \frac{n}{m}} A(k, m), \]
which is a function of m. We then sum this sum over all divisors m of n. Now observe that for every
natural number k satisfying k | n/m for some positive divisor m of n, we must have k | n. Consider
therefore a particular natural number k satisfying k | n. We must find all natural numbers m satisfying
the original summation conditions, namely m | n and k | n/m. These are precisely those natural numbers
m satisfying m | n/k. We therefore obtain, for every positive divisor k of n, the sum
\[ \sum_{m \mid \frac{n}{k}} A(k, m). \]
Since we are summing the function A over the same collection of pairs (k, m), and have merely changed
the order of summation, we must have
\[ \sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m) = \sum_{k \mid n} \sum_{m \mid \frac{n}{k}} A(k, m). \]
We define the Euler function φ : N → C as follows. For every n ∈ N, we let φ(n) denote the number of
elements in the set {1, 2, . . . , n} which are coprime to n.
Proof. We shall partition the set {1, 2, …, n} into d(n) disjoint subsets B_m, where for every positive
divisor m of n,
B_m = {x : 1 ≤ x ≤ n and (x, n) = m}.
If x ∈ B_m, let x = mx′. Then (mx′, n) = m if and only if (x′, n/m) = 1. Also 1 ≤ x ≤ n if and only if
1 ≤ x′ ≤ n/m. Hence
B′_m = {x′ : 1 ≤ x′ ≤ n/m and (x′, n/m) = 1}
has the same number of elements as B_m. Note now that the number of elements of B′_m is exactly φ(n/m).
Since every element of the set {1, 2, …, n} falls into exactly one of the subsets B_m, we must have
\[ n = \sum_{m \mid n} \phi\!\left( \frac{n}{m} \right) = \sum_{m \mid n} \phi(m). \]
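The partition used in this proof can be built explicitly for small n. In the Python sketch below, `phi` counts coprime residues directly from the definition and `block(n, m)` lists the elements x of {1, …, n} with (x, n) = m; both names are our own.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler function: the number of x in {1, ..., n} coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def block(n: int, m: int) -> list[int]:
    """The elements x of {1, ..., n} with (x, n) = m, for a positive divisor m of n."""
    return [x for x in range(1, n + 1) if gcd(x, n) == m]


if __name__ == "__main__":
    n = 12
    for m in (k for k in range(1, n + 1) if n % k == 0):
        # Each block has phi(n/m) elements, and the blocks together cover {1, ..., n}.
        print(m, block(n, m), phi(n // m))
```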
Applying the Möbius inversion formula to the conclusion of Theorem 2Q, we obtain immediately
the following result.
Proof. Since the Möbius function µ is multiplicative, it follows that the function f : N → C, defined
by f (n) = µ(n)/n for every n ∈ N, is multiplicative. The result now follows from Theorem 2A.
THEOREM 2T. Suppose that n ∈ N and n > 1, with canonical decomposition n = p_1^{u_1} … p_r^{u_r}. Then
\[ \phi(n) = n \prod_{j=1}^{r} \left( 1 - \frac{1}{p_j} \right) = \prod_{j=1}^{r} p_j^{u_j - 1} (p_j - 1). \]
Proof. The second equality is trivial. On the other hand, for every prime p and every u ∈ N, we have
by Theorem 2R that
We now study the magnitude of φ(n) as n → ∞. Clearly φ(1) = 1 and φ(n) < n if n > 1.
Suppose first of all that n has many different prime factors. Then n must have many different
divisors, and so σ(n) must be large relative to n. But then many of the numbers 1, . . . , n cannot be
coprime to n, and so φ(n) must be small relative to n. On the other hand, suppose that n has very few
prime factors. Then n must have very few divisors, and so σ(n) must be small relative to n. But then
many of the numbers 1, . . . , n are coprime to n, and so φ(n) must be large relative to n. It therefore
appears that if one of the two values σ(n) and φ(n) is large relative to n, then the other must be small
relative to n. Indeed, our heuristics are upheld by the following result.
\[ \frac{1}{2} < \frac{\sigma(n)\phi(n)}{n^2} \le 1. \]
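Proof. The result is obvious if n = 1, so suppose that n > 1. Let n = p_1^{u_1} … p_r^{u_r} be the canonical
decomposition of n. Recall Theorems 2B and 2T. We have
\[ \sigma(n) = \prod_{j=1}^{r} \frac{p_j^{u_j+1} - 1}{p_j - 1} = n \prod_{j=1}^{r} \frac{1 - p_j^{-u_j-1}}{1 - p_j^{-1}} \]
and
\[ \phi(n) = n \prod_{j=1}^{r} (1 - p_j^{-1}). \]
Hence
\[ \frac{\sigma(n)\phi(n)}{n^2} = \prod_{j=1}^{r} (1 - p_j^{-u_j-1}). \]
The upper bound is immediate, since every factor in the product is less than 1. For the lower bound, note that
\[ \prod_{j=1}^{r} (1 - p_j^{-u_j-1}) \ge \prod_{p \mid n} (1 - p^{-2}) \ge \prod_{m=2}^{n} \left( 1 - \frac{1}{m^2} \right) = \frac{n+1}{2n} > \frac{1}{2} \]
as required.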
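The inequality can be checked with exact rational arithmetic for small n. The following Python sketch computes σ(n) and φ(n) directly from their definitions; the function names are our own, and the check is of course only an illustration, not a substitute for the proof.

```python
from fractions import Fraction
from math import gcd


def sigma(n: int) -> int:
    """Sum of all positive divisors of n."""
    return sum(m for m in range(1, n + 1) if n % m == 0)


def phi(n: int) -> int:
    """Number of x in {1, ..., n} coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def ratio(n: int) -> Fraction:
    """The exact value of sigma(n) * phi(n) / n**2."""
    return Fraction(sigma(n) * phi(n), n * n)


if __name__ == "__main__":
    for n in (1, 2, 6, 30, 210):
        print(n, ratio(n), float(ratio(n)))
```

Numbers with many small prime factors, such as 30 and 210, give ratios noticeably below 1, in line with the heuristic discussion preceding the result.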
\[ \sum_{n \le X} \phi(n) = \frac{3}{\pi^2} X^2 + O(X \log X). \]
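\[ \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} = \frac{6}{\pi^2}. \]
But
\[ \Big( \sum_{n=1}^{\infty} \frac{1}{n^2} \Big) \Big( \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} \Big) = \sum_{k=1}^{\infty} \Big( \sum_{\substack{n, m \\ nm = k}} \mu(m) \Big) \frac{1}{k^2} = \sum_{k=1}^{\infty} \Big( \sum_{m \mid k} \mu(m) \Big) \frac{1}{k^2} = 1, \]
in view of Theorem 2M, and the first series on the left-hand side has sum π^2/6.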
We shall denote the class of all arithmetic functions by A, and the class of all multiplicative functions
by M.
It is not difficult to show that Dirichlet convolution of arithmetic functions is commutative and
associative. In other words, for every f, g, h ∈ A, we have f ∗ g = g ∗ f and (f ∗ g) ∗ h = f ∗ (g ∗ h).
Furthermore, the arithmetic function I : N → C, defined by I(1) = 1 and I(n) = 0 for every n ∈ N
satisfying n > 1, is an identity element for Dirichlet convolution. It is easy to check that I ∗ f = f ∗ I = f
for every f ∈ A.
On the other hand, an inverse may not exist under Dirichlet convolution. Consider, for example,
the function f ∈ A satisfying f (n) = 0 for every n ∈ N.
THEOREM 2X. For any f ∈ A, the following two statements are equivalent:
(i) We have f(1) ≠ 0.
(ii) There exists a unique g ∈ A such that f ∗ g = g ∗ f = I.
Proof. Suppose that (ii) holds. Then f(1)g(1) = 1, so that f(1) ≠ 0. Conversely, suppose that
f(1) ≠ 0. We shall define g ∈ A iteratively by writing
(5) \[ g(1) = \frac{1}{f(1)} \]
and
(6) \[ g(n) = -\frac{1}{f(1)} \sum_{\substack{d \mid n \\ d > 1}} f(d)\, g\!\left( \frac{n}{d} \right) \]
for every n ∈ N satisfying n > 1. It is easy to check that this gives an inverse. Moreover, every inverse
must satisfy (5) and (6), and so the inverse must be unique.
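The iterative definition (5) and (6) is effectively an algorithm. The sketch below computes the values g(1), …, g(N) of the inverse of a given arithmetic function f with f(1) ≠ 0, using exact rational arithmetic. It assumes the usual definition (f ∗ g)(n) = Σ_{d|n} f(d)g(n/d) of Dirichlet convolution; the function name `dirichlet_inverse` and the dictionary representation are our own choices.

```python
from fractions import Fraction


def dirichlet_inverse(f, N: int) -> dict[int, Fraction]:
    """Return {n: g(n)} for 1 <= n <= N, where g is defined by (5) and (6)."""
    if f(1) == 0:
        raise ValueError("no inverse exists when f(1) = 0")
    f1 = Fraction(f(1))
    g = {1: 1 / f1}                                   # equation (5)
    for n in range(2, N + 1):
        total = sum(f(d) * g[n // d]                  # sum over divisors d > 1 of n
                    for d in range(2, n + 1) if n % d == 0)
        g[n] = -total / f1                            # equation (6)
    return g


if __name__ == "__main__":
    # The inverse of the constant function 1 should be the Moebius function.
    g = dirichlet_inverse(lambda n: 1, 12)
    print([int(g[n]) for n in range(1, 13)])
```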
We now describe Theorem 2M and Möbius inversion in terms of Dirichlet convolution. Recall that
the function U ∈ A is defined by U (n) = 1 for all n ∈ N.
THEOREM 2Y.
(i) We have µ ∗ U = I.
(ii) If f ∈ A and g = f ∗ U , then f = g ∗ µ.
(iii) If g ∈ A and f = g ∗ µ, then g = f ∗ U .
Proof. (i) follows from Theorem 2M. To prove (ii), note that
g ∗ µ = (f ∗ U ) ∗ µ = f ∗ (U ∗ µ) = f ∗ I = f.
f ∗ U = (g ∗ µ) ∗ U = g ∗ (µ ∗ U ) = g ∗ I = g.
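The three parts of Theorem 2Y can be illustrated numerically. The sketch below assumes the usual definition (f ∗ g)(n) = Σ_{d|n} f(d)g(n/d) of Dirichlet convolution and the standard definition of µ; all function names are our own. As a test case for part (ii) it uses f = φ, for which f ∗ U is the function n ↦ n by the identity Σ_{m|n} φ(m) = n.

```python
from math import gcd


def convolve(f, g):
    """Dirichlet convolution: n -> sum of f(d) * g(n // d) over divisors d of n."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def mobius(n: int) -> int:
    """Moebius function under the standard definition (an assumption of this sketch)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def U(n: int) -> int:
    """The constant function U(n) = 1."""
    return 1


def I(n: int) -> int:
    """The identity element for Dirichlet convolution."""
    return 1 if n == 1 else 0


def phi(n: int) -> int:
    """Euler function, computed from the definition."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


if __name__ == "__main__":
    mu_star_U = convolve(mobius, U)
    print([mu_star_U(n) for n in range(1, 11)])       # values of I
    g = convolve(phi, U)                              # g(n) = n
    recovered = convolve(g, mobius)                   # should recover phi
    print([recovered(n) for n in range(1, 11)])
```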
Remark. Note that if f ∈ M is not identically zero, then f(n) ≠ 0 for some n ∈ N. Clearly we have
f(n) = f(1)f(n), and so f(1) = 1.
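Proof of Theorem 2Z. For A′, this is now trivial. We now consider M′. Clearly I ∈ M′. If
f, g ∈ M′ and (m, n) = 1, then it follows from Theorem 1D that
\[
\begin{aligned}
(f * g)(mn) &= \sum_{d \mid mn} f(d)\, g\!\left( \frac{mn}{d} \right) = \sum_{d_1 \mid m} \sum_{d_2 \mid n} f(d_1 d_2)\, g\!\left( \frac{mn}{d_1 d_2} \right) \\
&= \Big( \sum_{d_1 \mid m} f(d_1)\, g\!\left( \frac{m}{d_1} \right) \Big) \Big( \sum_{d_2 \mid n} f(d_2)\, g\!\left( \frac{n}{d_2} \right) \Big) = (f * g)(m)\,(f * g)(n),
\end{aligned}
\]
so that f ∗ g is multiplicative.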
g(p^k) = h(p^k)
for every n > 1. Then g ∈ M′. Furthermore, for every integer n > 1, we have
\[ (f * g)(n) = \prod_{p^k \,\|\, n} (f * g)(p^k) = \prod_{p^k \,\|\, n} (f * h)(p^k) = \prod_{p^k \,\|\, n} I(p^k) = I(n), \]
so that g is an inverse of f .
5. Suppose that n ∈ N. Show that the number N of solutions of the equation x^2 − y^2 = n in natural
numbers x and y satisfies
\[ 2N = \begin{cases} d(n) - e_n & \text{if } n \text{ is an odd number}, \\ 0 & \text{if } n \text{ is twice an odd number}, \\ d(n/4) - e_n & \text{if } 4 \mid n, \end{cases} \]
8. Prove that every odd perfect number must have at least two distinct prime factors, exactly one of
which has odd exponent.
9. Suppose that a ∈ N satisfies a > 1. Let d run over all the divisors of a that have no more than m
prime divisors. Prove that
\[ \sum_{d} \mu(d) \;\begin{cases} \ge 0 & \text{if } m \text{ is even}, \\ \le 0 & \text{if } m \text{ is odd}. \end{cases} \]
10. Suppose that k ∈ N is even, and the canonical decomposition of a ∈ N is of the form a = p_1 p_2 … p_k,
where p_1, p_2, …, p_k are distinct primes. Let d run over all the divisors of a such that 0 < d < √a.
Prove that \sum_{d} \mu(d) = 0.
11. Prove that \[ \sum_{d^2 \mid n} \mu(d) = \mu^2(n) \] for every n ∈ N.
[Hint: Distinguish between the cases when n is squarefree and when n is not squarefree.]
12. By first showing that the function f(n) = (−1)^{n−1} is multiplicative, evaluate the sum
\[ h(n) = \sum_{m \mid n} (-1)^{m-1} \mu\!\left( \frac{n}{m} \right) \] for every n ∈ N.
13. Explain why \[ \sum_{m \mid n} \mu(m)\, \sigma\!\left( \frac{n}{m} \right) = n \] for every n ∈ N.
14. Prove that \[ \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m = \frac{n \phi(n)}{2} \] for every n ∈ N satisfying n > 1.
15. Suppose that n ∈ N satisfies φ(n) | n. Prove that n = 2^a 3^b for some non-negative integers a and b.
16. Suppose that p1 , p2 , . . . , pk ∈ N are distinct primes, and that there are no other primes.
(i) Let a = p1 p2 . . . pk . Explain why we must have φ(a) = 1.
(ii) Obtain a contradiction.
[Remark: This is yet another proof that there are infinitely many primes.]
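18. Suppose that n = p_1^{u_1} … p_r^{u_r}, where p_1 < … < p_r are primes and u_1, …, u_r ∈ N.
(i) Write
\[ s(n) = \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m^2. \]
Prove that
\[ n^2 \sum_{d \mid n} \frac{s(d)}{d^2} = \frac{n(n+1)(2n+1)}{6}. \]
Deduce that
\[ \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m^2 = \frac{1}{3} \phi(n) n^2 + \frac{1}{6} (-1)^r \phi(n)\, p_1 \cdots p_r. \]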
19. For every n ∈ N, let Q(n) denote the number of squarefree numbers not exceeding n.
(i) Prove that \[ n - Q(n) \le \left[ \frac{n}{4} \right] + \sum_{m=1}^{\infty} \left[ \frac{n}{(2m+1)^2} \right], \] and deduce that Q(n) > n/2.
(ii) Hence show that every natural number is a sum of two squarefree numbers.
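21. Suppose that F : R+ → C, where R+ denotes the set of all positive real numbers. For any real
number X ≥ 1, let
\[ G(X) = \sum_{n \le X} F\!\left( \frac{X}{n} \right). \]
Prove that
\[ F(X) = \sum_{n \le X} \mu(n)\, G\!\left( \frac{X}{n} \right) \] for every real number X ≥ 1.
22. Suppose that G : R+ → C. For any real number X ≥ 1, let
\[ F(X) = \sum_{n \le X} \mu(n)\, G\!\left( \frac{X}{n} \right). \]
Prove that
\[ G(X) = \sum_{n \le X} F\!\left( \frac{X}{n} \right) \] for every real number X ≥ 1.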
23. Prove that each of the following identities is valid for every real number X ≥ 1:
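(i) \[ \sum_{n \le X} \mu(n) \left[ \frac{X}{n} \right] = 1. \]
(ii) \[ \sum_{n \le X} \phi(n) = \frac{1}{2} \sum_{n \le X} \mu(n) \left[ \frac{X}{n} \right]^2 + \frac{1}{2}. \]
(iii) \[ \sum_{n \le X} \frac{\phi(n)}{n} = \sum_{n \le X} \frac{\mu(n)}{n} \left[ \frac{X}{n} \right]. \]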
24. Suppose that the function F : R+ → C satisfies F (X) = 0 whenever 0 < X < 1. For any arithmetic
function α, we define the function α ◦ F : R+ → C by writing
\[ (\alpha \circ F)(X) = \sum_{n \le X} \alpha(n)\, F\!\left( \frac{X}{n} \right) \] for every X ∈ R+.
(ii) Suppose that the arithmetic function α has inverse α^{−1} under Dirichlet convolution. Prove
that if
\[ G(X) = \sum_{n \le X} \alpha(n)\, F\!\left( \frac{X}{n} \right) \] for every real number X ∈ R+,
then
\[ F(X) = \sum_{n \le X} \alpha^{-1}(n)\, G\!\left( \frac{X}{n} \right) \] for every real number X ∈ R+.
[Hint: Note that the identity function I under Dirichlet convolution satisfies I ◦ F = F .]
[Remark: If α is completely multiplicative, then α^{−1}(n) = µ(n)α(n) for every n ∈ N by
Problem 20(iii). Hence
\[ G(X) = \sum_{n \le X} \alpha(n)\, F\!\left( \frac{X}{n} \right) \quad\text{if and only if}\quad F(X) = \sum_{n \le X} \mu(n) \alpha(n)\, G\!\left( \frac{X}{n} \right). \]]
25. For every n ∈ N, let \[ f(n) = \sum_{m \mid n} \frac{\mu^2(m)}{\phi(m)}. \]
26. Consider a square lattice consisting of all points (a, b), where a, b ∈ Z. Two lattice points P and Q
are said to be mutually visible if the line segment which joins them contains no lattice points other
than the endpoints P and Q.
(i) Prove that (a, b) and (0, 0) are mutually visible if and only if a and b are relatively prime.
(ii) We shall prove that the set of lattice points visible from the origin has density 6/π^2. Consider
a large square region on the xy-plane defined by the inequalities |x| ≤ r and |y| ≤ r. Let N(r)
denote the number of lattice points in this square, and let N′(r) denote the number of these
which are visible from the origin. The eight lattice points nearest the origin are all visible from
the origin. By symmetry, N′(r) is equal to 8 plus 8 times the number of visible points in the
region {(x, y) : 2 ≤ x ≤ r and 1 ≤ y ≤ x}. Prove that
\[ N'(r) = 8 \sum_{n=1}^{r} \phi(n), \]
and deduce that
\[ \frac{N'(r)}{N(r)} \to \frac{6}{\pi^2} \quad\text{as } r \to \infty. \]
Chapter 3
CONGRUENCES
3.1. Introduction
Suppose that m ∈ N and c ∈ Z. Then by Theorem 1A, there exist unique q, r ∈ Z such that
c = mq + r and 0 ≤ r < m. The number r is called the residue of c modulo m, and c is said to belong
to the residue class r modulo m.
We make no notational distinction between numbers r ∈ Z and the residue classes r. We shall use
the convention that whenever r denotes a residue class, this will be explicitly stated in the text.
THEOREM 3A. Suppose that m ∈ N and a, b ∈ Z. Then a ≡ b (mod m) if and only if a and b
belong to the same residue class modulo m.
Proof. Suppose that a ≡ b (mod m). If a belongs to the residue class r modulo m, where r ∈ Z and
0 ≤ r < m, then there exists q1 ∈ Z such that a = mq1 + r. Since a ≡ b (mod m), there exists q ∈ Z
such that b = a + mq. It follows that b = m(q1 + q) + r, and so b also belongs to the residue class r
modulo m.
Conversely, suppose that a and b belong to the same residue class r modulo m, where 0 ≤ r < m.
Then there exist q1 , q2 ∈ Z such that a = mq1 + r and b = mq2 + r. It follows that a − b = m(q1 − q2 ),
and so a ≡ b (mod m).
Suppose that m ∈ N. Consider the set M = {0, 1, 2, . . . , m − 1}. A set S of m integers is said to be a
complete set of residues modulo m if for every integer a ∈ M , there exists a unique element x ∈ S such
that x ≡ a (mod m). It is easy to see that S is a complete set of residues modulo m if and only if S
contains exactly m elements and x ≢ y (mod m) for any distinct x, y ∈ S.
On the other hand, the subset M ∗ = {a ∈ M : (a, m) = 1} has φ(m) elements. A set T of φ(m)
integers is said to be a reduced set of residues modulo m if for every integer a ∈ M ∗ , there exists a
unique element x ∈ T such that x ≡ a (mod m). It is easy to see that T is a reduced set of residues
modulo m if and only if T contains exactly φ(m) elements, all coprime to m, and x ≢ y (mod m) for
any distinct x, y ∈ T .
Examples. (i) The set {2, 4, 6} is a complete set of residues modulo 3. The subset {2, 4} is a reduced
set of residues modulo 3.
(ii) Suppose that p is prime. The set {1, 2, . . . , p} is a complete set of residues modulo p. The subset
{1, 2, . . . , p − 1} is a reduced set of residues modulo p.
Proof. (i) Suppose that S is a complete set of residues modulo m. If x, y ∈ S and x ≢ y (mod m),
then it follows from Theorem 3C(ii) that kx ≢ ky (mod m). Hence the set {kx : x ∈ S} is a set of m
integers that are pairwise incongruent modulo m, and so is a complete set of residues modulo m.
(ii) Suppose that T is a reduced set of residues modulo m. A similar argument shows that the set
{kx : x ∈ T } is a set of φ(m) integers that are pairwise incongruent modulo m. On the other hand, we
know that (kx, m) = 1 whenever (x, m) = 1. It follows that the elements in the set {kx : x ∈ T } are
coprime to m, and so the set is a reduced set of residues modulo m.
Proof. (i) If bx1 + ay1 ≡ bx2 + ay2 (mod ab), then bx1 ≡ bx2 (mod a). It follows from Theorem 3C(ii)
that x1 ≡ x2 (mod a). Similarly y1 ≡ y2 (mod b).
(ii) Since (a, b) = 1, we have φ(ab) = φ(a)φ(b). Suppose that (x, a) = 1 and (y, b) = 1. Then it is
easy to check that
Similarly,
THEOREM 3F. (FERMAT-EULER) Suppose that m ∈ N and a ∈ Z \ {0}, where (a, m) = 1. Then
$a^{\phi(m)} \equiv 1 \pmod{m}$.
Proof. Suppose that $r_1, \ldots, r_{\phi(m)}$ form a reduced set of residues modulo m. Then it follows from
Theorem 3D that $ar_1, \ldots, ar_{\phi(m)}$ also form a reduced set of residues modulo m. Thus
$$r_1 \cdots r_{\phi(m)} \equiv (ar_1) \cdots (ar_{\phi(m)}) = a^{\phi(m)} r_1 \cdots r_{\phi(m)} \pmod{m}.$$
Clearly $(r_1 \cdots r_{\phi(m)}, m) = 1$. Hence $a^{\phi(m)} \equiv 1 \pmod{m}$, in view of Theorem 3C(ii).
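The argument above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `phi`, `reduced_residues` and `check_fermat_euler` are illustrative assumptions, and Euler's function is computed by brute force, which is only sensible for small m.

```python
from math import gcd


def reduced_residues(m: int) -> list[int]:
    """The reduced set of residues {a : 1 <= a <= m, (a, m) = 1}."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def phi(m: int) -> int:
    """Euler's function, computed by counting (brute force)."""
    return len(reduced_residues(m))


def check_fermat_euler(m: int) -> bool:
    """True if a^phi(m) ≡ 1 (mod m) for every a in a reduced set of residues."""
    e = phi(m)
    # 1 % m equals 0 when m = 1, which matches pow(a, e, 1) == 0.
    return all(pow(a, e, m) == 1 % m for a in reduced_residues(m))


if __name__ == "__main__":
    # Multiplication by a = 7 permutes the reduced residues modulo 12,
    # which is the key step in the proof.
    print(sorted(7 * r % 12 for r in reduced_residues(12)))
    print(all(check_fermat_euler(m) for m in range(1, 200)))
```

The first printed list illustrates the step taken from Theorem 3D: multiplying a reduced set of residues by a number coprime to m merely rearranges it modulo m.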
THEOREM 3G. (FERMAT’S LITTLE THEOREM) Suppose that p is a prime and a ∈ Z, where
p ∤ a. Then $a^{p-1} \equiv 1 \pmod{p}$.
Suppose that f : Z → Z is a given polynomial with integer coefficients, and m ∈ N. By the number of
solutions of the congruence f (x) ≡ 0 (mod m), we mean the number of elements x in a complete set of
residues modulo m for which the congruence holds; in other words, the number of incongruent numbers
x modulo m for which the congruence holds.
(1) ax ≡ b (mod m)
is soluble if and only if (a, m) | b. In this case, the number of solutions is equal to (a, m), and the
congruence is satisfied by precisely all the numbers in a certain residue class modulo m/(a, m).
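Proof. The result is trivial if a = 0, so suppose that a ≠ 0. If (1) is soluble, then there exist $x_0, y_0 \in Z$
such that $ax_0 + my_0 = b$, and so (a, m) | b. Conversely, suppose that (a, m) | b. Since
$$\left( \frac{a}{(a,m)}, \frac{m}{(a,m)} \right) = 1,$$
it follows from Theorem 3D that as x runs through a complete set of residues modulo m/(a, m), the numbers $\frac{a}{(a,m)}x$
form a complete set of residues modulo m/(a, m). Hence one of the numbers $x_0$ in the set
$$\left\{ 0, 1, \ldots, \frac{m}{(a,m)} - 1 \right\}$$
must satisfy
(2) $$\frac{a}{(a,m)} x_0 \equiv \frac{b}{(a,m)} \pmod{\frac{m}{(a,m)}},$$
whence
(3) $$ax_0 \equiv b \pmod{m}.$$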
Furthermore, if x ≡ x0 (mod m/(a, m)), then (2) and hence also (3) hold with x0 replaced by x.
To show that the residue class x0 modulo m/(a, m) gives all the solutions, let x be any solution of (1).
Then a(x − x0 ) ≡ 0 (mod m). It follows from Theorem 3C(i) that x − x0 ≡ 0 (mod m/(a, m)).
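The proof is constructive, and the construction can be sketched in code. The function name `solve_linear_congruence` is an assumption made for illustration; the sketch relies on Python's built-in `pow(x, -1, n)` (Python 3.8 or later) for the inverse modulo m/(a, m), rather than on the search through a complete set of residues used in the proof.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> list[int]:
    """All x in {0, ..., m-1} with a*x ≡ b (mod m); empty list if insoluble."""
    d = gcd(a, m)
    if b % d != 0:
        return []                      # soluble only if (a, m) | b
    a1, b1, m1 = a // d, b // d, m // d
    # (a1, m1) = 1, so a1 is invertible modulo m1; the case m1 = 1 is trivial.
    x0 = 0 if m1 == 1 else (b1 * pow(a1, -1, m1)) % m1
    # The solutions form one residue class modulo m/(a, m),
    # which splits into (a, m) residue classes modulo m.
    return [x0 + t * m1 for t in range(d)]


if __name__ == "__main__":
    print(solve_linear_congruence(6, 4, 10))   # two solutions, since (6, 10) = 2
    print(solve_linear_congruence(6, 3, 10))   # insoluble, since 2 does not divide 3
```

The length of the returned list is either 0 or (a, m), in agreement with the statement of the theorem.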
THEOREM 3J. (CHINESE REMAINDER THEOREM) Suppose that n > 1, and that the natural
numbers $m_1, \ldots, m_n$ are pairwise coprime; in other words, $(m_i, m_j) = 1$ whenever 1 ≤ i < j ≤ n.
Then for any $a_1, \ldots, a_n \in Z$, the simultaneous congruences
$$\begin{aligned} x &\equiv a_1 \pmod{m_1} \\ &\;\;\vdots \\ x &\equiv a_n \pmod{m_n} \end{aligned}$$
have a unique solution modulo $m_1 \cdots m_n$.
Proof. For every j = 1, . . . , n, write $q_j = m_1 \cdots m_{j-1} m_{j+1} \cdots m_n$. Then $(q_j, m_j) = 1$. It follows from
Theorem 3H that there exists $k_j \in Z$ such that $q_j k_j \equiv a_j \pmod{m_j}$. Now let
$$x_0 = \sum_{j=1}^{n} q_j k_j.$$
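The construction of $x_0$ translates directly into a short program. This is a sketch under stated assumptions: the function name `crt` is illustrative, the moduli are assumed to be pairwise coprime and greater than 1, and each $k_j$ is found with the built-in modular inverse `pow(q_j, -1, m_j)` (Python 3.8 or later).

```python
from math import prod


def crt(residues: list[int], moduli: list[int]) -> int:
    """Return x0 with x0 ≡ a_j (mod m_j) for all j, reduced modulo m_1 ... m_n.

    The moduli are assumed pairwise coprime and greater than 1.
    """
    m = prod(moduli)
    x0 = 0
    for a_j, m_j in zip(residues, moduli):
        q_j = m // m_j                            # product of the other moduli
        k_j = (a_j * pow(q_j, -1, m_j)) % m_j     # solves q_j * k_j ≡ a_j (mod m_j)
        x0 += q_j * k_j
    return x0 % m


if __name__ == "__main__":
    # x ≡ 1 (mod 2), x ≡ 2 (mod 3), x ≡ 3 (mod 5)
    print(crt([1, 2, 3], [2, 3, 5]))
```

Each term $q_j k_j$ is divisible by every modulus other than $m_j$, so it contributes only to the j-th congruence.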
THEOREM 3K. Suppose that p is prime. Then for any polynomial f : Z → Z with integer coefficients,
there exists a polynomial g : Z → Z with integer coefficients and of degree less than p such that
f (x) ≡ g(x) (mod p) for every x ∈ Z.
Proof. In view of Theorem 3B, it suffices to prove Theorem 3K for the polynomial $f(x) = x^n$, where
n is a fixed positive integer. It is not difficult to show that there exist q, r ∈ Z such that n = (p − 1)q + r
and 1 ≤ r ≤ p − 1. If p ∤ x, then it follows from Theorem 3G that
Having reduced the degree of the polynomial, we now show that in many cases, we cannot have too
many solutions.
Proof. The case n = 0 is trivial. The case n = 1 follows from Theorem 3H. Let n > 1 and assume
that the result is true for all polynomials of degree n − 1. Suppose on the contrary that (4) has at least
n + 1 incongruent solutions x0 , x1 , . . . , xn . Then
$$f(x) - f(x_0) = \sum_{k=1}^{n} a_k (x^k - x_0^k) = (x - x_0) \sum_{k=1}^{n} a_k (x^{k-1} + x^{k-2} x_0 + \ldots + x_0^{k-1}) = (x - x_0) g(x),$$
where $g(x) = a_n x^{n-1} + \ldots$. It follows that $(x_j - x_0) g(x_j) \equiv 0 \pmod{p}$ for every j = 1, . . . , n, and so
On the other hand, if a polynomial has many solutions, then we can say quite a lot about its
coefficients.
THEOREM 3M. Suppose that $f(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0$ is a polynomial with integer
coefficients. Suppose further that p is prime, and the congruence f (x) ≡ 0 (mod p) has more than n
solutions. Then $p \mid a_j$ for every j = 0, 1, . . . , n.
Proof. Suppose on the contrary that some coefficient is not divisible by p. Let k be the largest index
such that $p \nmid a_k$. Then k ≤ n. On the other hand, since
We conclude this section by using polynomial congruences to prove an interesting congruence result.
$$f(x) = (x^{p-1} - 1) - \prod_{m=1}^{p-1} (x - m)$$
has degree at most (p−2), but has (p−1) roots modulo p, in view of Theorem 3G. It follows from Theorem
3M that all the coefficients are divisible by p. Note that the coefficient of $x^0$ is $-1 - (-1)^{p-1} (p-1)!$.
Remark. We can also prove Wilson’s theorem in the following way. The theorem is obvious if p ≤ 3,
so we assume that p > 3. Suppose that x ≢ 0 (mod p). Then it follows from Theorem 3H that there
exists a unique x′ modulo p such that xx′ ≡ 1 (mod p). Moreover, if x ≡ x′ (mod p), then x ≡ 1 (mod p)
or x ≡ −1 (mod p). It follows that the numbers 2, 3, . . . , p − 2 can be paired off into (p − 3)/2 mutually
reciprocal pairs modulo p, so that (p − 2)! ≡ 1 (mod p). The result follows easily.
Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. Then there exist numbers n ∈ N such that
(5) $$a^n \equiv 1 \pmod{m}.$$
For example, as shown in Theorem 3F, the number n = φ(m) satisfies the requirement. The smallest
n ∈ N for which the congruence (5) holds is called the exponent to which a belongs modulo m.
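As a small illustration of this definition, the exponent can be found by computing successive powers of a modulo m until the value 1 appears. The function name `exponent` is an assumption chosen to match the terminology of the text; the direct search is a sketch for small moduli, not an efficient method.

```python
from math import gcd


def exponent(a: int, m: int) -> int:
    """Smallest n >= 1 with a^n ≡ 1 (mod m); requires (a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("a and m must be coprime")
    n, power = 1, a % m
    while power != 1 % m:          # 1 % m handles the degenerate modulus m = 1
        power = (power * a) % m
        n += 1
    return n


if __name__ == "__main__":
    for a in range(1, 7):
        print(a, exponent(a, 7))   # every exponent divides phi(7) = 6
```

The loop terminates because, by Theorem 3F, the power a^φ(m) is congruent to 1 modulo m.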
THEOREM 3P. Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. If a belongs to the exponent
n modulo m, then the numbers $1, a, a^2, \ldots, a^{n-1}$ are incongruent modulo m.
Proof. Suppose on the contrary that there exist ℓ, k ∈ Z such that 0 ≤ ℓ < k ≤ n − 1 and $a^{\ell} \equiv a^{k}$
(mod m). Then $a^{k-\ell} \equiv 1 \pmod{m}$. But k − ℓ < n, and this contradicts the minimality of n.
THEOREM 3Q. Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. Suppose further that a
belongs to the exponent n modulo m, and ℓ, k ∈ N ∪ {0}. Then $a^{\ell} \equiv a^{k} \pmod{m}$ if and only if ℓ ≡ k
(mod n). In particular, $a^{\ell} \equiv 1 \pmod{m}$ if and only if n | ℓ.
$$a^{\ell} = (a^n)^u a^r \equiv a^r \pmod{m}$$
and
An immediate consequence of Theorems 3F and 3Q is that the exponent to which a belongs modulo
m is a divisor of φ(m). However, if the exponent to which a belongs modulo m is actually φ(m), then
we say that a is a primitive root modulo m.
A natural question is then to determine those values of m ∈ N for which primitive roots modulo m
exist. Thanks to Gauss, we have a complete answer to this interesting question.
Our first task is to show that there are certain values of m ∈ N for which primitive roots modulo m
exist. We have the following three results.
THEOREM 3R. Suppose that p is prime. Then for every n ∈ N satisfying n | (p − 1), there are
exactly φ(n) incongruent numbers modulo p which belong to the exponent n modulo p. In particular,
there are φ(p − 1) = φ(φ(p)) primitive roots modulo p.
Proof. Suppose that n | (p − 1). Let ψ(n) denote the number of incongruent numbers modulo p which
belong to the exponent n modulo p. We shall show that ψ(n) = φ(n). To see this, let θ(n) denote the
number of solutions of the congruence
(6) $$x^n - 1 \equiv 0 \pmod{p}.$$
By Theorem 3Q, an integer x is a solution of (6) if and only if the exponent k to which x belongs modulo
p satisfies k | n. Hence
$$\theta(n) = \sum_{k \mid n} \psi(k).$$
By Theorem 3G, the congruence
$$x^{p-1} - 1 \equiv 0 \pmod{p}$$
has exactly p − 1 solutions. On the other hand, by Lagrange’s theorem, the congruence (6) has at most
n solutions and the congruence
$$x^{p-1-n} + x^{p-1-2n} + \ldots + x^{n} + 1 \equiv 0 \pmod{p}$$
has at most p − 1 − n solutions. It follows that (6) must have exactly n solutions, and so
$$\sum_{k \mid n} \psi(k) = n.$$
It now follows from the Möbius inversion formula and Theorem 2R that
$$\psi(n) = \sum_{k \mid n} \mu(k) \frac{n}{k} = \phi(n).$$
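The count ψ(n) = φ(n) can be observed directly for small primes. The sketch below is illustrative only: the names `phi`, `exponent` and `exponent_counts` are assumptions, and all quantities are computed by brute force.

```python
from collections import Counter
from math import gcd


def phi(n: int) -> int:
    """Euler's function by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def exponent(a: int, p: int) -> int:
    """Exponent to which a belongs modulo the prime p (p must not divide a)."""
    n, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        n += 1
    return n


def exponent_counts(p: int) -> Counter:
    """Map each n to the number of residues a, 1 <= a <= p-1, of exponent n."""
    return Counter(exponent(a, p) for a in range(1, p))


if __name__ == "__main__":
    counts = exponent_counts(13)
    for n in sorted(counts):
        print(n, counts[n], phi(n))   # the last two columns agree
```

For p = 13 the exponents that occur are exactly the divisors of 12, and the number of primitive roots is φ(12) = 4.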
THEOREM 3S. Suppose that p is an odd prime, and g is a primitive root modulo p. Then there
exists t ∈ Z such that the integer u, defined by the equation
$$(g + pt)^{p-1} = 1 + pu,$$
Proof. Since $g^{p-1} = 1 + pq$ for some q ∈ Z, it follows that there exist r, s ∈ Z such that
where
As x runs through a complete set of residues modulo p, so does y, in view of Theorem 3D. Hence there
exists a value of x, say t, for which p ∤ y, and let u be the corresponding value of y. It follows from (7)
that for this value of t, we have
$$(g + pt)^{(p-1)p} = (1 + pu)^p = 1 + p^2 u + p^3 u' = 1 + p^2 u_2,$$
where $p \nmid u_2$. Similarly,
$$(g + pt)^{(p-1)p^2} = 1 + p^3 u_3,$$
where $p \nmid u_3$, and so on. Suppose that (g + pt) belongs to the exponent n modulo $p^r$, so that $(g + pt)^n \equiv 1 \pmod{p^r}$.
Then clearly $(g + pt)^n \equiv 1 \pmod{p}$, and so $g^n \equiv 1 \pmod{p}$. Since g is a primitive root
modulo p, we must have (p − 1) | n. On the other hand, $n \mid \phi(p^r) = p^{r-1}(p-1)$. Hence $n = p^{s-1}(p-1)$
for some integer s satisfying 1 ≤ s ≤ r. Recall now that
$$(g + pt)^n = (g + pt)^{(p-1)p^{s-1}} = 1 + p^s u_s,$$
$$1 + p^s u_s \equiv 1 \pmod{p^r},$$
THEOREM 3T. Suppose that p is an odd prime, and g is an odd primitive root modulo $p^r$, where
r ∈ N. Then g is a primitive root modulo $2p^r$.
Remark. Note that since there exist primitive roots modulo $p^r$, there must exist odd primitive roots
modulo $p^r$. To see this, note that if h is an even primitive root modulo $p^r$, then $g = h + p^r$ is an odd
primitive root modulo $p^r$.
Proof of Theorem 3T. Note first of all that every odd integer x which satisfies $x^n \equiv 1 \pmod{p^r}$
clearly satisfies $x^n \equiv 1 \pmod{2p^r}$, and vice versa. It follows that if g is an odd primitive root modulo
$p^r$, then it belongs to the exponent $\phi(p^r)$ modulo $2p^r$. Note, however, that $\phi(p^r) = \phi(2p^r)$.
We are now in a position to determine precisely those values of m ∈ N for which primitive roots
modulo m exist. We prove the following beautiful result.
THEOREM 3U. (GAUSS) Suppose that m ∈ N and m > 1. Then there exist primitive roots modulo
m if and only if $m = 2, 4, p^r, 2p^r$, where p is an odd prime and r ∈ N.
Proof. For m = 4, it is easy to check that 3 is a primitive root. The existence of primitive roots to
the other moduli follows from the previous three theorems.
Suppose now that $m = p_1^{u_1} \cdots p_r^{u_r}$, where the natural numbers $p_1 < \ldots < p_r$ are primes and the
integers $u_i > 0$ for i = 1, . . . , r. For every i = 1, . . . , r, write $m_i = p_i^{u_i}$, so that $m = m_1 \cdots m_r$, and let
$\ell = [\phi(m_1), \ldots, \phi(m_r)]$ be the least common multiple of $\phi(m_1), \ldots, \phi(m_r)$. Suppose now that a ∈ Z\{0}
and (a, m) = 1. For every i = 1, . . . , r, we have by Theorem 3F that $a^{\phi(m_i)} \equiv 1 \pmod{m_i}$, so that $a^{\ell} \equiv 1 \pmod{m_i}$.
It follows that $a^{\ell} \equiv 1 \pmod{m}$. We have to show that if m is not one of the stated values,
then ℓ < φ(m).
If p is a prime, then $\phi(p^u) = p^{u-1}(p-1)$ is even if p > 2 or if p = 2 and u ≥ 2, and so $\phi(p^u)$ is even
whenever $p^u > 2$. It follows that if two of the values $m_1, \ldots, m_r$ exceed 2, then ℓ < φ(m). It remains
to show that there are no primitive roots modulo $2^u$, where u ≥ 3. We shall do this by proving that for
every odd integer a and every integer u ≥ 3, we have
(8) $$a^{\frac{1}{2}\phi(2^u)} \equiv 1 \pmod{2^u}.$$
For u = 3, we note that $a^2 \equiv 1 \pmod{8}$. Suppose now that (8) holds for u = k; in other words, suppose
that
$$a^{\frac{1}{2}\phi(2^k)} = 1 + 2^k t,$$
2. Prove that every year, including a leap year, has a Friday 13th.
[Hint: For those who are superstitious, prove instead that every year, including a leap year, has a
Sunday 1st. The two statements are the same!]
6. Suppose that a, b, p ∈ N and p > 2 is prime. Show that if $a^p + b^p \equiv 0 \pmod{p}$, then we must have
$a^p + b^p \equiv 0 \pmod{p^2}$.
8. Find all x ∈ Z such that simultaneously x ≡ 1 (mod 2), x ≡ 2 (mod 3), x ≡ 3 (mod 5).
9. Suppose that m ∈ N and a, b ∈ Z such that (a, m) = 1. Prove that
$$\sum_{x=1}^{m} \left\{ \frac{ax + b}{m} \right\} = \frac{m-1}{2},$$
where {β} denotes the fractional part of β.
10. Suppose that $m_1, \ldots, m_k$ are integers greater than 1 and which are pairwise coprime, and write
$m = m_1 \cdots m_k$. Suppose further that $x_1, \ldots, x_k, x$ run through complete sets of residues and
$y_1, \ldots, y_k, y$ run through reduced sets of residues modulo $m_1, \ldots, m_k, m$ respectively. Prove that
the fractions
$$\left\{ \frac{x_1}{m_1} + \ldots + \frac{x_k}{m_k} \right\} \quad \text{and} \quad \left\{ \frac{x}{m} \right\}$$
coincide, as do the fractions
$$\left\{ \frac{y_1}{m_1} + \ldots + \frac{y_k}{m_k} \right\} \quad \text{and} \quad \left\{ \frac{y}{m} \right\}.$$
[Hint: Denote the above sum by S(m), and show that for every d ∈ N, we have
$$S(d) = \sum_{\substack{x=1 \\ (x,d)=1}}^{d} \frac{x}{d}.$$
Then consider the sum $\sum_{d \mid m} S(d)$, and use the Möbius inversion formula.]
12. The number $g = 10^{100}$ is called a googol. Show that there exist a googol consecutive integers each
of which is divisible by the square of a prime.
[Hint: Use the Chinese remainder theorem.]
[Remark: Can you prove that there are arbitrarily long gaps between consecutive primes?]
13. Suppose that p is a prime. Suppose further that h and k are non-negative integers such that
h + k = p − 1. Prove that $h!\,k! + (-1)^h \equiv 0 \pmod{p}$.
(i) Prove that $\binom{n}{p} \equiv \left[ \frac{n}{p} \right] \pmod{p}$.
(ii) Suppose that α ∈ N and $p^{\alpha}$ divides $\left[ \frac{n}{p} \right]$. Prove that $p^{\alpha}$ also divides $\binom{n}{p}$.
16. Suppose that n ∈ N, and that there exists a ∈ Z such that $n \mid (a^{n-1} - 1)$. Suppose further that
$n \nmid (a^x - 1)$ whenever 1 ≤ x ≤ n − 2. Show that n is prime.
17. Let
$$S_n(p) = \sum_{k=1}^{p-1} k^n,$$
where p is an odd prime and n is an integer greater than 1. Prove that $S_n(p) \equiv -1 \pmod{p}$ if
(p − 1) | n, and that $S_n(p) \equiv 0 \pmod{p}$ if (p − 1) ∤ n.
[Hint: Let g be a primitive root modulo p. Show that $S_n(p) \equiv \sum_{j=0}^{p-2} g^{jn} \pmod{p}$.]
18. Suppose that p is a prime. Prove that the sum of the primitive roots modulo p is congruent to
µ(p − 1) modulo p.
19. Suppose that p > 3 is a prime. Prove that the product of the primitive roots modulo p is congruent
to 1 modulo p.
21. Suppose that a, n ∈ N and a > 1. Prove that n divides $\phi(a^n - 1)$.
Chapter 4
QUADRATIC RESIDUES
4.1. Introduction
Consider the congruence
(1) $$x^2 \equiv a \pmod{p},$$
where p is an odd prime and a ∈ Z. We are interested in determining whether for given p and a, the
congruence (1) has a solution x ∈ Z.
If a ≡ 0 (mod p), then clearly (1) is soluble, with x ≡ 0 (mod p) being the only solution. We
therefore make the assumption that a ≢ 0 (mod p). If (1) is soluble, then we say that a is a quadratic
residue modulo p. If (1) is not soluble, then we say that a is a quadratic non-residue modulo p.
THEOREM 4A. Suppose that p is an odd prime. Then there are precisely (p−1)/2 quadratic residues
modulo p, and these are represented by the numbers
(2) $$1^2, 2^2, \ldots, \left( \frac{p-1}{2} \right)^2.$$
Proof. Suppose that p ∤ a. Then it follows from Lagrange’s theorem that the congruence (1) has at
most two solutions. On the other hand, if x ≡ b (mod p) is a solution, then it is easy to check that
x ≡ p − b (mod p) represents another solution. It follows that the congruence (1) has either two solutions
or no solutions. Note next that any solution of the congruence (1) must be of the form x ≡ b (mod p),
with 1 ≤ b ≤ p − 1. It follows that there can be at most (p − 1)/2 quadratic residues modulo p. It
remains to show that there are at least (p − 1)/2 quadratic residues modulo p. To do so, note that the
(p − 1)/2 numbers in (2) are clearly quadratic residues modulo p. It therefore suffices to show that they
are incongruent modulo p. Suppose on the contrary that $x^2 \equiv y^2 \pmod{p}$, with 1 ≤ x < y ≤ (p − 1)/2.
Then p | (y − x)(y + x), a contradiction since p is prime and 0 < y − x < y + x < p.
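It is convenient to introduce the Legendre symbol, defined as follows. Suppose that p is an odd
prime. Then we write
$$\left( \frac{a}{p} \right)_L = \begin{cases} 1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic residue modulo } p, \\ -1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic non-residue modulo } p, \\ 0 & \text{if } p \mid a. \end{cases}$$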
In this section, we analyze the Legendre symbol in a systematic way to provide some practical means of
evaluating its value. Our first step is not particularly useful in itself, but provides a path towards results
of a more practical nature.
THEOREM 4B. (EULER’S CRITERION) Suppose that p is an odd prime. For every a ∈ Z, we
have
$$\left( \frac{a}{p} \right)_L \equiv a^{\frac{p-1}{2}} \pmod{p}.$$
Proof. The result clearly holds if p | a, so we assume now that p ∤ a. If a is a quadratic residue
modulo p, then there exists x ∈ Z such that p ∤ x and $x^2 \equiv a \pmod{p}$. It follows from Fermat’s little
theorem that
$$a^{\frac{p-1}{2}} \equiv x^{p-1} \equiv 1 = \left( \frac{a}{p} \right)_L \pmod{p}.$$
Consider now the congruence $a^{p-1} - 1 \equiv 0 \pmod{p}$.
By Fermat’s little theorem, this has p − 1 solutions. On the other hand, by Lagrange’s theorem, neither
(3) $$a^{\frac{p-1}{2}} - 1 \equiv 0 \pmod{p}$$
nor
(4) $$a^{\frac{p-1}{2}} + 1 \equiv 0 \pmod{p}$$
has more than (p − 1)/2 solutions. It follows that each of (3) and (4) has exactly (p − 1)/2 solutions.
The (p − 1)/2 quadratic residues a modulo p all satisfy (3). It follows that all the quadratic non-residues
a must satisfy (4).
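Euler's criterion gives an immediate way to evaluate the Legendre symbol by modular exponentiation. The sketch below is an illustration rather than part of the notes: the function name `legendre` is an assumption, and the argument p is assumed to be an odd prime (this is not checked).

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p)_L for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)    # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    p = 7
    print([a for a in range(1, p) if legendre(a, p) == 1])    # quadratic residues
    print([a for a in range(1, p) if legendre(a, p) == -1])   # quadratic non-residues
```

With fast exponentiation the computation takes a number of multiplications proportional to log p, which is adequate on a computer, although the reciprocity-based methods later in the chapter are more convenient for hand calculation.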
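THEOREM 4D. Suppose that p is an odd prime. Then for every a, b ∈ Z, we have
$$\left( \frac{ab}{p} \right)_L = \left( \frac{a}{p} \right)_L \left( \frac{b}{p} \right)_L.$$
Proof. The result is trivial if p | a or p | b, so we assume now that p ∤ a and p ∤ b. It follows from
Theorem 4B that
$$\left( \frac{ab}{p} \right)_L \equiv (ab)^{\frac{p-1}{2}} \equiv a^{\frac{p-1}{2}} b^{\frac{p-1}{2}} \equiv \left( \frac{a}{p} \right)_L \left( \frac{b}{p} \right)_L \pmod{p}.$$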
In practice, Euler’s criterion is not very useful when p is a rather large prime. The following
represents a result of a more practical nature.
THEOREM 4E. (GAUSS’S LEMMA) Suppose that p is an odd prime, and the integer a ∈ Z
satisfies p ∤ a. Let
$$m = \# \left\{ x \in \mathbb{N} : 1 \le x < \frac{p}{2} \text{ and } \frac{p}{2} < ax - p \left[ \frac{ax}{p} \right] < p \right\};$$
in other words, m is the number of integers x satisfying 1 ≤ x < p/2 for which the residue $r_x$ of ax
satisfies $p/2 < r_x < p$. Then
$$\left( \frac{a}{p} \right)_L = (-1)^m.$$
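The quantity m in Gauss's lemma is easy to compute directly, and the result can be compared with Euler's criterion. In this sketch the names `gauss_lemma_symbol` and `euler_symbol` are illustrative assumptions, and p is assumed to be an odd prime not dividing a.

```python
def gauss_lemma_symbol(a: int, p: int) -> int:
    """(a/p)_L computed as (-1)^m, with m as in Gauss's lemma."""
    # Count x with 1 <= x < p/2 whose residue r_x of a*x satisfies p/2 < r_x < p.
    m = sum(1 for x in range(1, (p - 1) // 2 + 1) if 2 * ((a * x) % p) > p)
    return (-1) ** m


def euler_symbol(a: int, p: int) -> int:
    """(a/p)_L computed from Euler's criterion, for comparison."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    p = 13
    print([gauss_lemma_symbol(a, p) for a in range(1, p)])
    print([euler_symbol(a, p) for a in range(1, p)])
```

The comparison `2 * r > p` avoids floating point arithmetic when testing whether a residue exceeds p/2.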
Let $\alpha_1, \ldots, \alpha_m$ denote the m values of $r_x$ for which $p/2 < r_x < p$, and let $\beta_1, \ldots, \beta_{\ell}$, where ℓ + m =
(p − 1)/2, denote the values of $r_x$ for which $0 < r_x < p/2$. Then
(6) $$\prod_{x=1}^{(p-1)/2} r_x = \prod_{i=1}^{m} \alpha_i \prod_{j=1}^{\ell} \beta_j \equiv (-1)^m \prod_{i=1}^{m} (p - \alpha_i) \prod_{j=1}^{\ell} \beta_j \pmod{p}.$$
Clearly, for every i = 1, . . . , m, we have $0 < p - \alpha_i < p/2$. Also, for every j = 1, . . . , ℓ, we have
$0 < \beta_j < p/2$. Note also that the numbers $\alpha_1, \ldots, \alpha_m$ are distinct, and the numbers $\beta_1, \ldots, \beta_{\ell}$ are also
distinct. Furthermore, for every i = 1, . . . , m and every j = 1, . . . , ℓ, the numbers $p - \alpha_i$ and $\beta_j$ are
different, for $p - \alpha_i = \beta_j$ would give ax ≡ −ay (mod p), and hence x + y ≡ 0 (mod p), for some x, y ∈ Z
satisfying 1 ≤ x < y ≤ (p − 1)/2, clearly impossible. Hence
(7) $$\prod_{i=1}^{m} (p - \alpha_i) \prod_{j=1}^{\ell} \beta_j = \left( \frac{p-1}{2} \right)!.$$
Proof. The numbers 2, 4, 6, . . . , p − 1 all lie between 0 and p, and so are their own residues modulo
p. Moreover, p/2 < 2x < p if and only if p/4 < x < p/2. Hence we must have m = [p/2] − [p/4]. The
second equality is obtained by checking.
Suppose that p, q ∈ N are distinct odd primes. There is a beautiful result which links the solubility of
the two quadratic congruences
$$x^2 \equiv p \pmod{q} \quad \text{and} \quad x^2 \equiv q \pmod{p},$$
in the sense that if we know whether one of these two congruences is soluble, then the determination of
whether the other congruence is soluble involves only a simple calculation.
THEOREM 4G. (LAW OF QUADRATIC RECIPROCITY) Suppose that p, q ∈ N are distinct odd
primes. Then
$$\left( \frac{q}{p} \right)_L \left( \frac{p}{q} \right)_L = (-1)^{\left( \frac{p-1}{2} \right) \left( \frac{q-1}{2} \right)}.$$
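THEOREM 4H. Suppose that p is an odd prime, and the integer a ∈ Z satisfies p ∤ a. Then
$$\left( \frac{a}{p} \right)_L = (-1)^n,$$
where
$$n = \sum_{y=1}^{(p-1)/2} \left[ \frac{2ay}{p} \right].$$
Proof. We shall use Gauss’s lemma. In the notation of Theorem 4E, we have
$$m = \# \left\{ x \in \mathbb{N} : 1 \le x \le \frac{p-1}{2} \text{ and } 1 < \frac{2ax}{p} - 2 \left[ \frac{ax}{p} \right] < 2 \right\}.$$
$$0 < \frac{2ax}{p} - 2 \left[ \frac{ax}{p} \right] < 2.$$
Hence
$$m = \sum_{x=1}^{(p-1)/2} \left( \left[ \frac{2ax}{p} \right] - 2 \left[ \frac{ax}{p} \right] \right) \equiv n \pmod{2}.$$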
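where
$$\lambda(p, q) = \sum_{x=1}^{(p-1)/2} \left[ \frac{qx}{p} \right].$$
where
$$r = \sum_{y=1}^{(p-1)/2} \left[ \frac{(a+p)y}{p} \right].$$
Now
$$\left( \frac{q}{p} \right)_L = (-1)^s, \quad \text{where} \quad s = \sum_{y=1}^{(p-1)/2} \left[ \frac{qy}{p} \right].$$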
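THEOREM 4K. Suppose that p, q ∈ N are distinct odd primes. Then in the notation of Theorem
4J, we have
$$\lambda(p, q) + \lambda(q, p) = \left( \frac{p-1}{2} \right) \left( \frac{q-1}{2} \right).$$
Proof. We have
$$\lambda(p, q) = \sum_{1 \le x < p/2} \left[ \frac{qx}{p} \right] = \sum_{1 \le x < p/2} \; \sum_{1 \le y < qx/p} 1 = \sum_{1 \le y < q/2} \; \sum_{py/q < x < p/2} 1,$$
It follows that
$$\lambda(p, q) + \lambda(q, p) = \sum_{1 \le y < q/2} \; \sum_{1 \le x < p/2} 1 = \left( \frac{p-1}{2} \right) \left( \frac{q-1}{2} \right),$$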
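The identity of Theorem 4K, and the way it feeds into the law of quadratic reciprocity, can be checked for small primes. This is a hedged sketch: the names `lam` and `legendre` are illustrative, the Legendre symbol is evaluated by Euler's criterion, and the relation $(q/p)_L = (-1)^{\lambda(p,q)}$ used in the check is the one read off from the fragments of Theorem 4J above.

```python
def lam(p: int, q: int) -> int:
    """lambda(p, q): the sum of [q*x/p] over 1 <= x <= (p-1)/2."""
    return sum((q * x) // p for x in range(1, (p - 1) // 2 + 1))


def legendre(a: int, p: int) -> int:
    """Legendre symbol via Euler's criterion (p an odd prime)."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    p, q = 11, 7
    print(lam(p, q) + lam(q, p), ((p - 1) // 2) * ((q - 1) // 2))
    print((-1) ** lam(p, q), legendre(q, p))
```

Geometrically, the two sums count the lattice points below and above the line py = qx inside the rectangle 1 ≤ x < p/2, 1 ≤ y < q/2, which is why they add up to the total number of lattice points in that rectangle.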
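Example. The numbers 8783 and 15671 are prime. We want to determine the number of solutions of
the congruence $x^2 \equiv 8783 \pmod{15671}$. We have
$$\left( \frac{8783}{15671} \right)_L = (-1)^{\left( \frac{8783-1}{2} \right) \left( \frac{15671-1}{2} \right)} \left( \frac{15671}{8783} \right)_L = - \left( \frac{15671}{8783} \right)_L = - \left( \frac{6888}{8783} \right)_L$$
$$= - \left( \frac{2}{8783} \right)_L^3 \left( \frac{3}{8783} \right)_L \left( \frac{7}{8783} \right)_L \left( \frac{41}{8783} \right)_L = - \left( \frac{2}{8783} \right)_L \left( \frac{3}{8783} \right)_L \left( \frac{7}{8783} \right)_L \left( \frac{41}{8783} \right)_L.$$
Now $\left( \frac{2}{8783} \right)_L = 1$, since 8783 ≡ 7 (mod 8), and
$$\left( \frac{3}{8783} \right)_L = (-1)^{\left( \frac{3-1}{2} \right) \left( \frac{8783-1}{2} \right)} \left( \frac{8783}{3} \right)_L,$$
$$\left( \frac{7}{8783} \right)_L = (-1)^{\left( \frac{7-1}{2} \right) \left( \frac{8783-1}{2} \right)} \left( \frac{8783}{7} \right)_L,$$
$$\left( \frac{41}{8783} \right)_L = (-1)^{\left( \frac{41-1}{2} \right) \left( \frac{8783-1}{2} \right)} \left( \frac{8783}{41} \right)_L.$$
It follows that
$$\left( \frac{8783}{15671} \right)_L = - \left( \frac{8783}{3} \right)_L \left( \frac{8783}{7} \right)_L \left( \frac{8783}{41} \right)_L = - \left( \frac{2}{3} \right)_L \left( \frac{5}{7} \right)_L \left( \frac{9}{41} \right)_L = -(-1)^{\frac{9-1}{8}} \left( \frac{5}{7} \right)_L \left( \frac{3}{41} \right)_L^2$$
$$= \left( \frac{5}{7} \right)_L = (-1)^{\left( \frac{5-1}{2} \right) \left( \frac{7-1}{2} \right)} \left( \frac{7}{5} \right)_L = \left( \frac{7}{5} \right)_L = \left( \frac{2}{5} \right)_L = (-1)^{\frac{25-1}{8}} = -1.$$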
To shorten many calculations involving the Legendre symbol, we introduce the Jacobi symbol which can
be considered in some way to be a generalization of the Legendre symbol. For every n ∈ Z, we write
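$$\left( \frac{n}{1} \right)_J = 1.$$
If m is a positive odd integer with canonical decomposition $m = p_1^{u_1} \cdots p_r^{u_r}$, where $p_1, \ldots, p_r$ are distinct
odd primes, then we write
$$\left( \frac{n}{m} \right)_J = \prod_{i=1}^{r} \left( \frac{n}{p_i} \right)_L^{u_i}.$$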
Remark. We emphasize immediately that the Jacobi symbol is for calculation only. In particular,
note that
$$\left( \frac{n}{m} \right)_J = 1$$
does not necessarily imply that the congruence x2 ≡ n (mod m) is soluble. Consider, for example, the
case when n = 2 and m = 15.
The following observations can be deduced from the properties of the Legendre symbol. We leave
the proof as an exercise for the reader.
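THEOREM 4L. Suppose that m and m′ are odd positive integers. Then for every n, n′ ∈ Z, we have
(i) $\left( \frac{n}{m} \right)_J \left( \frac{n'}{m} \right)_J = \left( \frac{nn'}{m} \right)_J$;
(ii) $\left( \frac{n}{m} \right)_J \left( \frac{n}{m'} \right)_J = \left( \frac{n}{mm'} \right)_J$;
(iii) $\left( \frac{n}{m} \right)_J = \left( \frac{n'}{m} \right)_J$ whenever n ≡ n′ (mod m); and
(iv) $\left( \frac{a^2 n}{m} \right)_J = \left( \frac{n}{m} \right)_J$ whenever (a, m) = 1.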
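Proof. It is convenient to write $m = p_1 \cdots p_s$, where the prime factors are not necessarily distinct.
Then
$$m = \prod_{j=1}^{s} (1 + p_j - 1) = 1 + \sum_{j=1}^{s} (p_j - 1) + \sum_{\substack{j,k=1 \\ j<k}}^{s} (p_j - 1)(p_k - 1) + \ldots \equiv 1 + \sum_{j=1}^{s} (p_j - 1) \pmod{4},$$
and so
$$\frac{m-1}{2} \equiv \sum_{j=1}^{s} \frac{p_j - 1}{2} \pmod{2}.$$
Thus
$$\left( \frac{-1}{m} \right)_J = \prod_{j=1}^{s} \left( \frac{-1}{p_j} \right)_L = \prod_{j=1}^{s} (-1)^{\frac{p_j - 1}{2}} = (-1)^{\frac{m-1}{2}},$$
$$m^2 = \prod_{j=1}^{s} (1 + p_j^2 - 1) = 1 + \sum_{j=1}^{s} (p_j^2 - 1) + \sum_{\substack{j,k=1 \\ j<k}}^{s} (p_j^2 - 1)(p_k^2 - 1) + \ldots \equiv 1 + \sum_{j=1}^{s} (p_j^2 - 1) \pmod{16},$$
and so
$$\frac{m^2 - 1}{8} \equiv \sum_{j=1}^{s} \frac{p_j^2 - 1}{8} \pmod{2}.$$
Thus
$$\left( \frac{2}{m} \right)_J = \prod_{j=1}^{s} \left( \frac{2}{p_j} \right)_L = \prod_{j=1}^{s} (-1)^{\frac{p_j^2 - 1}{8}} = (-1)^{\frac{m^2 - 1}{8}},$$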
We leave it as an exercise for the reader to prove the following reciprocity result.
THEOREM 4N. Suppose that m and n are odd positive integers and (m, n) = 1. Then
$$\left( \frac{m}{n} \right)_J \left( \frac{n}{m} \right)_J = (-1)^{\left( \frac{m-1}{2} \right) \left( \frac{n-1}{2} \right)}.$$
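Example. Let us consider our earlier example again. Recall that we want to determine the number
of solutions of the congruence $x^2 \equiv 8783 \pmod{15671}$, where the numbers 8783 and 15671 are prime.
Omitting the details of a few steps from earlier, we have
$$\left( \frac{8783}{15671} \right)_L = - \left( \frac{6888}{8783} \right)_L = - \left( \frac{2}{8783} \right)_L^3 \left( \frac{861}{8783} \right)_L = - \left( \frac{861}{8783} \right)_L = - \left( \frac{861}{8783} \right)_J$$
$$= -(-1)^{\left( \frac{861-1}{2} \right) \left( \frac{8783-1}{2} \right)} \left( \frac{8783}{861} \right)_J = - \left( \frac{8783}{861} \right)_J = - \left( \frac{173}{861} \right)_J$$
$$= -(-1)^{\left( \frac{173-1}{2} \right) \left( \frac{861-1}{2} \right)} \left( \frac{861}{173} \right)_J = - \left( \frac{861}{173} \right)_L = - \left( \frac{-4}{173} \right)_L$$
$$= - \left( \frac{-1}{173} \right)_L \left( \frac{2}{173} \right)_L^2 = - \left( \frac{-1}{173} \right)_L = -(-1)^{\frac{173-1}{2}} = -1.$$
Alternatively, try to fill in the missing details in the argument below. We have
$$\left( \frac{8783}{15671} \right)_L = - \left( \frac{15671}{8783} \right)_L = - \left( \frac{-1895}{8783} \right)_L = - \left( \frac{8783}{1895} \right)_J$$
$$= - \left( \frac{8783}{5} \right)_J \left( \frac{8783}{379} \right)_J = - \left( \frac{379}{33} \right)_J = - \left( \frac{16}{33} \right)_J = -1.$$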
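The calculation above follows a mechanical pattern: reduce the numerator modulo the denominator, remove factors of 2, and invert by reciprocity. The following sketch implements that pattern; the function name `jacobi` is an assumption, and the sign rules used are those for $(2/m)_J$ and the reciprocity law of Theorem 4N.

```python
def jacobi(n: int, m: int) -> int:
    """Jacobi symbol (n/m)_J for an odd positive integer m."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be an odd positive integer")
    n %= m
    result = 1
    while n != 0:
        # Remove factors of 2, using (2/m)_J = (-1)^((m^2 - 1)/8).
        while n % 2 == 0:
            n //= 2
            if m % 8 in (3, 5):
                result = -result
        # Reciprocity: the sign changes only if both numbers are 3 modulo 4.
        n, m = m, n
        if n % 4 == 3 and m % 4 == 3:
            result = -result
        n %= m
    # If the final denominator is not 1, the original arguments were not coprime.
    return result if m == 1 else 0


if __name__ == "__main__":
    print(jacobi(8783, 15671))   # the example in the text
    print(jacobi(2, 15))         # equals 1 although x^2 ≡ 2 (mod 15) is insoluble
```

Since the symbol for the example equals −1 and 15671 is prime, the congruence $x^2 \equiv 8783 \pmod{15671}$ has no solutions. No factorization is needed at any stage, which is the practical advantage of the Jacobi symbol over the Legendre symbol.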
Suppose that the prime p satisfies p ≡ 1 (mod 8(k!)), where k ∈ N. Then it is not difficult to see that 2
is a quadratic residue modulo p. Furthermore, for any odd prime q ∈ N such that q ≤ k and q ≠ p, we
have
$$\left( \frac{q}{p} \right)_L = (-1)^{\left( \frac{p-1}{2} \right) \left( \frac{q-1}{2} \right)} \left( \frac{p}{q} \right)_L = \left( \frac{1}{q} \right)_L = 1,$$
so that q is a quadratic residue modulo p. Suppose now that n ∈ N satisfies n ≤ k. Then none of the prime
factors of n exceeds k. It follows from Theorem 4D that n is a quadratic residue modulo p.
Now let $n_p$ denote the least positive quadratic non-residue modulo p. For the prime p above, we
have $n_p > k$. It follows that
$$\limsup_{p \to \infty} n_p = \infty.$$
In 1919, Vinogradov conjectured that for any ε > 0, we have $n_p \ll_{\epsilon} p^{\epsilon}$ as p → ∞. Here we prove the
following weaker result.
Proof. Let $h = [p/n_p] + 1$. Then $p < hn_p < p + n_p$, so that $(hn_p/p)_L = 1$. Since $(n_p/p)_L = -1$, it
follows from Theorem 4D that $(h/p)_L = -1$. Note now that since 0 < h < p/2 + 1 < p, we must have
1 ≤ h < p, so that $h \ge n_p$. We therefore conclude that
$$n_p \le \left[ \frac{p}{n_p} \right] + 1 \le \frac{p}{n_p} + 1.$$
2. (i) Show that 3 is a quadratic residue for primes of the form 12k ± 1 and a quadratic non-residue
for primes of the form 12k ± 5.
(ii) Deduce that −3 is a quadratic residue for primes of the form 6k + 1 and a quadratic non-residue
for primes of the form 6k − 1.
(iii) By considering $x^2 + 3$, show that there are infinitely many primes of the form 6k + 1.
3. Suppose that p is an odd prime. Suppose further that the set {1, 2, . . . , p − 1} can be expressed as
the union of two non-empty subsets S and T such that
• S ≠ T ;
• the product modulo p of any two elements in the same set lies in S; and
• the product modulo p of any element in S with any element in T lies in T .
Prove that S consists of the quadratic residues modulo p, and that T consists of the quadratic
non-residues modulo p.
4. (i) Prove that 3 is a primitive root of any prime of the form $2^n + 1$, where n > 1 is an integer.
(ii) Prove that 2 is a primitive root of any prime of the form 2p + 1, where p is a prime of the form
4n + 1.
(iii) Prove that −2 is a primitive root of any prime of the form 2p + 1, where p is a prime of the
form 4n + 3.
(iv) Prove that 2 is a primitive root of any prime of the form 4p + 1, where p is a prime.
(v) Prove that 3 is a primitive root of any prime of the form $2^n p + 1$, where n > 1 is an integer
and the prime $p > (3^{2^n} - 1)/2^n$.
7. Suppose that the prime p ≡ 3 (mod 4), and that a is a quadratic residue modulo p. Show that the
solutions of the congruence $x^2 \equiv a \pmod{p}$ are given by $x \equiv \pm a^{\frac{p+1}{4}} \pmod{p}$.
8. Suppose that the prime p ≡ 5 (mod 8), and that a is a quadratic residue modulo p.
(i) Suppose that $a^{\frac{p-1}{4}} \equiv 1 \pmod{p}$. Show that the solutions of the congruence $x^2 \equiv a \pmod{p}$
are given by $x \equiv \pm a^{\frac{p+3}{8}} \pmod{p}$.
(ii) Suppose that $a^{\frac{p-1}{4}} \equiv -1 \pmod{p}$. Show that the solutions of the congruence $x^2 \equiv a \pmod{p}$
are given by $x \equiv \pm 2^{\frac{p-1}{4}} a^{\frac{p+3}{8}} \pmod{p}$.
11. Suppose that p is an odd prime. Suppose further that a, b ∈ Z and p ∤ a. Prove that
$$\sum_{n=1}^{p} \left( \frac{an + b}{p} \right)_L = 0.$$
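12. Suppose that p is an odd positive prime. Suppose further that k ∈ Z and p ∤ k.
(i) Prove that
$$\sum_{n=1}^{p-1} \left( \frac{n(n+k)}{p} \right)_L = \sum_{n=1}^{p-1} \sum_{\substack{n'=1 \\ nn' \equiv 1 \ (\mathrm{mod}\ p)}}^{p-1} \left( \frac{n(n + knn')}{p} \right)_L = \sum_{n'=1}^{p-1} \left( \frac{1 + kn'}{p} \right)_L.$$
$$\sum_{n=1}^{p-2} \left( \frac{n(n+k)}{p} \right)_L = \sum_{n=1}^{p-2} \sum_{\substack{n'=1 \\ nn' \equiv 1 \ (\mathrm{mod}\ p)}}^{p-2} \left( \frac{n(n + knn')}{p} \right)_L = \sum_{n'=1}^{p-2} \left( \frac{1 + kn'}{p} \right)_L.$$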
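$$A(R, R) = \frac{1}{4} \sum_{n=1}^{p-2} \left( 1 + \left( \frac{n}{p} \right)_L \right) \left( 1 + \left( \frac{n+1}{p} \right)_L \right) = \frac{1}{4} \left( p - 4 - \left( \frac{-1}{p} \right)_L \right).$$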
(ii) Let A(R, N ) denote the number of integers n satisfying 1 ≤ n ≤ p − 2 such that n is a
quadratic residue and n + 1 is a quadratic non-residue modulo p. By first considering the sum
A(R, R) + A(R, N ), find A(R, N ).
(iii) Let A(N, R) denote the number of integers n satisfying 1 ≤ n ≤ p − 2 such that n is a
quadratic non-residue and n + 1 is a quadratic residue modulo p. By first considering the sum
A(R, R) + A(N, R), find A(N, R).
(iv) Hence determine A(N, N ), the number of integers n satisfying 1 ≤ n ≤ p − 2 such that both n
and n + 1 are quadratic non-residues modulo p.
$$S(k) = \sum_{m=1}^{p} \left( 1 + \left( \frac{m-k}{p} \right)_L \right) \left( \frac{m}{p} \right)_L = -1.$$
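show that
$$T(a, b, c) = \begin{cases} - \left( \dfrac{a}{p} \right)_L & \text{if } p \nmid (b^2 - 4ac), \\[2ex] (p - 1) \left( \dfrac{a}{p} \right)_L & \text{if } p \mid (b^2 - 4ac). \end{cases}$$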
16. Suppose that f (x) is a polynomial with integer coefficients. Suppose further that a, b ∈ Z and p is
a prime, and R denotes a complete set of residues modulo p.
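(i) Suppose that (a, p) = 1. Prove that
$$\sum_{x \in R} \left( \frac{f(ax + b)}{p} \right)_L = \sum_{x \in R} \left( \frac{f(x)}{p} \right)_L.$$
$$\sum_{x \in R} \left( \frac{ax + b}{p} \right)_L = 0.$$
$$\sum_{x=1}^{p-1} \left( \frac{x(ax + b)}{p} \right)_L = - \left( \frac{a}{p} \right)_L.$$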
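17. Suppose that the primes p ≡ 1 (mod 4) and q ≡ 3 (mod 4). Prove each of the following:
(i) $\displaystyle \sum_{n=1}^{p-1} n \left( \frac{n}{p} \right)_L = 0.$
(ii) $\displaystyle \sum_{\substack{n=1 \\ (n/p)_L = 1}}^{p-1} n = \frac{p(p-1)}{4}.$
(iii) $\displaystyle \sum_{n=1}^{q-1} n^2 \left( \frac{n}{q} \right)_L = q \sum_{n=1}^{q-1} n \left( \frac{n}{q} \right)_L.$
(iv) $\displaystyle \sum_{n=1}^{p-1} n^3 \left( \frac{n}{p} \right)_L = \frac{3p}{2} \sum_{n=1}^{p-1} n^2 \left( \frac{n}{p} \right)_L.$
(v) $\displaystyle \sum_{n=1}^{q-1} n^4 \left( \frac{n}{q} \right)_L = 2q \sum_{n=1}^{q-1} n^3 \left( \frac{n}{q} \right)_L - q^2 \sum_{n=1}^{q-1} n^2 \left( \frac{n}{q} \right)_L.$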
Chapter 5
SUMS OF INTEGER SQUARES
In this section, we shall characterize all natural numbers which are representable as the sum of two
integer squares. In other words, we shall determine all numbers n ∈ N such that the equation
$$n = x_1^2 + x_2^2$$
is soluble in x1 , x2 ∈ Z.
The first step in our argument is provided by the following result on the special case when n is a
prime congruent to 1 modulo 4.
THEOREM 5A. (FERMAT) Suppose that p is prime and p ≡ 1 (mod 4). Then p is representable
as the sum of two integer squares; in other words, there exist $x_1, x_2 \in Z$ such that $p = x_1^2 + x_2^2$.
We shall first give the original proof by Fermat using his method of descent. In the next section, we
shall give an alternative proof by Thue which contains ideas that we can develop further to study the
number of representations of a natural number as a sum of two integer squares.
Proof of Theorem 5A. Since p ≡ 1 (mod 4), it follows from Theorem 4C that $(-1/p)_L = 1$, and so
−1 is a quadratic residue modulo p. The numbers
$$-\frac{p-1}{2}, \ldots, -1, 0, 1, \ldots, \frac{p-1}{2}$$
form a complete set of residues modulo p. It follows that one of the elements, $x_0$ say, satisfies $x_0^2 + 1 \equiv 0$
(mod p). Since $|x_0| < p/2$, we must have
$$p \le x_0^2 + 1 < \left( \frac{p}{2} \right)^2 + 1 < p^2.$$
In particular, there exists m ∈ N satisfying 1 ≤ m < p such that mp can be expressed as a sum of two
integer squares. It now suffices to show that the least positive multiple of p which can be expressed as
a sum of two integer squares must be p itself.
We shall prove this by showing that if mp, where 1 < m < p, is a sum of two integer squares, then
there exists m_0 ∈ N satisfying 1 ≤ m_0 < m such that m_0 p is also a sum of two integer squares. Suppose
now that 1 < m < p and
By (1)–(3), we have
and
x1 y2 − x2 y1 ≡ x1 x2 − x2 x1 ≡ 0 (mod m).
It follows that each term on the right hand side of (5) is divisible by m^2, so that
    m_0 p = ((x_1 y_1 + x_2 y_2)/m)^2 + ((x_1 y_2 − x_2 y_1)/m)^2,
We now determine all the natural numbers which are sums of two integer squares.
THEOREM 5B. Suppose that n ∈ N and n > 1, and the canonical decomposition of n is given by
Suppose on the contrary that s_1 is odd. If q_1 ∤ x_2, then it follows from Theorem 3H that there exists
x ∈ Z such that x_2 x ≡ 1 (mod q_1). Multiplying (6) by x^2 gives (x_1 x)^2 ≡ −1 (mod q_1), impossible since
−1 is a quadratic non-residue modulo q_1. It follows that q_1 | x_2, and so q_1 | x_1 also. Writing x_1 = q_1 y_1
and x_2 = q_1 y_2, we have n = q_1^2 (y_1^2 + y_2^2). Hence s_1 ≥ 3. Repeating the argument on n/q_1^2 yields s_1 − 2 ≥ 3.
Repeating the argument a sufficient number of times leads eventually to a contradiction. It follows that
s_1 must be even. A similar argument shows that s_2, ..., s_ℓ are all even.
on noting that we can apply Theorem 5A to each of the primes p_1, ..., p_k, that 2 is a sum of two integer
squares, and that q_j^2 = q_j^2 + 0^2 is a sum of two integer squares for every j = 1, ..., ℓ.
A natural question that arises concerns the number of ways any given n ∈ N can be represented as a
sum of two integer squares. Our starting point is the following alternative proof of Fermat’s theorem by
Thue.
Second Proof of Theorem 5A. Let x = ((p − 1)/2)!. Since 4 | (p − 1), it follows that (p − 1)/2 is an
even integer, and so
    x^2 = (−1)^{(p−1)/2} ∏_{r=1}^{(p−1)/2} r^2 = ∏_{r=1}^{(p−1)/2} r(−r) ≡ ∏_{r=1}^{(p−1)/2} r(p − r) = (p − 1)! ≡ −1 (mod p),
where the last step is Wilson's theorem. It follows that the congruence
(8)    x^2 + 1 ≡ 0 (mod p)
is soluble. We shall now show that if x ∈ Z is a solution of (8), then there exist a, b ∈ Z such that
(9)    |a| < p^{1/2}, |b| < p^{1/2}, ab ≠ 0 and ax ≡ b (mod p).
It then follows from (8) and (9) that a^2 + b^2 ≡ a^2 (1 + x^2) ≡ 0 (mod p), while 0 < a^2 + b^2 < 2p,
so that a^2 + b^2 = p. To prove (9), consider the numbers of the form ux − v, where u, v ∈ Z satisfy
0 ≤ u, v ≤ p^{1/2}. There are ([p^{1/2}] + 1)^2 > p choices of such numbers u and v, and only p residue classes
modulo p. It follows from Dirichlet’s box principle that there exist two different such pairs u′, v′ and u″, v″ such
that u′x − v′ and u″x − v″ belong to the same residue class modulo p and so are congruent to each other
modulo p. Now let a = u′ − u″ and b = v′ − v″. Then ax − b = (u′ − u″)x − (v′ − v″) ≡ 0 (mod p).
Clearly we have |a| < p^{1/2} and |b| < p^{1/2}. Finally, if b = 0, then we must have a ≡ 0 (mod p), and so
a = 0, a contradiction. Hence b ≠ 0. Similarly a ≠ 0.
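The box principle argument in Thue's proof can be carried out mechanically. The following Python sketch is an illustration only and is not part of the original notes; the function name thue_two_squares is our own, and the input p is assumed to be a prime satisfying p ≡ 1 (mod 4).

```python
from math import factorial, isqrt

def thue_two_squares(p):
    """Return (a, b) with a*a + b*b == p, assuming p is a prime with p % 4 == 1."""
    # x = ((p-1)/2)! satisfies x^2 + 1 ≡ 0 (mod p) by Wilson's theorem.
    x = factorial((p - 1) // 2) % p
    assert (x * x + 1) % p == 0, "p must be a prime congruent to 1 modulo 4"
    bound = isqrt(p)  # the integer part [p^(1/2)]
    seen = {}         # residue class of u*x - v modulo p  ->  (u, v)
    for u in range(bound + 1):
        for v in range(bound + 1):
            r = (u * x - v) % p
            if r in seen:
                # Two different pairs share a residue class, so a*x ≡ b (mod p).
                u2, v2 = seen[r]
                return abs(u - u2), abs(v - v2)
            seen[r] = (u, v)
    raise ValueError("no collision found")
```

Since there are ([p^{1/2}] + 1)^2 > p pairs but only p residue classes, the loop is certain to find a collision before it ends; for example, thue_two_squares(13) returns a pair whose squares sum to 13.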
Our first step towards finding a formula for the number of representations of a natural number as
a sum of two integer squares is the following generalization of the above proof of Fermat’s theorem.
THEOREM 5C. Suppose that n ∈ N and n > 1. For every solution x ∈ Z of the congruence
x^2 + 1 ≡ 0 (mod n), there exist unique positive integers a, b ∈ N such that
    n = a^2 + b^2, (a, b) = 1 and ax ≡ b (mod n).
If β < 0, then we let a = −β and b = α. Again we have a^2 + b^2 = n. On the other hand, we have
bx ≡ −a (mod n), so that bx^2 ≡ −ax (mod n). It now follows from the assumption x^2 ≡ −1 (mod n)
that ax ≡ b (mod n).
x^2 + 1 = kn and b = ax + ℓn.
It follows that
Finally, to show uniqueness, suppose that the conclusion holds also for the pair A, B ∈ N. Then
THEOREM 5D. Suppose that n ∈ N, and T(n) is equal to the number of solutions of the congruence
x^2 + 1 ≡ 0 (mod n). Then the number of solutions of the equation n = a^2 + b^2 with (a, b) = 1 is equal
to 4T(n).
Proof. Suppose first of all that n = 1. Clearly T(1) = 1 and the equation 1 = a^2 + b^2 has four
solutions, namely (a, b) = (±1, 0) and (a, b) = (0, ±1).
Suppose now that n > 1. We have already shown that for every solution x ∈ Z of the congruence
x^2 + 1 ≡ 0 (mod n), there exist unique positive integers a, b ∈ N such that
    n = a^2 + b^2, (a, b) = 1 and ax ≡ b (mod n).
Conversely, suppose that a, b ∈ N satisfy (a, b) = 1 and n = a^2 + b^2. It is easy to see that (a, n) = 1, and
so the congruence ax ≡ b (mod n) has a unique solution.
The above establishes a one-to-one correspondence between the solutions of the congruence x^2 + 1 ≡ 0
(mod n) and numbers a, b ∈ N such that (a, b) = 1 and n = a^2 + b^2. The factor 4 occurs if we permit
negative values for a and b.
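Theorem 5D is easy to test numerically for small n. The following Python sketch is an illustration only; both function names are our own, and each count is done by exhaustive search rather than by any method from the notes.

```python
from math import gcd, isqrt

def T(n):
    """Number of residues x modulo n with x^2 + 1 ≡ 0 (mod n)."""
    return sum(1 for x in range(n) if (x * x + 1) % n == 0)

def coprime_representations(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n and gcd(a, b) = 1."""
    r = isqrt(n)
    return sum(
        1
        for a in range(-r, r + 1)
        for b in range(-r, r + 1)
        if a * a + b * b == n and gcd(a, b) == 1
    )
```

For n = 25, for instance, the two counts are T(25) = 2 and 8, the pairs (±5, 0) and (0, ±5) being excluded because their highest common factor is 5.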
THEOREM 5E. Suppose that n ∈ N, and T(n) is equal to the number of solutions of the congruence
x^2 + 1 ≡ 0 (mod n). Then T(n) = 0 if 4 | n or if n is divisible by a prime q ≡ 3 (mod 4). Otherwise we
have T(n) = 2^k, where k is the number of distinct odd prime factors of n.
Proof. Clearly the result is valid if n = 1, so we shall assume that n > 1. Using the Chinese
remainder theorem, one can show that T (n) is a multiplicative function. It follows that if the canonical
decomposition of n is given by
It is easy to check that T(2) = 1. On the other hand, the congruence x^2 ≡ −1 (mod 4) has no
solutions, and so the congruence x^2 ≡ −1 (mod 2^r) has no solutions for any r ≥ 2. Hence T(2^r) = 0 for
every r ≥ 2, and so T(n) = 0 if 4 | n.
Suppose next that q ∈ N is a prime satisfying q ≡ 3 (mod 4). Since −1 is a quadratic non-residue
modulo q, it follows that the congruence x^2 ≡ −1 (mod q) has no solutions, and so the congruence
x^2 ≡ −1 (mod q^s) has no solutions for any s ≥ 1. Hence T(q^s) = 0 for every s ≥ 1, and so T(n) = 0 if
q | n.
To complete the proof, it suffices to show that for every prime p satisfying p ≡ 1 (mod 4), we have
T(p^r) = 2 for every r ≥ 1. Suppose that r ∈ N is fixed. Then any solution of the congruence x^2 ≡ −1
(mod p^r) can be assumed to be an element in the set
    R = {x ∈ N : 1 ≤ x ≤ p^r and p ∤ x}.
Now any x ∈ R must satisfy the congruence x^2 ≡ m (mod p^r) for some number m ∈ N satisfying
0 < m < p^r and (m/p)_L = 1. There are (1/2)(p − 1) numbers m ∈ N satisfying 0 < m < p and (m/p)_L = 1,
and so there are (1/2)(p − 1)p^{r−1} = (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r and (m/p)_L = 1. Suppose
now that x^2 ≡ y^2 (mod p^r) and p ∤ x. Then p^r | (x + y)(x − y), so that p | (x + y) or p | (x − y);
but not both, for otherwise p must divide their sum 2x, a contradiction. It follows that p^r | (x + y) or
p^r | (x − y), and so x ≡ ±y (mod p^r). Hence for each of the (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r
and (m/p)_L = 1, there are at most two numbers x ∈ R such that x^2 ≡ m (mod p^r). Since R contains
precisely φ(p^r) elements, it follows that for each of the (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r
and (m/p)_L = 1, there are precisely two numbers x ∈ R such that x^2 ≡ m (mod p^r). Note now that
−1 is a quadratic residue modulo p. It follows that there are precisely two numbers x ∈ R such that
x^2 ≡ −1 (mod p^r), and so T(p^r) = 2.
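The formula for T(n) in Theorem 5E can be compared with a direct count. The following Python sketch is an illustration only; the names T_bruteforce and T_formula are our own, and trial division is used simply because it is short, not because it is efficient.

```python
def T_bruteforce(n):
    """Count the residues x modulo n with x^2 + 1 ≡ 0 (mod n) directly."""
    return sum(1 for x in range(n) if (x * x + 1) % n == 0)

def T_formula(n):
    """Evaluate T(n) using the statement of Theorem 5E."""
    if n % 4 == 0:
        return 0
    m = n // 2 if n % 2 == 0 else n  # strip the single factor 2, if present
    k = 0                            # number of distinct odd prime factors
    p = 3
    while p * p <= m:
        if m % p == 0:
            if p % 4 == 3:
                return 0
            k += 1
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:                        # one remaining prime factor
        if m % 4 == 3:
            return 0
        k += 1
    return 2 ** k
```

For example, n = 65 = 5 · 13 has two distinct odd prime factors, both congruent to 1 modulo 4, so that both functions give 4.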
THEOREM 5F. Suppose that n ∈ N, and S(n) is equal to the number of solutions of the equation
n = a^2 + b^2 in numbers a, b ∈ Z. Then
    S(n) = 4 Σ_{d^2 | n} T(n/d^2),
where for every n ∈ N, the number T(n) is equal to the number of solutions of the congruence x^2 + 1 ≡ 0
(mod n).
Proof. Suppose that a, b ∈ Z satisfy n = a^2 + b^2. Write d = (a, b). Then d^2 | n. If we write a_1 = a/d
and b_1 = b/d, then the pair a_1, b_1 satisfy
(10)    n/d^2 = a_1^2 + b_1^2 and (a_1, b_1) = 1.
On the other hand, suppose that d^2 | n, and that a_1, b_1 ∈ Z satisfy (10). If we write a = da_1 and b = db_1,
then (a, b) = d and n = a^2 + b^2. We can therefore identify any pair a, b ∈ Z satisfying n = a^2 + b^2 with
the pair a_1, b_1 ∈ Z satisfying (10), where d = (a, b). The result now follows on noting Theorem 5D.
THEOREM 5G. Suppose that n ∈ N, and S(n) is equal to the number of solutions of the equation
n = a^2 + b^2 in numbers a, b ∈ Z. Then
    S(n) = 4 Σ_{m | n} χ(m),
It is easy to show that the function χ(n) is multiplicative, so it follows from Theorem 2A that the
function W (n) is also multiplicative. On the other hand, recall that the function T (n) is multiplicative.
It follows from Theorem 5F, in a way similar to the proof of Theorem 2A, that if (n1 , n2 ) = 1, then
    S(p^r)/4 = W(p^r),
since the result is obvious for n = 1.
    S(p^r)/4 = Σ_{d^2 | p^r} T(p^r/d^2).
If r is even, then
    S(p^r)/4 = T(p^r) + T(p^{r−2}) + ... + T(p^2) + T(1) =
        1        if p = 2,
        1        if p ≡ 3 (mod 4),
        r + 1    if p ≡ 1 (mod 4).
If r is odd, then
    S(p^r)/4 = T(p^r) + T(p^{r−2}) + ... + T(p) =
        1        if p = 2,
        0        if p ≡ 3 (mod 4),
        r + 1    if p ≡ 1 (mod 4).
Hence
    S(p^r)/4 =
        1        if p = 2,
        1        if p ≡ 3 (mod 4) and r is even,
        0        if p ≡ 3 (mod 4) and r is odd,
        r + 1    if p ≡ 1 (mod 4).
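The values of S(p^r) obtained in this case analysis can be confirmed by direct counting. The following Python sketch is an illustration only; the names S and S_prime_power are our own, and the argument p of S_prime_power is assumed to be prime.

```python
from math import isqrt

def S(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n."""
    count = 0
    r = isqrt(n)
    for a in range(-r, r + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            # b = 0 contributes one pair, otherwise both b and -b occur.
            count += 1 if b == 0 else 2
    return count

def S_prime_power(p, r):
    """Value of S(p^r) predicted by the case analysis, for a prime p and r >= 1."""
    if p == 2:
        return 4
    if p % 4 == 3:
        return 4 if r % 2 == 0 else 0
    return 4 * (r + 1)
```

For example, S(25) counts the twelve pairs (±5, 0), (0, ±5), (±3, ±4) and (±4, ±3), in agreement with 4(r + 1) for p = 5 and r = 2.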
Consider next
We now study the problem of representing natural numbers as sums of four integer squares, and show
that this is always possible.
THEOREM 5H. (LAGRANGE) Every n ∈ N is representable as the sum of four integer squares;
in other words, for every n ∈ N, there exist x_1, x_2, x_3, x_4 ∈ Z such that n = x_1^2 + x_2^2 + x_3^2 + x_4^2.
it suffices to show that every prime can be expressed as a sum of four integer squares. Clearly we have
2 = 1^2 + 1^2 + 0^2 + 0^2. Also, it follows from Theorem 5A that every prime p ≡ 1 (mod 4) is a sum of four
integer squares. It therefore remains to prove that every prime q ≡ 3 (mod 4) is a sum of four integer
squares. Naturally the number 1 is a quadratic residue modulo q. Let a ∈ N be the smallest number in
the range 1 ≤ a ≤ q − 2 such that a + 1 is a quadratic non-residue modulo q, so that
    ((a + 1)/q)_L = −1 and (a/q)_L = 1.
Since q ≡ 3 (mod 4), it follows from Theorem 4C that (−1/q)_L = −1, and so
    ((−a − 1)/q)_L = (−1/q)_L ((a + 1)/q)_L = 1.
It follows that there exist x_1, x_2 in the set
    −(q − 1)/2, ..., −1, 0, 1, ..., (q − 1)/2
of residues modulo q such that x_1^2 ≡ a (mod q) and x_2^2 ≡ −a − 1 (mod q). Hence
    x_1^2 + x_2^2 + 1 ≡ 0 (mod q)
and
    q ≤ x_1^2 + x_2^2 + 1 < 2(q/2)^2 + 1 < q^2.
In particular, there exists m ∈ N satisfying 1 ≤ m < q such that mq can be expressed as a sum of four
integer squares. It now suffices to show that the least positive multiple of q which can be expressed as
a sum of four integer squares must be q itself.
We shall prove this by showing that if mq, where 1 < m < q, is a sum of four integer squares, then
there exists m0 ∈ N satisfying 1 ≤ m0 < m such that m0 q is also a sum of four integer squares. Suppose
now that 1 < m < q and
where x1 , x2 , x3 , x4 ∈ Z. If m is even, then the right hand side of (12) must be even. It follows that an
even number of the four terms x1 , x2 , x3 , x4 must be even, so we may assume without loss of generality
that x1 ≡ x2 (mod 2) and x3 ≡ x4 (mod 2). In particular, we have
    (m/2) q = ((x_1 + x_2)/2)^2 + ((x_1 − x_2)/2)^2 + ((x_3 + x_4)/2)^2 + ((x_3 − x_4)/2)^2.
    y_1^2 + y_2^2 + y_3^2 + y_4^2 ≡ x_1^2 + x_2^2 + x_3^2 + x_4^2 ≡ 0 (mod m),
Also each of the terms (x_1 y_2 − x_2 y_1 + x_3 y_4 − x_4 y_3), (x_1 y_3 − x_3 y_1 − x_2 y_4 + x_4 y_2) and (x_1 y_4 − x_4 y_1 − x_3 y_2 + x_2 y_3)
is congruent to 0 modulo m. It follows that each term on the right hand side of (15) is divisible by m^2,
so that m_0 q is a sum of four integer squares.
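The three terms listed above are the last three terms of Euler's four-square identity, on which the descent relies. The following Python sketch is an illustration only; the function names are our own, and the first term x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4 is the standard one, supplied here as an assumption since it is not displayed in the surviving text.

```python
import random

def euler_terms(x, y):
    """Return the four terms whose squares sum to (sum of x_i^2)(sum of y_i^2)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,  # assumed first term
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 - x2 * y4 + x4 * y2,
        x1 * y4 - x4 * y1 - x3 * y2 + x2 * y3,
    )

def identity_holds(x, y):
    """Check the identity for one pair of integer quadruples."""
    lhs = sum(t * t for t in x) * sum(t * t for t in y)
    rhs = sum(t * t for t in euler_terms(x, y))
    return lhs == rhs

def random_check(trials=1000, seed=0):
    """Check the identity on randomly chosen integer quadruples."""
    rng = random.Random(seed)
    return all(
        identity_holds(
            [rng.randint(-50, 50) for _ in range(4)],
            [rng.randint(-50, 50) for _ in range(4)],
        )
        for _ in range(trials)
    )
```

If y_i ≡ x_i (mod m) for every i and m divides x_1^2 + x_2^2 + x_3^2 + x_4^2, then each of the four terms is divisible by m, which is exactly what the descent step uses.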
The situation is very different in the case of three integer squares. The main reason is that there is no
analogue of (7) and (11) in this case. Here, we shall only concern ourselves with the following simple
theorem.
THEOREM 5J. No integer of the form 4^k (8m + 7), where k, m ∈ N ∪ {0}, can be represented as a
sum of three squares.
Proof. Note first of all that every integer square is congruent to 0, 1 or 4 modulo 8, so that 8m + 7 is
never a sum of three squares for any m ∈ Z. Hence the conclusion of Theorem 5J is true for k = 0.
We now proceed by induction on k. Suppose that 4^s (8m + 7) is never a sum of three squares for
any m ∈ Z. We shall show that 4^{s+1} (8m + 7) is never a sum of three squares for any m ∈ Z. Suppose
on the contrary that
It was proved by Legendre in 1798 that all other natural numbers are representable as sums of three
integer squares.
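Theorem 5J and Legendre's converse together can be tested for small n by exhaustive search. The following Python sketch is an illustration only; both function names are our own.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """Decide by exhaustive search whether n = a^2 + b^2 + c^2 with a, b, c >= 0."""
    for a in range(isqrt(n) + 1):
        # It suffices to take a <= b, since the three squares may be reordered.
        for b in range(a, isqrt(n - a * a) + 1):
            rest = n - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return True
    return False

def has_excluded_form(n):
    """Decide whether the natural number n is of the form 4^k (8m + 7)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7
```

For every n in the range tested, exactly one of the two functions returns True, as the two results predict.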
1. Suppose that the natural number n has canonical decomposition n = 2^r p_1^{r_1} ... p_k^{r_k} q_1^{s_1} ... q_ℓ^{s_ℓ}, where
the integer r ≥ 0 and r_1, ..., r_k, s_1, ..., s_ℓ ∈ N, and p_1, ..., p_k, q_1, ..., q_ℓ are primes satisfying
p_1 ≡ ... ≡ p_k ≡ 1 (mod 4) and q_1 ≡ ... ≡ q_ℓ ≡ 3 (mod 4). Suppose further that m = p_1^{r_1} ... p_k^{r_k}.
(i) Show that the function S(n) defined in Theorem 5F satisfies
    S(n) =
        0        if at least one of s_1, ..., s_ℓ is odd,
        4d(m)    otherwise.
(ii) Suppose that s_1, ..., s_ℓ are all even. Show that the number of solutions of the equation
n = x^2 + y^2 in x, y ∈ Z satisfying x ≥ y ≥ 0,
is equal to the number of solutions of the equation
m = xy in x, y ∈ Z satisfying x ≥ y > 0.
[Hint: Show that both are equal to [(1/2)d(m) + 1/2].]
2. Suppose that n ∈ N. Show that if the equation n = x^2 + y^2 has more than one solution in x, y ∈ N
with x even, then n is composite.
3. Show that for every real number X > 0, the function S(n) defined in Theorem 5F satisfies the
inequalities
    πX − 4X^{1/2} − 4 < Σ_{n≤X} S(n) < πX + 4X^{1/2}.
4. We know that if each of two non-negative integers is a sum of two squares of non-negative integers,
then so is their product. Show that the analogous assertion for three squares is false.
Chapter 6
ELEMENTARY PRIME NUMBER THEORY
We have already seen the elegant and simple proof of Euclid’s theorem, that there are infinitely many
primes. Here we shall begin by proving a slightly stronger result.
The series
    Σ_p 1/p
is divergent.
    P_X = ∏_{p≤X} (1 − 1/p)^{−1}.
Then
    log P_X = −Σ_{p≤X} log(1 − 1/p) = S_1 + S_2,
where
    S_1 = Σ_{p≤X} 1/p and S_2 = Σ_{p≤X} Σ_{h=2}^∞ 1/(h p^h).
Since
    0 ≤ Σ_{h=2}^∞ 1/(h p^h) ≤ Σ_{h=2}^∞ 1/p^h = 1/(p(p − 1)),
we have
    0 ≤ S_2 ≤ Σ_p 1/(p(p − 1)) ≤ Σ_{n=2}^∞ 1/(n(n − 1)) = 1,
so that π(X) denotes the number of primes in the interval [2, X]. This function has been studied
extensively by number theorists, and attempts to study it in depth have led to major developments in
other important branches of mathematics.
As can be expected, many conjectures concerning the distribution of primes were made based purely
on numerical evidence, including the celebrated Prime number theorem, proved in 1896 by Hadamard
and de la Vallée Poussin, that
    lim_{X→∞} π(X) log X / X = 1.
We shall not prove this in these lectures. Instead we shall be concerned with the weaker result of
Tchebycheff, that there exist positive absolute constants c1 and c2 such that for every real number
X ≥ 2, we have
    c_1 X/log X < π(X) < c_2 X/log X.
The study of the function π(X) usually involves, instead of the characteristic function of the primes, a
function which counts not only primes, but prime powers as well, and with weights. Accordingly, we
introduce the von Mangoldt function Λ : N → C, defined for every n ∈ N by writing
    Λ(n) =
        log p    if n = p^r, with p prime and r ∈ N,
        0        otherwise.
Proof. The result is clearly true for n = 1, so it remains to consider the case n ≥ 2. Suppose that
n = p_1^{u_1} ... p_r^{u_r} is the canonical decomposition of n. Then the only non-zero contribution to the sum
on the left hand side comes from those natural numbers m of the form m = p_j^{v_j} with j = 1, ..., r and
1 ≤ v_j ≤ u_j. It follows that
    Σ_{m | n} Λ(m) = Σ_{j=1}^r Σ_{v_j=1}^{u_j} log p_j = Σ_{j=1}^r log p_j^{u_j} = log n.
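The identity just proved, that the sum of Λ(m) over the divisors m of n is equal to log n, can be checked numerically. The following Python sketch is an illustration only; the names von_mangoldt and divisor_sum are our own, and floating point values are compared up to a small tolerance.

```python
from math import log

def von_mangoldt(n):
    """Return log p if n is a power p^r of a prime p with r >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor of n; divide it out completely.
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # n has no factor up to its square root, so n is prime

def divisor_sum(n):
    """Sum of the von Mangoldt function over all natural numbers m dividing n."""
    return sum(von_mangoldt(m) for m in range(1, n + 1) if n % m == 0)
```

For n = 12, the only divisors with non-zero contribution are 2, 3 and 4, giving log 2 + log 3 + log 2 = log 12.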
    Σ_{m≤X} Λ(m) [X/m] = X log X − X + O(log X).
To prove (1), note that log X is an increasing function of X. In particular, for every n ∈ N, we have
    log n ≤ ∫_n^{n+1} log u du,
so that
    Σ_{n≤X} log n − log(X + 1) ≤ ∫_1^X log u du.
so that
    Σ_{n≤X} log n = Σ_{2≤n≤X} log n ≥ ∫_1^{[X]} log u du = ∫_1^X log u du − ∫_{[X]}^X log u du ≥ ∫_1^X log u du − log X.
The crucial step in the proof of Tchebycheff’s theorem concerns obtaining bounds on sums involving the
von Mangoldt function. More precisely, we prove the following result.
THEOREM 6D. There exist positive absolute constants c3 and c4 such that
(2)    Σ_{m≤X} Λ(m) ≥ (1/2) X log 2 if X ≥ c_3,
and
(3)    Σ_{X/2<m≤X} Λ(m) ≤ c_4 X if X ≥ 0.
Proof. If m ∈ N satisfies X/2 < m ≤ X, then clearly [X/2m] = 0. It follows from this and Theorem
6C that as X → ∞, we have
    Σ_{m≤X} Λ(m) ([X/m] − 2[X/2m]) = Σ_{m≤X} Λ(m) [X/m] − 2 Σ_{m≤X/2} Λ(m) [X/2m]
        = (X log X − X + O(log X)) − 2((X/2) log(X/2) − X/2 + O(log X)) = X log 2 + O(log X).
Hence there exists a positive absolute constant c_5 such that for all sufficiently large X, we have
    (1/2) X log 2 < Σ_{m≤X} Λ(m) ([X/m] − 2[X/2m]) < c_5 X.
We now consider the function [α] − 2[α/2]. Clearly [α] − 2[α/2] < α − 2(α/2 − 1) = 2. Note that the
left hand side is an integer, so we must have [α] − 2[α/2] ≤ 1. It follows that for all sufficiently large X,
we have
    (1/2) X log 2 < Σ_{m≤X} Λ(m).
The inequality (2) follows. On the other hand, if X/2 < m ≤ X, then [X/m] = 1 and [X/2m] = 0, so
that for all sufficiently large X, we have
    Σ_{X/2<m≤X} Λ(m) ≤ c_5 X.
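The size of the sum of Λ(m) over m ≤ X can be examined numerically. The following Python sketch is an illustration only; the names von_mangoldt and psi are our own, and the upper constant 2 used in the accompanying remark is merely a convenient value for the range tested, not the constant c_5 of the proof.

```python
from math import log

def von_mangoldt(n):
    """Return log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)

def psi(X):
    """Sum of the von Mangoldt function over all natural numbers m <= X."""
    return sum(von_mangoldt(m) for m in range(1, int(X) + 1))

def floor_difference(alpha):
    """The quantity [alpha] - 2[alpha/2] considered in the proof."""
    return int(alpha // 1) - 2 * int((alpha / 2) // 1)
```

For X = 10, 100 and 1000, the value psi(X) lies between (1/2) X log 2 and 2X, and floor_difference takes only the values 0 and 1 on non-negative arguments.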
THEOREM 6E. (TCHEBYCHEFF) There exist positive absolute constants c1 and c2 such that for
every real number X ≥ 2, we have
    c_1 X/log X < π(X) < c_2 X/log X.
To prove the upper bound, note that in view of (3) and the definition of the von Mangoldt function,
the inequality
    Σ_{X/2^{j+1}<p≤X/2^j} log p ≤ c_4 X/2^j
holds for every integer j ≥ 0 and every real number X ≥ 0. Suppose that X ≥ 2. Let the integer k ≥ 0
be defined such that 2^k < X^{1/2} ≤ 2^{k+1}. Then
    Σ_{X^{1/2}<p≤X} log p ≤ Σ_{j=0}^k Σ_{X/2^{j+1}<p≤X/2^j} log p ≤ c_4 X Σ_{j=0}^k 2^{−j} < 2c_4 X,
so that
    Σ_{X^{1/2}<p≤X} 1 ≤ Σ_{X^{1/2}<p≤X} (log p)/(log X^{1/2}) < 4c_4 X/log X,
whence
    π(X) ≤ X^{1/2} + 4c_4 X/log X < c_2 X/log X
for a suitable c_2.
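Tchebycheff's theorem says that π(X) log X / X stays between two positive constants. The following Python sketch is an illustration only; the function names are our own, the primes are found by the sieve of Eratosthenes, and the bounds 1/2 and 3/2 in the accompanying remark are convenient values for the range tested rather than the constants c_1 and c_2 of the theorem.

```python
from math import isqrt, log

def primes_up_to(X):
    """Return the list of primes p <= X, using the sieve of Eratosthenes."""
    n = int(X)
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting from p^2.
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def prime_count(X):
    """The function π(X), the number of primes in the interval [2, X]."""
    return len(primes_up_to(X))

def tchebycheff_ratio(X):
    """The quantity π(X) log X / X."""
    return prime_count(X) * log(X) / X
```

For X = 10, 100, 1000 and 10000, the ratio lies between 1/2 and 3/2.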
(5)    Σ_{p≤X} (log p)/p = log X + O(1),
and
(6)    Σ_{p≤X} 1/p = log log X + O(1).
    Σ_{m≤X} Λ(m) ≤ Σ_{j=0}^∞ Σ_{X/2^{j+1}<m≤X/2^j} Λ(m) ≤ 2c_4 X,
so that as X → ∞, we have
    X Σ_{m≤X} Λ(m)/m = X log X + O(X).
As X → ∞, we have
    Σ_{p≤X} (log p) Σ_{2≤k≤log X/log p} 1/p^k ≤ Σ_{p≤X} (log p) Σ_{k=2}^∞ 1/p^k = Σ_{p≤X} (log p)/(p(p − 1)) ≤ Σ_{n=2}^∞ (log n)/(n(n − 1)) = O(1).
The inequality (5) follows. Finally, for every real number X ≥ 2, let
    T(X) = Σ_{p≤X} (log p)/p.
Then it follows from (5) that there exists a positive absolute constant c_6 such that |T(X) − log X| < c_6
whenever X ≥ 2. On the other hand,
    Σ_{p≤X} 1/p = Σ_{p≤X} ((log p)/p) (1/log X + ∫_p^X dy/(y log^2 y)) = T(X)/log X + ∫_2^X T(y) dy/(y log^2 y)
        = (T(X) − log X)/log X + ∫_2^X (T(y) − log y) dy/(y log^2 y) + 1 + ∫_2^X dy/(y log y).
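The bounded error terms in (5) and (6) can be observed numerically. The following Python sketch is an illustration only; the function names are our own, and the bounds 2 and 1 mentioned in the accompanying remark are convenient values for the range tested, not constants taken from the notes.

```python
from math import isqrt, log

def primes_up_to(X):
    """Return the list of primes p <= X, using the sieve of Eratosthenes."""
    n = int(X)
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def error_in_5(X):
    """The difference between the sum of (log p)/p over p <= X and log X."""
    return sum(log(p) / p for p in primes_up_to(X)) - log(X)

def error_in_6(X):
    """The difference between the sum of 1/p over p <= X and log log X."""
    return sum(1 / p for p in primes_up_to(X)) - log(log(X))
```

For X = 10, 100, 1000 and 10000, the first difference has absolute value less than 2 and the second has absolute value less than 1, while the sum of 1/p itself keeps growing.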
2. For any arithmetic function f, we define f′ to be the arithmetic function given by f′(n) = f(n) log n
for every n ∈ N. Then for the arithmetic function U defined by U(n) = 1 for every n ∈ N, we have
U′(n) = log n and U″(n) = log^2 n for every n ∈ N.
(i) Suppose that f and g are arithmetic functions.
(I) Prove that (f + g)′ = f′ + g′ and (f ∗ g)′ = (f′ ∗ g) + (f ∗ g′).
(II) Suppose that f(1) ≠ 0. By noting that (f ∗ f^{−1})′(n) = 0 for every n ∈ N, prove that
(f^{−1})′ = −f′ ∗ (f ∗ f)^{−1}.
(ii) Explain why Λ ∗ U = U′. Then establish Selberg’s identity Λ′ + (Λ ∗ Λ) = U″ ∗ µ.
3. Prove that for every real number X ≥ 2, we have ∏_{p≤X} (1 − 1/p)^{−1} > log X.
5. Suppose that
• λ_n is an increasing sequence of real numbers with limit infinity;
• c_n is an arbitrary sequence of real or complex numbers; and
• f has continuous derivative for X ≥ λ_1.
For every X ≥ λ_1, let
    C(X) = Σ_{λ_n ≤ X} c_n.
8. Show that the series Σ_{p≤X} 1/(p log p) converges as X → ∞.
Chapter 7
GAUSS SUMS AND QUADRATIC RECIPROCITY
Recall the Law of quadratic reciprocity, that if p and q are distinct odd primes, then
    (q/p)_L (p/q)_L = (−1)^{((p−1)/2)((q−1)/2)}.
There are many proofs of this – Gauss alone discovered six. Our aim here, however, is to give a second
proof of this result, a proof discovered by Dirichlet and based on ideas from Fourier series.
Throughout this chapter, we use the notation that e(y) = e^{2πiy} for every y ∈ R.
    S(q, a) = Σ_{x=1}^q e(ax^2/q)
has many interesting properties, the first of which is a multiplicative property which simplifies its evaluation
to cases when q is a prime power.
THEOREM 7A. Suppose that q_1, q_2 ∈ N satisfy (q_1, q_2) = 1. Suppose further that a ∈ Z satisfies
(a, q_1 q_2) = 1. Then
    S(q_1 q_2, a) = S(q_1, aq_2) S(q_2, aq_1).
Proof. Since (q_1, q_2) = 1, it follows from Theorem 3E that as x_1 and x_2 run through complete sets
of residues modulo q_1 and q_2 respectively, q_2 x_1 + q_1 x_2 runs through a complete set of residues modulo
q_1 q_2. Hence
    S(q_1 q_2, a) = Σ_{z=1}^{q_1 q_2} e(az^2/(q_1 q_2)) = Σ_{x_1=1}^{q_1} Σ_{x_2=1}^{q_2} e(a(q_2 x_1 + q_1 x_2)^2/(q_1 q_2)).
Note now that (q_2 x_1 + q_1 x_2)^2 ≡ q_2^2 x_1^2 + q_1^2 x_2^2 (mod q_1 q_2). It follows that
    S(q_1 q_2, a) = Σ_{x_1=1}^{q_1} Σ_{x_2=1}^{q_2} e(aq_2 x_1^2/q_1 + aq_1 x_2^2/q_2) = (Σ_{x_1=1}^{q_1} e(aq_2 x_1^2/q_1)) (Σ_{x_2=1}^{q_2} e(aq_1 x_2^2/q_2)).
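The factorisation obtained at the end of this proof, namely S(q_1 q_2, a) = S(q_1, aq_2) S(q_2, aq_1), can be checked in floating point arithmetic. The following Python sketch is an illustration only; the function names are our own, and equality is tested up to a small tolerance.

```python
import cmath
from math import gcd

def e(y):
    """The function e(y) = exp(2πiy)."""
    return cmath.exp(2j * cmath.pi * y)

def gauss_sum(q, a):
    """The sum S(q, a) of e(a x^2 / q) over x = 1, ..., q."""
    # Reducing a*x*x modulo q first does not change e(.) and keeps the argument small.
    return sum(e((a * x * x % q) / q) for x in range(1, q + 1))

def multiplicative_defect(q1, q2, a):
    """Absolute difference between S(q1*q2, a) and S(q1, a*q2) * S(q2, a*q1)."""
    assert gcd(q1, q2) == 1 and gcd(a, q1 * q2) == 1
    return abs(gauss_sum(q1 * q2, a) - gauss_sum(q1, a * q2) * gauss_sum(q2, a * q1))
```

For coprime q_1 and q_2 the defect is zero up to rounding error; for example, multiplicative_defect(4, 9, 5) is of the order of the machine precision.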
The Law of quadratic reciprocity can be deduced from Theorem 7A and the two results below.
THEOREM 7B. Suppose that p is an odd prime, and a ∈ Z satisfies (a, p) = 1. Then
(1)    S(p, a) = (a/p)_L S(p, 1).
To deduce the Law of quadratic reciprocity, note that by Theorems 7A and 7B, we have, for distinct
primes p, q ∈ N, that
    S(pq, 1) = S(p, q) S(q, p) = (q/p)_L (p/q)_L S(p, 1) S(q, 1).
Note now that the right hand side has value −1 if p ≡ q ≡ −1 (mod 4) and value 1 otherwise. The Law
of quadratic reciprocity follows.
Proof of Theorem 7B. Consider the congruence x^2 ≡ n (mod p). Clearly the number of solutions
of this congruence is given by 1 + (n/p)_L, so that
    S(p, a) = Σ_{x=1}^p e(ax^2/p) = Σ_{n=1}^p (1 + (n/p)_L) e(an/p) = Σ_{n=1}^p (n/p)_L e(an/p),
since
    Σ_{n=1}^p e(an/p) = 0.
We now make the substitution an ≡ m (mod p), and note that as n runs through a complete set of
residues modulo p, so does m. Hence, denoting by a^{−1} the natural number satisfying 1 ≤ a^{−1} < p and
aa^{−1} ≡ 1 (mod p), we have
(2)    S(p, a) = Σ_{m=1}^p (a^{−1}m/p)_L e(m/p) = (a^{−1}/p)_L Σ_{m=1}^p (m/p)_L e(m/p) = (a/p)_L Σ_{m=1}^p (m/p)_L e(m/p).
(3)    S(p, 1) = Σ_{m=1}^p (m/p)_L e(m/p).
To complete the proof of the Law of quadratic reciprocity, it remains to establish Theorem 7C,
which we shall do in Section 7.3. As the proof involves ideas concerning the convergence of Fourier
series, we shall first make a very brief study of this in the next section.
Suppose that a function f : R → C is Riemann integrable over the interval [0, 1] and is periodic with
period 1. We define the Fourier coefficient c_h, for every h ∈ Z, by
    c_h = c_h(f) = ∫_0^1 f(y) e(−hy) dy.
Our task here is to obtain sufficient conditions for the Fourier series of a given function f to converge
to f , or at least some function closely related to f . The basic theorem in this study is the Riemann-
Lebesgue lemma.
Then I(λ, f ) → 0 as λ → ∞.
Proof. Our first task is to approximate f in [a, b] by a step function. Let ε > 0 be given. For any
sufficiently large k ∈ N, there exists a dissection
of [a, b] such that the upper sum S(f, ∆k ) and the lower sum s(f, ∆k ) satisfy
Clearly f_k is a step function in, and hence Riemann integrable over, the interval [a, b]. Furthermore,
f(y) ≤ f_k(y) for all y ∈ [a, b]. It follows immediately that f_k − f is Riemann integrable over [a, b], and
that |f_k(y) − f(y)| = f_k(y) − f(y) for all y ∈ [a, b]. Hence
    |I(λ, f_k) − I(λ, f)| ≤ ∫_a^b |f_k(y) − f(y)| dy = ∫_a^b (f_k(y) − f(y)) dy = S(f, ∆_k) − ∫_a^b f(y) dy < ε.
    I(λ, f_k) = Σ_{j=1}^k ∫_{y_{j−1}}^{y_j} f_k(y_j) e^{iλy} dy = Σ_{j=1}^k f_k(y_j) (e^{iλy_j} − e^{iλy_{j−1}})/(iλ) → 0
as λ → ∞, so that |I(λ, f_k)| < ε for all sufficiently large λ. It follows that |I(λ, f)| < 2ε for all sufficiently
large λ.
THEOREM 7E. Suppose that a function f : R → C is Riemann integrable over the interval [0, 1]
and is periodic with period 1. Let y ∈ R. Suppose that the limits
    f(y±) = lim_{δ→0±} f(y + δ)
and
    f′_±(y) = lim_{δ→0±} (f(y + δ) − f(y±))/δ
are Riemann integrable over [y, y + 1/2] and [y − 1/2, y] respectively. Then
    lim_{H→∞} Σ_{h=−H}^H c_h(f) e(hy) = (f(y+) + f(y−))/2.
    S_H = Σ_{h=−H}^H c_h(f) e(hy).
Then
    S_H = ∫_{y−1/2}^{y+1/2} f(u) Σ_{h=−H}^H e(h(y − u)) du.
Note that the right hand side of (5) is continuous and Riemann integrable. It follows that SH = I1 + I2 ,
where
    I_1 = ∫_{−1/2}^0 f(y + v) (sin π(2H + 1)v / sin πv) dv
and
    I_2 = ∫_0^{1/2} f(y + v) (sin π(2H + 1)v / sin πv) dv.
By Theorem 7D, the first two integrals on the right hand side both converge to 0 as H → ∞. The last
term is equal to
    f(y−) ∫_{−1/2}^0 Σ_{h=−H}^H e(hv) dv = f(y−) (1/2 + Σ_{h=−H, h≠0}^H (1 − (−1)^h)/(2πih)) = (1/2) f(y−).
Similarly I_2 → (1/2) f(y+) as H → ∞.
    f(θ) = Σ_{n=0}^{q−1} e((n + θ)^2/q),
and note that S(q, 1) = f(0) = f(1). The function f has the Fourier series
    Σ_{h=−∞}^∞ c_h e(hy),
(6)    S(q, 1) = lim_{N→∞} Σ_{h=−N}^N c_h e(h·0) = lim_{N→∞} Σ_{h=−N}^N c_h.
(7)    Σ_{h=−2N, h even}^{2N} c_h = Σ_{h=−2N, h even}^{2N} q e(−qh^2/4) ∫_{−h/2}^{1−h/2} e(qθ^2) dθ = q Σ_{h=−2N, h even}^{2N} ∫_{−h/2}^{1−h/2} e(qθ^2) dθ
           = q ∫_{−N}^{N+1} e(qθ^2) dθ → q ∫_{−∞}^∞ e(qθ^2) dθ = q^{1/2} I,
where
    I = ∫_{−∞}^∞ e(θ^2) dθ.
If h is odd, then h^2 ≡ 1 (mod 4), and so qh^2 ≡ q (mod 4). It follows that
(8)    Σ_{h=−2N, h odd}^{2N} c_h = Σ_{h=−2N, h odd}^{2N} q e(−qh^2/4) ∫_{−h/2}^{1−h/2} e(qθ^2) dθ = q e(−q/4) Σ_{h=−2N, h odd}^{2N} ∫_{−h/2}^{1−h/2} e(qθ^2) dθ
           = q e(−q/4) ∫_{−N+1/2}^{N+1/2} e(qθ^2) dθ → q^{1/2} e(−q/4) I as N → ∞.
    S(q, 1) = q^{1/2} (1 + e(−q/4))/(1 − i).
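The closed formula for S(q, 1) can be compared with the defining sum for small q. The following Python sketch is an illustration only; the function names are our own, and the comparison is made in floating point arithmetic up to a small tolerance.

```python
import cmath
from math import sqrt

def e(y):
    """The function e(y) = exp(2πiy)."""
    return cmath.exp(2j * cmath.pi * y)

def gauss_sum(q, a=1):
    """The sum S(q, a) of e(a x^2 / q) over x = 1, ..., q."""
    return sum(e((a * x * x % q) / q) for x in range(1, q + 1))

def closed_form(q):
    """The value q^(1/2) (1 + e(-q/4)) / (1 - i)."""
    # e has period 1, so e(-q/4) depends only on q modulo 4.
    return sqrt(q) * (1 + e(-(q % 4) / 4)) / (1 - 1j)
```

In particular, the formula gives q^{1/2} when q ≡ 1 (mod 4), 0 when q ≡ 2 (mod 4), i q^{1/2} when q ≡ 3 (mod 4) and (1 + i) q^{1/2} when q ≡ 0 (mod 4), and the direct sum agrees with these values.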
p. 29: Exercise 2.18. Both instances of “[±1]_n” in the hint should be replaced by “[±1]_p.”
Actually, it would be better if the hint consisted of just the reference to Exercise 2.5,
and the exercise itself were moved to §2.2.
[VS, 10/12/2005]
p. 73: Last paragraph. Replace the second sentence by:
Using the theory of continued fractions, Theorem 4.6 can be improved as
follows: if n > 2r∗t∗ (rather than n ≥ 4r∗t∗), then statement (ii) of the
theorem holds when r′ is chosen as the first remainder r_i ≤ r∗, and s′ :=
s_i, t′ := t_i. This fact was observed by Wang, Guy, and Davenport [97].
p. 89: Line 15. “(p_1^{e_1} · · · p_r^{e_r})^s” should be “(p_1^{e_1} · · · p_r^{e_r})^{−s}.”
[VS, 5/17/2005]
p. 98: Exercise 6.1(a). Clarification: n is chosen at random from the set {2^{k−1}, ..., 2^k − 1}.
[VS, 9/15/2005]
p. 99: Exercises 6.3 and 6.5. It may be better to add the hint: use induction on n.
[VS, 10/11/2005]
p. 107: Theorem 6.4. Add “where the distribution of each πi is Di .”
Also, the notation “D1 × · · · × Dn ” was never actually defined.
[VS, 9/17/2005]
p. 130: Line −5. “measure” should be “measure of.”
[VS, 5/18/2005]
p. 122: Line −9. “modulo n” should be “over Zn .”
[VS, 8/24/2005]
p. 143: Second line in proof of Theorem 6.23. “Σ_{i=1}^{k′} a_i” should be “Σ_{i=1}^{k′} Pr[a_i].”
[VS, 10/5/2005]
p. 146: Last paragraph of §6.10.4. Rewrite as:
The definition of conditional expectation carries over verbatim. Equations (6.15) and
(6.16) hold (assuming the relevant expectations exist). Also, the analog of (6.16) holds
for infinite partitions B1 , B2 , . . . , provided E[X] exists.
[VS, 7/28/2005]
p. 171: Line −13. Insert “using” before “algorithm.”
[VS, 6/13/2005]
p. 194: Line 6 of Example 8.36. Replace both instances of “a” by “z” (not a typo, but it
would read better).
[VS, 5/20/2005]
p. 210: Lemma 8.46. “such that m_i | m_{i+1}” should read “such that m_i | m_{i+1} and
n_i | n_{i+1}.”
[VS, 3/30/2005]
p. 229: Equation (9.3). “Yj ” should be “Yj .”
[VS, 4/23/2005]
p. 230: Line 4. Add parenthetical remark just before semi-colon: “(with distinct exponent
sequences among the monomials).”
[VS, 4/23/2005]
p. 230: Line 7. “degree” should be “total degree.”
[VS, 4/23/2005]
p. 231: Line 3 of Example 9.32. “polynomial” should be “polynomials.”
[VS, 5/27/2005]
p. 271: Line −6. Before the semi-colon, insert: “with which we can perform both table
insertions and lookups in time O(len(p)).”
[VS, 9/22/2005]
p. 272: Line 16. Insert “of” after “table.”
[VS, 9/22/2005]
p. 279: Exercise 11.14. It might be better to ask the reader to give a rigorous proof,
assuming Conjecture 5.24, and assuming p is a random prime chosen between 3 and
Z with Z ≥ Y 3 .
[VS, 8/27/2005]
p. 331: Line −2. Replace “0_V” by “0_W.”
[VS, 9/11/2005]
p. 332: Lines 4 and 7. Replace “0_V” by “0_W.”
[VS, 9/11/2005]
p. 332: Line 7. Replace “equivalent to saying” by “implied by the condition.”
For the other direction, all one can say is that if W contains a non-zero self-orthogonal
vector, then there exists a subspace U of W such that U ∩ Ū ≠ {0_W}.
[Ronald Cramer, 9/11/2005]
p. 441: 2nd to last para. Delete “It is easy to see that.”
[VS, 4/19/2005]
p. 477: Last paragraph. The claim that the running-time bound in Theorem 21.8 is tight
is incorrect (and in particular, Exercise 21.10 should be deleted).
In fact, assuming the gcd operation is implemented using Euclid’s algorithm, Algorithm
SFD uses O(ℓ^2 + ℓ(w − 1) len(p)/p) operations in F. This follows from the
fact that on inputs a, b ∈ F[X] with deg(a) ≥ deg(b) ≥ 0, Euclid’s algorithm uses only
O(len(b) len(a/d)) operations in F, where d := gcd(a, b) (this could be made an exercise
for both the integer and polynomial cases). Combining this fact with Exercise 21.24
will yield (with a careful counting argument) the better ℓ^2 bound for SFD,
instead of the more naive ℓ^3 bound. The algorithms in §21.6 are still useful, as the
output of these algorithms is in a nicer form.
[VS, 9/25/2005]
p. 482: Exercise 21.11. Add the following: “Assume that computing M_1(β) for β ∈
F[X]/(h) takes Ω(deg(h)^2 len(q)) operations in F.”
[VS, 9/26/2005]