
ELEMENTARY NUMBER THEORY

W W L CHEN


© W W L Chen, 1981, 2003.
This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990.
It is available free to all individuals, on the understanding that it is not to be used for financial gains,
and may be downloaded and/or photocopied, with or without permission from the author.
However, this document may not be kept on any information storage and retrieval system without permission
from the author, unless such system is not accessible to any individuals other than its owners.

Chapter 1
DIVISION AND FACTORIZATION

1.1. Division

Suppose that a, b ∈ Z and a ≠ 0. Then we say that a divides b, denoted by a | b, if there exists c ∈ Z
such that b = ac. In this case, we also say that a is a divisor of b, or that b is a multiple of a.

THEOREM 1A. Suppose that a ∈ N and b ∈ Z. Then there exist unique q, r ∈ Z such that b = aq + r
and 0 ≤ r < a.

Proof. We shall first of all show the existence of such numbers q, r ∈ Z. Consider the set

S = {b − as ≥ 0 : s ∈ Z}.

Then it is easy to see that S is a non-empty subset of N ∪ {0}. It follows from the Principle of induction
that S has a smallest element. Let r be the smallest element of S, and let q ∈ Z such that b − aq = r.
Clearly r ≥ 0, so it remains to show that r < a. Suppose on the contrary that r ≥ a. Then b − a(q + 1) =
(b − aq) − a = r − a ≥ 0, so that b − a(q + 1) ∈ S. Clearly b − a(q + 1) < r, contradicting that r is the
smallest element of S.

Next we show that such numbers q, r ∈ Z are unique. Suppose that

b = aq1 + r1 = aq2 + r2 , where 0 ≤ r1 < a and 0 ≤ r2 < a.

Then

|r1 − r2 | = a|q2 − q1 |.

If q1 ≠ q2 , then it is easy to see that a|q2 − q1 | ≥ a, while |r1 − r2 | < a, a contradiction. It follows that
q1 = q2 , and so r1 = r2 also. 
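
The quotient and remainder of Theorem 1A can be computed directly. The following is a minimal Python sketch; the function name `divide` is an illustrative choice and not part of the notes. It relies on the fact that Python's floor division rounds towards minus infinity, so the remainder lies in the range 0 ≤ r < a even when b is negative.

```python
def divide(b: int, a: int) -> tuple[int, int]:
    """Return (q, r) with b = a*q + r and 0 <= r < a, where a is a natural number."""
    if a <= 0:
        raise ValueError("a must be a natural number")
    q = b // a         # floor division rounds towards minus infinity
    r = b - a * q      # hence 0 <= r < a, also when b is negative
    return q, r


print(divide(17, 5))    # b = 17, a = 5
print(divide(-17, 5))   # b = -17, a = 5
```

Note that for b = −17 and a = 5 the pair is q = −4, r = 3, and not q = −3, r = −2, since the remainder must be non-negative.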

We next establish the existence of the greatest common divisor.

THEOREM 1B. Suppose that a, b ∈ N. Then there exists a unique d ∈ N such that
(i) there exist x, y ∈ Z such that d = ax + by;
(ii) d | a and d | b; and
(iii) for every k ∈ N such that k | a and k | b, we have k | d.

Proof. Consider the set

I = {au + bv : u, v ∈ Z}.

Then it is easy to see that I is a non-empty subset of Z which contains some positive integers. It follows
from the Principle of induction that I has a least positive element. Let d be the least positive element
of I, and let x, y ∈ Z such that d = ax + by. The conclusion (i) follows trivially. Also, d is uniquely
defined.

Next, we shall show that d divides every integer in I. Suppose that z = au + bv is any given integer
in I. By Theorem 1A, there exist q, r ∈ Z such that z = dq + r, where 0 ≤ r < d. Then

r = z − dq = a(u − xq) + b(v − yq) ∈ I.

If r ≠ 0, then the requirement 0 < r < d contradicts the minimality of d. Hence r = 0, so that z = dq,
whence d divides z.

Taking u = 1 and v = 0 gives d | a. Taking u = 0 and v = 1 gives d | b. Also, the conclusion (iii) is
a simple consequence of (i). 

The number d in Theorem 1B is called the greatest common divisor of a and b, and denoted by
d = (a, b). Two numbers a, b ∈ N are said to be relatively prime, or coprime, if (a, b) = 1.

A practical way of finding the greatest common divisor of two natural numbers is given by the
following result.

THEOREM 1C. (EUCLID’S ALGORITHM) Suppose that a, b ∈ N, and a < b. Suppose further
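that $q_1, \ldots, q_{n+1} \in \mathbb{Z}$ and $r_1, \ldots, r_n \in \mathbb{N}$ satisfy $0 < r_n < r_{n-1} < \ldots < r_1 < a$ and

$$
\begin{aligned}
b &= a q_1 + r_1, \\
a &= r_1 q_2 + r_2, \\
r_1 &= r_2 q_3 + r_3, \\
&\;\;\vdots \\
r_{n-2} &= r_{n-1} q_n + r_n, \\
r_{n-1} &= r_n q_{n+1}.
\end{aligned}
$$

Then $(a, b) = r_n$.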

Proof. We shall first of all prove that

(1) (a, b) = (a, r1 ).

Note that we have (a, b) | a and (a, b) | (b − aq1 ) = r1 , and so

(a, b) | (a, r1 ).

On the other hand, we have (a, r1 ) | a and (a, r1 ) | (aq1 + r1 ) = b, and so

(a, r1 ) | (a, b).



Equality (1) follows. Similarly

(2) (a, r1 ) = (r1 , r2 ) = (r2 , r3 ) = . . . = (rn−1 , rn ).

Note now that

(3) (rn−1 , rn ) = (rn qn+1 , rn ) = rn .

The result follows on combining (1)–(3). 

Example. Consider (589, 5111). In our notation, we let a = 589 and b = 5111. Then we have

5111 = 589 · 8 + 399,


589 = 399 · 1 + 190,
399 = 190 · 2 + 19,
190 = 19 · 10.

It follows that (589, 5111) = 19. On the other hand,

19 = 399 − 190 · 2
= 399 − (589 − 399 · 1) · 2
= 589 · (−2) + 399 · 3
= 589 · (−2) + (5111 − 589 · 8) · 3
= 5111 · 3 + 589 · (−26).

It follows that x = −26 and y = 3 satisfy 589x + 5111y = (589, 5111).
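
The back-substitution in the example can be carried out alongside the divisions. The following Python sketch is one common way to organise this (the iterative extended Euclidean algorithm); the name `extended_gcd` and the variable names are illustrative assumptions, not notation from the notes.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and d = a*x + b*y, for natural numbers a, b."""
    # Invariants: old_r = a*old_x + b*old_y and r = a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r      # one division step of Euclid's algorithm
        old_x, x = x, old_x - q * x      # keep the coefficients in step
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


d, x, y = extended_gcd(589, 5111)
print(d, x, y)   # d = (589, 5111), with 589*x + 5111*y = d
```

For a = 589 and b = 5111 this reproduces the values x = −26 and y = 3 found above.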

A very useful result concerning divisors is the following.

THEOREM 1D. Suppose that a, b ∈ N and (a, b) = 1. Suppose further that w ∈ N satisfies w | ab.
Then there exist unique u, v ∈ N such that u | a, v | b and w = uv.

Proof. We shall first of all show that u = (w, a) and v = (w, b) satisfy the requirements. Consider
the number (w, a)(w, b). By Theorem 1B, there exist x1 , y1 , x2 , y2 ∈ Z such that (w, a) = wx1 + ay1 and
(w, b) = wx2 + by2 , so that

(w, a)(w, b) = (wx1 + ay1 )(wx2 + by2 ) = w(wx1 x2 + ay1 x2 + bx1 y2 ) + aby1 y2 .

It follows that

(4) w | (w, a)(w, b).

On the other hand, since (a, b) = 1, it follows from Theorem 1B that there exist x, y ∈ Z such that
ax + by = 1, so that w = wax + wby. Note now that (w, a) | a and (w, b) | w, so that (w, a)(w, b) | wax.
Note also that (w, a) | w and (w, b) | b, so that (w, a)(w, b) | wby. It follows that

(5) (w, a)(w, b) | w.

Combining (4) and (5), and noting that w, (w, a), (w, b) ∈ N, we conclude that w = (w, a)(w, b).

To show uniqueness, it suffices to show that if u, v ∈ N satisfy u | a, v | b and w = uv, then
u = (w, a) and v = (w, b). Since u | w and u | a, it follows from Theorem 1B that u | (w, a). Similarly
v | (w, b). Suppose on the contrary that u ≠ (w, a). Then u < (w, a), so that w = uv < (w, a)(w, b) = w,
a contradiction. A similar contradiction arises if v ≠ (w, b).

1.2. Factorization

Suppose that a ∈ N and a > 1. Then we say that a is prime if it has exactly two positive divisors,
namely 1 and a. We also say that a is composite if it is not prime. It is convenient to treat the integer
1 as neither prime nor composite. To find a good reason for not including 1 as a prime, see the Remark
following Theorem 1G.

Throughout these notes, the symbol p, with or without suffices, denotes a (positive) prime.

THEOREM 1E. Suppose that a, b ∈ Z, and p is a prime. If p | ab, then p | a or p | b.

Proof. Suppose that p ∤ a. Since p is prime, the only positive divisors of p are 1 and p. Hence we
must have (a, p) = 1. It follows from Theorem 1B that there exist x, y ∈ Z such that 1 = ax + py, so
that b = abx + pby. Clearly p | b. 

Using Theorem 1E a finite number of times, we have the following generalization.

THEOREM 1F. Suppose that a1 , . . . , ak ∈ Z, and p is a prime. If p | a1 . . . ak , then p | aj for some
j = 1, . . . , k.

THEOREM 1G. (FUNDAMENTAL THEOREM OF ARITHMETIC) Every natural number n > 1
is representable as a product of primes, uniquely up to the order of factors.

Remark. If the integer 1 were included as a prime, then we would have to rephrase the statement of
the Fundamental theorem of arithmetic to allow for different representations like 6 = 2 · 3 = 1 · 2 · 3.

Proof of Theorem 1G. We shall first of all show by induction that every integer n ≥ 2 is representable
as a product of primes. Clearly 2 is a product of primes. Assume now that n > 2 and that
every m ∈ N satisfying 2 ≤ m < n is representable as a product of primes. If n is a prime, then it is
obviously representable as a product of primes. If n is not a prime, then there exist n1 , n2 ∈ N satisfying
2 ≤ n1 < n and 2 ≤ n2 < n such that n = n1 n2 . By our induction hypothesis, both n1 and n2 are
representable as products of primes, so that n must also be representable as a product of primes.

Next we shall show uniqueness. Suppose that

(6) n = p1 . . . pr = p′1 . . . p′s ,

where p1 ≤ . . . ≤ pr and p′1 ≤ . . . ≤ p′s are primes. Now p1 | p′1 . . . p′s , so it follows from Theorem 1F
that p1 | p′j for some j = 1, . . . , s. Since p1 and p′j are both primes, we must then have p1 = p′j . On the
other hand, p′1 | p1 . . . pr , so again it follows from Theorem 1F that p′1 | pi for some i = 1, . . . , r, so again
we must have p′1 = pi . It now follows that p1 = p′j ≥ p′1 = pi ≥ p1 , so that p1 = p′1 . Removing the factor
p1 = p′1 from (6), we obtain

p2 . . . pr = p′2 . . . p′s .

Repeating this argument a finite number of times, we conclude that r = s and pi = p′i for every
i = 1, . . . , r.

Grouping together equal primes, we can reformulate Theorem 1G as follows.

THEOREM 1H. Every natural number n > 1 is representable uniquely in the form

$$n = p_1^{m_1} \cdots p_r^{m_r}, \tag{7}$$

where p1 < . . . < pr are primes, and where mj ∈ N for every j = 1, . . . , r.

The representation (7) is called the canonical decomposition of the natural number n.
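
For small n, the canonical decomposition can be found by trial division. The following Python sketch is only one simple way to do this and is not taken from the notes; the function name `canonical_decomposition` is an illustrative assumption. It returns the pairs (p_j, m_j) with the primes in increasing order.

```python
def canonical_decomposition(n: int) -> list[tuple[int, int]]:
    """Return [(p_1, m_1), ..., (p_r, m_r)] with n = p_1^m_1 ... p_r^m_r and p_1 < ... < p_r."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            m = 0
            while n % p == 0:    # divide out p completely and count the exponent
                n //= p
                m += 1
            factors.append((p, m))
        p += 1
    if n > 1:                    # whatever remains has no divisor up to its square root, so it is prime
        factors.append((n, 1))
    return factors


print(canonical_decomposition(360))
print(canonical_decomposition(589))
```

Any composite p reached by the loop has already had its prime factors divided out of n, so only primes are ever recorded.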

1.3. Some Elementary Properties of Primes

There are many consequences of the Fundamental theorem of arithmetic. The following is one which
concerns primes.

THEOREM 1J. (EUCLID) There are infinitely many primes.

Proof. Suppose on the contrary that p1 < . . . < pr are all the primes. Let

n = p1 . . . pr + 1.

Then n ∈ N and n > 1. It follows from the Fundamental theorem of arithmetic that pj | n for some
j = 1, . . . , r, so that pj | (n − p1 . . . pr ) = 1, a contradiction. 

Let p be a prime. For any given n ∈ N, it is an interesting problem to find the largest integer k
such that $p^k \mid n!$. In order to describe the answer to this question, we need to define one of the most
useful functions in number theory.

Suppose that α ∈ R. The number [α] ∈ Z is defined to be the unique integer m ∈ Z satisfying
m ≤ α < m + 1. We call [α] the integer part of α.

Examples. We have [π] = 3, [5] = 5 and [−π] = −4.

The integer part function has many interesting properties. The proof of the following results is left
as an exercise.

Remarks. Suppose that α, β ∈ R.

(i) We have α − 1 < [α] ≤ α and 0 ≤ α − [α] < 1.

(ii) If α ≥ 0, then [α] counts the number of natural numbers not exceeding α. In other words,

$$[\alpha] = \sum_{1 \le n \le \alpha} 1.$$

(iii) For every n ∈ Z, we have [α + n] = [α] + n.

(iv) We have [α] + [β] ≤ [α + β] ≤ [α] + [β] + 1.

(v) If α ∈ Z, then [α] + [−α] = 0. If α ∉ Z, then [α] + [−α] = −1.

(vi) The number −[−α] is the smallest integer not less than α.

(vii) If n ∈ N, then [[α]/n] = [α/n].

(viii) The number [α+1/2] is one of the two nearest integers to α. Furthermore, if these two integers
both differ from α by the same value, then [α + 1/2] is the larger of these two integers.

(ix) If α > 0 and n ∈ N, then [α/n] is the number of positive integers not exceeding α and which
are multiples of n.
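
Some of these remarks are easy to check numerically. The following Python sketch is an illustration only: it assumes that the integer part [α] is computed by `math.floor`, and the sample values α = 47/3 and n = 4 are arbitrary choices. Exact rational arithmetic is used to avoid rounding issues.

```python
import math
from fractions import Fraction


def integer_part(alpha) -> int:
    """Return [alpha], the unique integer m with m <= alpha < m + 1."""
    return math.floor(alpha)


alpha = Fraction(47, 3)   # an arbitrary sample value, 15.666...
n = 4

# Remark (vi): -[-alpha] is the smallest integer not less than alpha.
smallest_not_less = -integer_part(-alpha)

# Remark (vii): [[alpha]/n] = [alpha/n].
lhs = integer_part(Fraction(integer_part(alpha), n))
rhs = integer_part(alpha / n)

# Remark (ix): [alpha/n] counts the positive multiples of n not exceeding alpha.
multiples = [m for m in range(1, integer_part(alpha) + 1) if m % n == 0]

print(smallest_not_less, lhs, rhs, multiples)
```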

THEOREM 1K. Suppose that n ∈ N and p is a prime. Then the largest integer k such that $p^k \mid n!$
is given by

$$k = \sum_{j=1}^{\infty} \left[ \frac{n}{p^j} \right].$$

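Proof. Suppose that m ∈ N and 1 ≤ m ≤ n. If $p^r \mid m$ and $p^{r+1} \nmid m$, we want to count a contribution
of r. In other words, we count a contribution of 1 for every j ∈ N such that $p^j \mid m$. Hence

$$k = \sum_{m=1}^{n} \sum_{\substack{j=1 \\ p^j \mid m}}^{\infty} 1 = \sum_{j=1}^{\infty} \sum_{\substack{m=1 \\ p^j \mid m}}^{n} 1 = \sum_{j=1}^{\infty} \left[ \frac{n}{p^j} \right],$$

in view of Remark (ix) above.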

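If m ∈ N and p is prime, we sometimes write $p^r \parallel m$ if $p^r \mid m$ and $p^{r+1} \nmid m$.

Example. Suppose that $3^k \parallel 150!$. Then

$$
\begin{aligned}
k &= \left[ \frac{150}{3} \right] + \left[ \frac{150}{3^2} \right] + \left[ \frac{150}{3^3} \right] + \left[ \frac{150}{3^4} \right] + \left[ \frac{150}{3^5} \right] + \ldots \\
&= 50 + 16 + 5 + 1 + 0 + \ldots \\
&= 72.
\end{aligned}
$$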
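
The sum in Theorem 1K has only finitely many non-zero terms, so it is straightforward to evaluate. The following Python sketch does this; the function name `prime_power_in_factorial` is an illustrative assumption.

```python
def prime_power_in_factorial(n: int, p: int) -> int:
    """Return the largest k such that p^k divides n!, where p is a prime."""
    k = 0
    power = p
    while power <= n:        # the terms [n / p^j] vanish once p^j > n
        k += n // power      # the integer part [n / p^j]
        power *= p
    return k


print(prime_power_in_factorial(150, 3))
```

For n = 150 and p = 3 this returns 72, in agreement with the example above.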

1.4. Some Results and Problems Concerning Primes

Given that there are infinitely many primes, a natural question that arises is to determine the number
π(X) of primes that do not exceed a given real number X. This was the subject of much investigation
in the 1800’s. For example, Legendre proposed in 1808 that there is a constant A such that for large
values of X, the number π(X) can be approximated by
$$\frac{X}{\log X - A}. \tag{8}$$

Gauss proposed the function

$$\frac{1}{\log x}$$

as an approximation to the average density of distribution of primes near any large real number x, and
thus formulated the function

$$\int_2^X \frac{dx}{\log x} \tag{9}$$

as an approximation to π(X). Note that the dominating term in the integral is

$$\frac{X}{\log X}, \tag{10}$$

so perhaps

$$\lim_{X \to \infty} \frac{\pi(X) \log X}{X} = 1. \tag{11}$$

Indeed, Tchebycheff showed in 1848 that if the limit in (11) exists at all, then it must be equal to 1.
Unfortunately, he and others were unable to show that the limit exists. Then in 1850, he showed that
there exist positive constants $c_1$ and $c_2$ such that for every real number X ≥ 2, we have

$$c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}.$$

This confirms that the function (10) at least represents the correct order of magnitude of π(X). We
shall prove Tchebycheff’s theorem in Chapter 6.

The crucial idea that finally led to the proof of (11) was introduced by Riemann in a monumental
contribution in 1860. Riemann observed that the series
$$\sum_{n=1}^{\infty} \frac{1}{n^s} \tag{12}$$

plays a crucial role in the study of the distribution of primes if one treats s as a complex variable. It
follows that the distribution of primes can be studied by the use of methods in the theory of analytic
functions. Riemann denoted the series (12) by ζ(s), and the function has since been known as the
Riemann zeta function. Indeed, Riemann’s work has also influenced greatly the development of the
general theory of functions.

Riemann’s ideas were studied in great depth in the late 1800’s by von Mangoldt and Hadamard.
This culminated in the proof of (11) in 1896 by Hadamard and de la Vallée Poussin, independently and
almost simultaneously. In particular, the work of de la Vallée Poussin showed that the integral (9) is a
better approximation to π(X) than the function (8) for any value of the constant A.

The result (11) is known nowadays as the Prime number theorem.
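
The slow convergence of the ratio in (11) can be observed numerically. The following Python sketch counts primes with a sieve of Eratosthenes, a standard technique that is not discussed in the notes; the function name `prime_count` and the chosen values of X are illustrative assumptions.

```python
import math


def prime_count(limit: int) -> int:
    """Return pi(limit), the number of primes not exceeding limit, using a sieve."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # cross out the multiples p*p, p*p + p, ... of the prime p
            is_prime[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(is_prime)


for X in (10**3, 10**4, 10**5, 10**6):
    count = prime_count(X)
    print(X, count, count * math.log(X) / X)   # the last column is the ratio in (11)
```

The ratio stays above 1 in this range and decreases only slowly, which is consistent with (11) but of course proves nothing.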

Problems for Chapter 1

1. Suppose that a, b, c ∈ N. Prove each of the following:


(i) If a | b and b | c, then a | c.
(ii) If a | b and a | c, then a | (bx + cy) for every x, y ∈ Z.

2. Prove that if n ∈ N is composite, then it has a prime factor not exceeding $\sqrt{n}$.

3. Prove that $n^4 + 4$ is composite for every natural number n > 1.

4. Prove that the three natural numbers n, n + 2, n + 4 cannot be simultaneously prime unless n = 3.

5. Suppose that p > 3 is a prime.
(i) Explain why p = 6k + 1 or p = 6k − 1 for some k ∈ N.
(ii) Use this to show that $24 \mid (p^2 - 1)$.

6. Prove that $24 \mid n(n^2 - 1)$ for every odd n ∈ N.

7. Prove that for every natural number n > 2, at least one of $2^n - 1$ and $2^n + 1$ is composite.

8. Suppose that a, b, c ∈ N.
(i) Prove that if $3 \mid (a^2 + b^2)$, then 3 | ab.
(ii) Prove that if $9 \mid (a^3 + b^3 + c^3)$, then 3 | abc.

9. Suppose that m, n ∈ N.
(i) Prove that n! | (m + 1)(m + 2) . . . (m + n).
(ii) Prove that $6 \mid (n^3 - n)$ and $120 \mid (n^5 - 5n^3 + 4n)$.
(iii) Prove that $\dfrac{(3m)!\,(4n)!}{(m!)^3 (n!)^4} \in \mathbb{N}$.

10. Suppose that n1 , . . . , nk ∈ N. Prove that n1 ! . . . nk ! divides (n1 + . . . + nk )!.



11. Suppose that p is a prime.
(i) Prove that the binomial coefficient $\binom{p}{k}$ is divisible by p for every k = 1, 2, . . . , p − 1.
(ii) Prove that $2^p - 2$ is divisible by p.

12. Suppose that n ∈ N and $2^n + 1$ is prime. Prove that n is a power of 2.

13. The Fermat numbers $F_n$ are defined by $F_n = 2^{2^n} + 1$ for every non-negative integer n.
(i) Prove that for every k ∈ N, we have $F_n \mid (F_{n+k} - 2)$.
(ii) Deduce that the Fermat numbers are pairwise coprime.
(iii) Explain why this implies that there are infinitely many primes.

14. A rational number a/b with (a, b) = 1 is called a reduced fraction. If a sum of two reduced fractions
is an integer, say (a/b) + (c/d) ∈ Z, prove that |b| = |d|.

15. Suppose that a, b, c, d ∈ N. Prove each of the following without using prime factorizations:
(i) If a | bc and (a, b) = 1, then a | c.
(ii) (a, b) = d if and only if (a/d, b/d) = 1.
(iii) (ac, bc) = c(a, b).
(iv) (a, bc) = (a, (a, b)c).
(v) $(a^2, b^2) = (a, b)^2$.

16. Suppose that a, b, c, d, x, y ∈ N, where ad − bc = ±1. Prove that if m = ax + by and n = cx + dy,
then (m, n) = (x, y).

17. Suppose that a, b ∈ N.


(i) Prove that there exists a unique m ∈ N such that
• a | m and b | m; and
• if x ∈ N satisfies a | x and b | x, then m | x.
(ii) We write m = [a, b] and call it the least common multiple of a and b. Describe [a, b] in terms
of canonical decompositions.

18. Suppose that a, b, c ∈ N.


(i) Describe the greatest common divisor (a, b, c) and the least common multiple [a, b, c] in terms
of canonical decompositions.
(ii) Show that $\dfrac{[a, b, c]^2}{[a, b][a, c][b, c]} = \dfrac{(a, b, c)^2}{(a, b)(a, c)(b, c)}$.

19. Suppose that a, m, n ∈ N and a ≠ 1. Prove that $(a^m - 1, a^n - 1) = a^{(m,n)} - 1$.

20. Find (589, 5111). Find also integers x and y such that (589, 5111) = 589x + 5111y. Hence give the
general solution of this equation in integers x and y.

21. Prove that there are infinitely many primes of the form 4n − 1.

22. Prove that $\left[ \dfrac{m+n}{2} \right] + \left[ \dfrac{n-m+1}{2} \right] = n$ for every m, n ∈ Z.

23. Suppose that a, b ∈ N satisfy (a, b) = 1. Prove that

$$\sum_{k=1}^{a-1} \left[ \frac{kb}{a} \right] = \sum_{\ell=1}^{b-1} \left[ \frac{\ell a}{b} \right] = \frac{(a-1)(b-1)}{2}$$

by considering the open rectangle with vertices at (0, 0), (a, 0), (0, b) and (a, b), split into two halves
by the line segment ay = bx where 0 < x < a. Show first that there are no lattice points $(m, n) \in \mathbb{N}^2$
on the line segment between the endpoints (0, 0) and (a, b). Then count lattice points $(m, n) \in \mathbb{N}^2$
in one of the two open triangular regions.

24. Suppose that n ∈ N, and that α ∈ R is non-negative. Prove Hermite’s identity, that

$$\sum_{k=0}^{n-1} \left[ \alpha + \frac{k}{n} \right] = [n\alpha].$$

25. Suppose that n ∈ N. Prove that

$$\sum_{k=0}^{\infty} \left[ \frac{n + 2^k}{2^{k+1}} \right] = n$$

by using the binary digit expansion

$$n = \sum_{m=0}^{\infty} a_m 2^m, \quad \text{where } a_0, a_1, a_2, \ldots \in \{0, 1\},$$

noting that $a_m \ne 0$ for only finitely many non-negative integers m.

26. Prove that $2^n \mid (n + 1)(n + 2) \cdots (2n)$ for every n ∈ N.

27. With how many zeros does the decimal digit expansion of 2003! end?

28. Find the largest two-digit prime factor of $\binom{200}{100}$.

29. Suppose n ∈ N, and that $p \ge \sqrt{2n}$ is a prime such that p divides $\binom{2n}{n}$. Prove that $p^2$ does not
divide $\binom{2n}{n}$.

Chapter 2
ARITHMETIC FUNCTIONS

2.1. Introduction

By an arithmetic function, we mean a function of the form f : N → C. We say that an arithmetic
function f : N → C is multiplicative if f (mn) = f (m)f (n) whenever m, n ∈ N and (m, n) = 1.

Example. The function U : N → C, defined by U (n) = 1 for every n ∈ N, is an arithmetic function.
Furthermore, it is multiplicative.

THEOREM 2A. Suppose that the function f : N → C is multiplicative. Then the function g : N → C,
defined by

$$g(n) = \sum_{m \mid n} f(m)$$

for every n ∈ N, is multiplicative.

Here the summation $\sum_{m \mid n}$ denotes a sum over all positive divisors m of n.

Proof of Theorem 2A. Suppose that a, b ∈ N and (a, b) = 1. If u is a positive divisor of a and v
is a positive divisor of b, then clearly uv is a positive divisor of ab. On the other hand, by Theorem
1D, every positive divisor m of ab can be expressed uniquely in the form m = uv, where u is a positive
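divisor of a and v is a positive divisor of b. Clearly (u, v) = 1. It follows that

$$g(ab) = \sum_{m \mid ab} f(m) = \sum_{u \mid a} \sum_{v \mid b} f(uv) = \sum_{u \mid a} \sum_{v \mid b} f(u) f(v) = \Biggl( \sum_{u \mid a} f(u) \Biggr) \Biggl( \sum_{v \mid b} f(v) \Biggr) = g(a) g(b).$$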

This completes the proof. 



2.2. The Divisor Function

We define the divisor function d : N → C by writing

$$d(n) = \sum_{m \mid n} 1 \tag{1}$$

for every n ∈ N. Here the sum is taken over all positive divisors m of n. In other words, the value
d(n) denotes the number of positive divisors of the natural number n. On the other hand, we define the
function σ : N → C by writing

$$\sigma(n) = \sum_{m \mid n} m \tag{2}$$

for every n ∈ N. Clearly, the value σ(n) denotes the sum of all the positive divisors of the natural
number n.

THEOREM 2B. Suppose that n ∈ N and that $n = p_1^{u_1} \cdots p_r^{u_r}$ is the canonical decomposition of n.
Then

$$d(n) = (1 + u_1) \cdots (1 + u_r) \quad \text{and} \quad \sigma(n) = \frac{p_1^{u_1+1} - 1}{p_1 - 1} \cdots \frac{p_r^{u_r+1} - 1}{p_r - 1}.$$

Proof. Every positive divisor m of n is of the form $m = p_1^{v_1} \cdots p_r^{v_r}$, where for every j = 1, . . . , r, the
integer $v_j$ satisfies $0 \le v_j \le u_j$. It follows from (1) that d(n) is the number of choices for the r-tuple
$(v_1, \ldots, v_r)$. Hence

$$d(n) = \sum_{v_1=0}^{u_1} \cdots \sum_{v_r=0}^{u_r} 1 = (1 + u_1) \cdots (1 + u_r).$$

On the other hand, it follows from (2) that

$$\sigma(n) = \sum_{v_1=0}^{u_1} \cdots \sum_{v_r=0}^{u_r} p_1^{v_1} \cdots p_r^{v_r} = \Biggl( \sum_{v_1=0}^{u_1} p_1^{v_1} \Biggr) \cdots \Biggl( \sum_{v_r=0}^{u_r} p_r^{v_r} \Biggr).$$

Note now that for every j = 1, . . . , r, we have

$$\sum_{v_j=0}^{u_j} p_j^{v_j} = 1 + p_j + p_j^2 + \ldots + p_j^{u_j} = \frac{p_j^{u_j+1} - 1}{p_j - 1}.$$

The second assertion follows.
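
The two formulas of Theorem 2B can be compared with a direct enumeration of divisors. The following Python sketch is an illustration under the assumption that n is small enough for trial division; the names `factorize`, `d_and_sigma` and `d_and_sigma_brute` are hypothetical helpers, not notation from the notes.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {p_j: u_j} for the canonical decomposition of n (empty for n = 1)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d_and_sigma(n: int) -> tuple[int, int]:
    """Return (d(n), sigma(n)) using the product formulas of Theorem 2B."""
    d, sigma = 1, 1
    for p, u in factorize(n).items():
        d *= 1 + u
        sigma *= (p ** (u + 1) - 1) // (p - 1)   # 1 + p + ... + p^u
    return d, sigma


def d_and_sigma_brute(n: int) -> tuple[int, int]:
    """Return (d(n), sigma(n)) by listing all positive divisors of n."""
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    return len(divisors), sum(divisors)


print(d_and_sigma(28), d_and_sigma_brute(28))
```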

The result below is a simple deduction from Theorem 2B.

THEOREM 2C. The arithmetic functions d : N → C and σ : N → C are both multiplicative.

Natural numbers n ∈ N where σ(n) = 2n are of particular interest, and are known as perfect
numbers. A perfect number is therefore a natural number which is equal to the sum of its own proper
divisors; in other words, the sum of all its positive divisors other than itself.

Examples. It is easy to see that 6 = 1 + 2 + 3 and 28 = 1 + 2 + 4 + 7 + 14 are perfect numbers.

It is not known whether any odd perfect number exists. However, we can classify the even perfect
numbers.

THEOREM 2D. (EUCLID-EULER) Suppose that m ∈ N. If $2^m - 1$ is a prime, then the number
$2^{m-1}(2^m - 1)$ is an even perfect number. Furthermore, there are no other even perfect numbers.

Proof. Suppose that $n = 2^{m-1}(2^m - 1)$, and $2^m - 1$ is prime. Clearly

$$(2^{m-1}, 2^m - 1) = 1.$$

It follows from Theorems 2B and 2C that

$$\sigma(n) = \sigma(2^{m-1}) \sigma(2^m - 1) = \frac{2^m - 1}{2 - 1} \, 2^m = 2n,$$

so that n is a perfect number, clearly even since m ≥ 2.

Suppose now that n ∈ N is an even perfect number. Then we can write $n = 2^{m-1} u$, where m ∈ N
and m > 1, and where u ∈ N is odd. By Theorems 2B and 2C, we have

$$2^m u = \sigma(n) = \sigma(2^{m-1}) \sigma(u) = (2^m - 1) \sigma(u),$$

so that

$$\sigma(u) = \frac{2^m u}{2^m - 1} = u + \frac{u}{2^m - 1}. \tag{3}$$

Note that σ(u) and u are integers and σ(u) > u. Hence $u/(2^m - 1) \in \mathbb{N}$ and is a divisor of u. Since
m > 1, we have $2^m - 1 > 1$, and so $u/(2^m - 1) \ne u$. It now follows from (3) that σ(u) is equal to the
sum of two distinct positive divisors of u. But σ(u) is equal to the sum of all the positive divisors of u. Hence u
must have exactly two positive divisors, so that u is prime. Furthermore, we must have $u/(2^m - 1) = 1$,
so that $u = 2^m - 1$.
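
Theorem 2D can be checked for small cases. The following Python sketch lists the numbers produced by the theorem for m ≤ 13 and compares them with a direct search among the even numbers up to 10000; the helper names `sigma` and `is_prime` and both bounds are illustrative assumptions.

```python
def sigma(n: int) -> int:
    """Return the sum of all positive divisors of n, pairing each divisor m with n/m."""
    total = 0
    m = 1
    while m * m <= n:
        if n % m == 0:
            total += m
            if m != n // m:
                total += n // m
        m += 1
    return total


def is_prime(k: int) -> bool:
    """Primality test by trial division, adequate for the small numbers used here."""
    if k < 2:
        return False
    m = 2
    while m * m <= k:
        if k % m == 0:
            return False
        m += 1
    return True


# Even perfect numbers given by Theorem 2D with 2 <= m <= 13.
from_theorem = [2 ** (m - 1) * (2 ** m - 1) for m in range(2, 14) if is_prime(2 ** m - 1)]

# Even perfect numbers up to 10000 found by testing sigma(n) = 2n directly.
by_search = [n for n in range(2, 10001, 2) if sigma(n) == 2 * n]

print(from_theorem)
print(by_search)
```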

We are interested in the behaviour of d(n) and σ(n) as n → ∞. If n ∈ N is a prime, then clearly
d(n) = 2. Also, the magnitude of d(n) is sometimes greater than that of any power of log n. More
precisely, we have the following result.
THEOREM 2E. For any fixed real number c > 0, the inequality $d(n) \ll (\log n)^c$ as n → ∞ does not
hold.

Proof. The idea of the proof is to consider integers which are divisible by many different primes.
Suppose that c > 0 is given and fixed. Let ℓ ∈ N ∪ {0} satisfy ℓ ≤ c < ℓ + 1. For every j = 1, 2, 3, . . . ,
let $p_j$ denote the j-th positive prime in increasing order of magnitude, and consider the integer

$$n = (p_1 \cdots p_{\ell+1})^m.$$

Then in view of Theorem 2B, we have

$$d(n) = (m + 1)^{\ell+1} > \left( \frac{\log n}{\log(p_1 \cdots p_{\ell+1})} \right)^{\ell+1} = K(c) (\log n)^{\ell+1} > K(c) (\log n)^c, \tag{4}$$

where the positive constant

$$K(c) = \left( \frac{1}{\log(p_1 \cdots p_{\ell+1})} \right)^{\ell+1}$$

depends only on c. The result follows on noting that the inequality (4) holds for every m ∈ N.

On the other hand, the order of magnitude of d(n) cannot be too large either.

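THEOREM 2F. For any fixed real number ε > 0, we have $d(n) \ll_\varepsilon n^\varepsilon$ as n → ∞.

Proof. For every natural number n > 1, let $n = p_1^{u_1} \cdots p_r^{u_r}$ be its canonical decomposition. It follows
from Theorem 2B that

$$\frac{d(n)}{n^\varepsilon} = \frac{(1 + u_1)}{p_1^{\varepsilon u_1}} \cdots \frac{(1 + u_r)}{p_r^{\varepsilon u_r}}.$$

We may assume without loss of generality that ε < 1. If $2 \le p_j < 2^{1/\varepsilon}$, then

$$p_j^{\varepsilon u_j} \ge 2^{\varepsilon u_j} = e^{\varepsilon u_j \log 2} > 1 + \varepsilon u_j \log 2 > (1 + u_j) \varepsilon \log 2,$$

so that

$$\frac{(1 + u_j)}{p_j^{\varepsilon u_j}} < \frac{1}{\varepsilon \log 2}.$$

On the other hand, if $p_j \ge 2^{1/\varepsilon}$, then $p_j^\varepsilon \ge 2$, and so

$$\frac{(1 + u_j)}{p_j^{\varepsilon u_j}} \le \frac{1 + u_j}{2^{u_j}} \le 1.$$

It follows that

$$\frac{d(n)}{n^\varepsilon} < \prod_{p < 2^{1/\varepsilon}} \frac{1}{\varepsilon \log 2},$$

a positive constant depending only on ε.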

We see from Theorems 2E and 2F and the fact that d(n) = 2 infinitely often that the magnitude of
d(n) fluctuates a great deal as n → ∞. It may then be more fruitful to average the function d(n) over a
range of values of n, and consider, for positive real numbers X ∈ R, the value of the average

$$\frac{1}{X} \sum_{n \le X} d(n).$$

THEOREM 2G. (DIRICHLET) As X → ∞, we have

$$\sum_{n \le X} d(n) = X \log X + (2\gamma - 1) X + O(X^{1/2}).$$

Here γ is Euler’s constant and is defined by

$$\gamma = \lim_{Y \to \infty} \Biggl( \sum_{n \le Y} \frac{1}{n} - \log Y \Biggr) = 0.5772156649 \ldots .$$

Remark. It is an open problem in mathematics to determine whether Euler’s constant γ is rational
or irrational.

The proof of Theorem 2G depends on the following intermediate result.



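THEOREM 2H. As Y → ∞, we have

$$\sum_{n \le Y} \frac{1}{n} = \log Y + \gamma + O\!\left( \frac{1}{Y} \right).$$

Proof. As Y → ∞, we have

$$
\begin{aligned}
\sum_{n \le Y} \frac{1}{n}
&= \sum_{n \le Y} \left( \frac{1}{Y} + \int_n^Y \frac{du}{u^2} \right)
 = \frac{[Y]}{Y} + \sum_{n \le Y} \int_n^Y \frac{du}{u^2}
 = \frac{[Y]}{Y} + \int_1^Y \Biggl( \sum_{n \le u} 1 \Biggr) \frac{du}{u^2} \\
&= \frac{[Y]}{Y} + \int_1^Y \frac{[u]}{u^2} \, du
 = \frac{[Y]}{Y} + \int_1^Y \frac{du}{u} - \int_1^Y \frac{u - [u]}{u^2} \, du \\
&= \log Y + 1 + O\!\left( \frac{1}{Y} \right) - \int_1^\infty \frac{u - [u]}{u^2} \, du + \int_Y^\infty \frac{u - [u]}{u^2} \, du \\
&= \log Y + 1 - \int_1^\infty \frac{u - [u]}{u^2} \, du + O\!\left( \frac{1}{Y} \right).
\end{aligned}
$$

It is a simple exercise to show that

$$1 - \int_1^\infty \frac{u - [u]}{u^2} \, du = \gamma,$$

and this completes the proof.

Proof of Theorem 2G. As X → ∞, we have

$$
\begin{aligned}
\sum_{n \le X} d(n)
&= \sum_{\substack{x, y \\ xy \le X}} 1
 = \sum_{x \le X^{1/2}} \sum_{y \le X/x} 1 + \sum_{y \le X^{1/2}} \sum_{x \le X/y} 1 - \sum_{x \le X^{1/2}} \sum_{y \le X^{1/2}} 1 \\
&= 2 \sum_{x \le X^{1/2}} \left[ \frac{X}{x} \right] - [X^{1/2}]^2
 = 2 \sum_{x \le X^{1/2}} \frac{X}{x} + O(X^{1/2}) - (X^{1/2} + O(1))^2 \\
&= 2X \left( \log X^{1/2} + \gamma + O\!\left( \frac{1}{X^{1/2}} \right) \right) + O(X^{1/2}) - X \\
&= X \log X + (2\gamma - 1) X + O(X^{1/2}).
\end{aligned}
$$

This completes the proof.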
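
The counting identity at the heart of this proof, which expresses the sum of d(n) over n ≤ X as twice a sum over x ≤ X^{1/2} minus a square, is exact for integer X and can be tested directly. The following Python sketch does so and also prints the difference from the main terms of Theorem 2G; the function names are illustrative assumptions, and the value of γ is the decimal approximation quoted above.

```python
import math

GAMMA = 0.5772156649   # decimal approximation to Euler's constant, as quoted in Theorem 2G


def divisor_sum_hyperbola(X: int) -> int:
    """Return the sum of d(n) over n <= X as 2 * sum_{x <= sqrt(X)} [X/x] - [sqrt(X)]^2."""
    root = math.isqrt(X)
    return 2 * sum(X // x for x in range(1, root + 1)) - root * root


def divisor_sum_direct(X: int) -> int:
    """Return the sum of d(n) over n <= X by tabulating d(n) for every n <= X."""
    d = [0] * (X + 1)
    for m in range(1, X + 1):
        for n in range(m, X + 1, m):   # m is a divisor of each such n
            d[n] += 1
    return sum(d)


for X in (100, 1000, 10000):
    total = divisor_sum_hyperbola(X)
    main_terms = X * math.log(X) + (2 * GAMMA - 1) * X
    print(X, total, divisor_sum_direct(X), total - main_terms)
```

The last column is the error term, which Theorem 2G asserts to be O(X^{1/2}).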

We next turn our attention to the study of the behaviour of σ(n) as n → ∞. Every number n ∈ N
has divisors 1 and n, so we must have σ(1) = 1 and σ(n) > n if n > 1. On the other hand, it follows
from Theorem 2F that for any fixed real number ε > 0, we have

$$\sigma(n) \le n \, d(n) \ll_\varepsilon n^{1+\varepsilon} \quad \text{as } n \to \infty.$$

In fact, it is rather easy to prove a slightly stronger result.

THEOREM 2J. We have $\sigma(n) \ll n \log n$ as n → ∞.

Proof. As n → ∞, we have

$$\sigma(n) = \sum_{m \mid n} \frac{n}{m} \le n \sum_{m \le n} \frac{1}{m} \ll n \log n.$$

This completes the proof.



As in the case of d(n), the magnitude of σ(n) fluctuates a great deal as n → ∞. As before, we shall
average the function σ(n) over a range of values of n, and consider some average value of the function.
Corresponding to Theorem 2G, we have the following result.

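THEOREM 2K. As X → ∞, we have

$$\sum_{n \le X} \sigma(n) = \frac{\pi^2}{12} X^2 + O(X \log X).$$

Proof. As X → ∞, we have

$$
\begin{aligned}
\sum_{n \le X} \sigma(n)
&= \sum_{n \le X} \sum_{m \mid n} \frac{n}{m}
 = \sum_{m \le X} \sum_{\substack{n \le X \\ m \mid n}} \frac{n}{m}
 = \sum_{m \le X} \sum_{r \le X/m} r
 = \frac{1}{2} \sum_{m \le X} \left[ \frac{X}{m} \right] \left( 1 + \left[ \frac{X}{m} \right] \right) \\
&= \frac{1}{2} \sum_{m \le X} \left( \frac{X}{m} + O(1) \right)^2
 = \frac{X^2}{2} \sum_{m \le X} \frac{1}{m^2} + O\Biggl( X \sum_{m \le X} \frac{1}{m} \Biggr) + O\Biggl( \sum_{m \le X} 1 \Biggr) \\
&= \frac{X^2}{2} \sum_{m=1}^{\infty} \frac{1}{m^2} + O\Biggl( X^2 \sum_{m > X} \frac{1}{m^2} \Biggr) + O(X \log X)
 = \frac{\pi^2}{12} X^2 + O(X \log X).
\end{aligned}
$$

This completes the proof.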

2.3. The Möbius Function

We define the Möbius function µ : N → C by writing

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ (-1)^r & \text{if } n = p_1 \cdots p_r, \text{ a product of distinct primes}, \\ 0 & \text{otherwise}. \end{cases}$$

Remarks. (i) A natural number which is not divisible by the square of any prime is called a squarefree
number. Note that 1 is both a square and a squarefree number. Furthermore, a number n ∈ N is
squarefree if and only if µ(n) = ±1.

(ii) The motivation for the definition of the Möbius function lies rather deep. To understand the
definition, one needs to study the Riemann zeta function, an important function in the study of the
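distribution of primes. At this point, it suffices to remark that the Möbius function is defined so that if
we formally multiply the two series

$$\sum_{n=1}^{\infty} \frac{1}{n^s} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s},$$

where s ∈ C denotes a complex variable, then the product is identically equal to 1. Heuristically, note
that

$$\Biggl( \sum_{k=1}^{\infty} \frac{1}{k^s} \Biggr) \Biggl( \sum_{m=1}^{\infty} \frac{\mu(m)}{m^s} \Biggr) = \sum_{n=1}^{\infty} \Biggl( \sum_{\substack{k, m \ge 1 \\ km = n}} \mu(m) \Biggr) \frac{1}{n^s} = \sum_{n=1}^{\infty} \Biggl( \sum_{m \mid n} \mu(m) \Biggr) \frac{1}{n^s}.$$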

It follows that the product is identically equal to 1 if

$$\sum_{m \mid n} \mu(m) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

We shall establish this last fact and study some of its consequences over the next four theorems.

THEOREM 2L. The Möbius function µ : N → C is multiplicative.

Proof. Suppose that a, b ∈ N and (a, b) = 1. If a or b is not squarefree, then neither is ab, and so
µ(ab) = 0 = µ(a)µ(b). On the other hand, if both a and b are squarefree, then since (a, b) = 1, ab must
also be squarefree. Furthermore, the number of prime factors of ab must be the sum of the numbers of
prime factors of a and of b, so that µ(ab) = µ(a)µ(b).

THEOREM 2M. Suppose that n ∈ N. Then

$$\sum_{m \mid n} \mu(m) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases}$$

Proof. Consider the function f : N → C defined by writing

$$f(n) = \sum_{m \mid n} \mu(m)$$

for every n ∈ N. It follows from Theorems 2A and 2L that f is multiplicative. For n = 1, the result is
trivial. To complete the proof, it therefore suffices to show that $f(p^k) = 0$ for every prime p and every
k ∈ N. Indeed,

$$f(p^k) = \sum_{m \mid p^k} \mu(m) = \mu(1) + \mu(p) + \mu(p^2) + \ldots + \mu(p^k) = 1 - 1 + 0 + \ldots + 0 = 0.$$

This completes the proof.
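
Theorem 2M is easy to test for small n. The following Python sketch computes the Möbius function by trial division and evaluates the divisor sum; the function names `mobius` and `mobius_divisor_sum` are illustrative assumptions.

```python
def mobius(n: int) -> int:
    """Return the Mobius function of the natural number n."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:     # p^2 divides the original n, so n is not squarefree
                return 0
            result = -result   # one more distinct prime factor
        p += 1
    if n > 1:                  # a remaining prime factor larger than the square root
        result = -result
    return result


def mobius_divisor_sum(n: int) -> int:
    """Return the sum of the Mobius function over all positive divisors m of n."""
    return sum(mobius(m) for m in range(1, n + 1) if n % m == 0)


print([mobius(n) for n in range(1, 13)])
print([mobius_divisor_sum(n) for n in range(1, 13)])
```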

Theorem 2M plays the central role in the proof of the following two results which are similar in
nature.

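THEOREM 2N. (MÖBIUS INVERSION FORMULA) For any function f : N → C, if the function
g : N → C is defined by writing

$$g(n) = \sum_{m \mid n} f(m)$$

for every n ∈ N, then for every n ∈ N, we have

$$f(n) = \sum_{m \mid n} \mu(m) \, g\!\left( \frac{n}{m} \right) = \sum_{m \mid n} \mu\!\left( \frac{n}{m} \right) g(m).$$

Proof. The second equality is obvious. Also

$$\sum_{m \mid n} \mu(m) \, g\!\left( \frac{n}{m} \right) = \sum_{m \mid n} \mu(m) \Biggl( \sum_{k \mid \frac{n}{m}} f(k) \Biggr) = \sum_{\substack{k, m \\ km \mid n}} \mu(m) f(k) = \sum_{k \mid n} f(k) \Biggl( \sum_{m \mid \frac{n}{k}} \mu(m) \Biggr) = f(n),$$

in view of Theorem 2M.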
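
The inversion formula holds for any function f, multiplicative or not. The following Python sketch illustrates this with the arbitrary choice f(n) = n² + 1, which is a hypothetical example and not taken from the notes: it forms g by summing f over divisors and then recovers f from g.

```python
def mobius(n: int) -> int:
    """Return the Mobius function of the natural number n."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n: int) -> list[int]:
    """Return all positive divisors of n in increasing order."""
    return [m for m in range(1, n + 1) if n % m == 0]


def f(n: int) -> int:
    """An arbitrary sample function; it is not multiplicative."""
    return n * n + 1


def g(n: int) -> int:
    """g(n) is the sum of f(m) over all positive divisors m of n."""
    return sum(f(m) for m in divisors(n))


def recovered_f(n: int) -> int:
    """The sum of mu(m) g(n/m) over m | n, which should equal f(n) by Theorem 2N."""
    return sum(mobius(m) * g(n // m) for m in divisors(n))


print([(f(n), recovered_f(n)) for n in range(1, 11)])
```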



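THEOREM 2P. For any function g : N → C, if the function f : N → C is defined by writing

$$f(n) = \sum_{m \mid n} \mu\!\left( \frac{n}{m} \right) g(m)$$

for every n ∈ N, then for every n ∈ N, we have

$$g(n) = \sum_{m \mid n} f(m) = \sum_{m \mid n} f\!\left( \frac{n}{m} \right).$$

Proof. The second equality is obvious. Also

$$\sum_{m \mid n} f\!\left( \frac{n}{m} \right) = \sum_{m \mid n} \Biggl( \sum_{k \mid \frac{n}{m}} \mu\!\left( \frac{n}{mk} \right) g(k) \Biggr) = \sum_{k \mid n} g(k) \Biggl( \sum_{m \mid \frac{n}{k}} \mu\!\left( \frac{n/k}{m} \right) \Biggr) = \sum_{k \mid n} g(k) \Biggl( \sum_{m \mid \frac{n}{k}} \mu(m) \Biggr) = g(n),$$

in view of Theorem 2M.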

Remark. In number theory, it occurs quite often that in the proof of a theorem, a change of order of
summation of the variables is required, as illustrated in the proofs of Theorems 2N and 2P. This process
of changing the order of summation does not depend on the summand in question. In both instances,
we are concerned with a sum of the form

$$\sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m).$$

This means that for every positive divisor m of n, we first sum the function A over all positive divisors
k of n/m to obtain the sum

$$\sum_{k \mid \frac{n}{m}} A(k, m),$$

which is a function of m. We then sum this sum over all divisors m of n. Now observe that for every
natural number k satisfying k | n/m for some positive divisor m of n, we must have k | n. Consider
therefore a particular natural number k satisfying k | n. We must find all natural numbers m satisfying
the original summation conditions, namely m | n and k | n/m. These are precisely those natural numbers
m satisfying m | n/k. We therefore obtain, for every positive divisor k of n, the sum

$$\sum_{m \mid \frac{n}{k}} A(k, m).$$

Summing over all positive divisors k of n, we obtain

$$\sum_{k \mid n} \sum_{m \mid \frac{n}{k}} A(k, m).$$

Since we are summing the function A over the same collection of pairs (k, m), and have merely changed
the order of summation, we must have

$$\sum_{m \mid n} \sum_{k \mid \frac{n}{m}} A(k, m) = \sum_{k \mid n} \sum_{m \mid \frac{n}{k}} A(k, m).$$

2.4. The Euler Function

We define the Euler function φ : N → C as follows. For every n ∈ N, we let φ(n) denote the number of
elements in the set {1, 2, . . . , n} which are coprime to n.

THEOREM 2Q. For every number n ∈ N, we have



φ(m) = n.
m|n

Proof. We shall partition the set {1, 2, . . . , n} into d(n) disjoint subsets B_m, where for every positive divisor m of n,

\[ B_m = \{x : 1 \le x \le n \text{ and } (x, n) = m\}. \]

If x ∈ B_m, let x = mx′. Then (mx′, n) = m if and only if (x′, n/m) = 1. Also 1 ≤ mx′ ≤ n if and only if 1 ≤ x′ ≤ n/m. Hence

\[ B'_m = \{x' : 1 \le x' \le n/m \text{ and } (x', n/m) = 1\} \]

has the same number of elements as B_m. Note now that the number of elements of B′_m is exactly φ(n/m). Since every element of the set {1, 2, . . . , n} falls into exactly one of the subsets B_m, we must have

\[ n = \sum_{m \mid n} \phi\left(\frac{n}{m}\right) = \sum_{m \mid n} \phi(m). \]

This completes the proof. 
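The partition used in this proof can be checked numerically. The following Python sketch is an illustration only; the helper names phi and partition_by_gcd are assumptions made here and do not come from the text.

```python
from math import gcd

def phi(n):
    """Euler function by direct count: elements of {1, ..., n} coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def partition_by_gcd(n):
    """Return {m: B_m}, where B_m = {x : 1 <= x <= n and gcd(x, n) = m}."""
    classes = {m: [] for m in range(1, n + 1) if n % m == 0}
    for x in range(1, n + 1):
        classes[gcd(x, n)].append(x)
    return classes

n = 12
classes = partition_by_gcd(n)
for m, block in classes.items():
    # Each B_m has exactly phi(n/m) elements.
    assert len(block) == phi(n // m)
# The blocks cover {1, ..., n}, so the values phi(m) over m | n add up to n.
assert sum(phi(m) for m in classes) == n
```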

Applying the Möbius inversion formula to the conclusion of Theorem 2Q, we obtain immediately
the following result.

THEOREM 2R. For every number n ∈ N, we have

\[ \phi(n) = \sum_{m \mid n} \mu(m) \frac{n}{m} = n \sum_{m \mid n} \frac{\mu(m)}{m}. \]

THEOREM 2S. The Euler function φ : N → C is multiplicative.

Proof. Since the Möbius function µ is multiplicative, it follows that the function f : N → C, defined
by f (n) = µ(n)/n for every n ∈ N, is multiplicative. The result now follows from Theorem 2A. 

THEOREM 2T. Suppose that n ∈ N and n > 1, with canonical decomposition n = p_1^{u_1} . . . p_r^{u_r}. Then

\[ \phi(n) = n \prod_{j=1}^{r} \left(1 - \frac{1}{p_j}\right) = \prod_{j=1}^{r} p_j^{u_j - 1} (p_j - 1). \]

Proof. The second equality is trivial. On the other hand, for every prime p and every u ∈ N, we have by Theorem 2R that

\[ \frac{\phi(p^u)}{p^u} = \sum_{m \mid p^u} \frac{\mu(m)}{m} = 1 + \frac{\mu(p)}{p} = 1 - \frac{1}{p}. \]

The result now follows since φ is multiplicative. 
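As a sanity check on Theorems 2R and 2T, the sketch below computes φ(n) in three ways: by direct count, by the Möbius sum, and by the product formula. The function names are hypothetical and the trial-division factorization is chosen only for brevity.

```python
from math import gcd

def factorize(n):
    """Canonical decomposition of n as a dict {prime: exponent}, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = 1
    return factors

def mobius(n):
    exponents = factorize(n).values()
    return 0 if any(u > 1 for u in exponents) else (-1) ** len(exponents)

def phi_count(n):
    # Definition: number of elements of {1, ..., n} coprime to n.
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def phi_mobius(n):
    # Theorem 2R: phi(n) is the sum of mu(m) * n / m over the divisors m of n.
    return sum(mobius(m) * (n // m) for m in range(1, n + 1) if n % m == 0)

def phi_product(n):
    # Theorem 2T: phi(n) is the product of p^(u-1) * (p - 1) over the prime powers of n.
    result = 1
    for p, u in factorize(n).items():
        result *= p ** (u - 1) * (p - 1)
    return result

for n in range(1, 300):
    assert phi_count(n) == phi_mobius(n) == phi_product(n)
```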



We now study the magnitude of φ(n) as n → ∞. Clearly φ(1) = 1 and φ(n) < n if n > 1.

Suppose first of all that n has many different prime factors. Then n must have many different
divisors, and so σ(n) must be large relative to n. But then many of the numbers 1, . . . , n cannot be
coprime to n, and so φ(n) must be small relative to n. On the other hand, suppose that n has very few
prime factors. Then n must have very few divisors, and so σ(n) must be small relative to n. But then
many of the numbers 1, . . . , n are coprime to n, and so φ(n) must be large relative to n. It therefore
appears that if one of the two values σ(n) and φ(n) is large relative to n, then the other must be small
relative to n. Indeed, our heuristics are upheld by the following result.

THEOREM 2U. For every n ∈ N, we have

\[ \frac{1}{2} < \frac{\sigma(n)\phi(n)}{n^2} \le 1. \]

Proof. The result is obvious if n = 1, so suppose that n > 1. Let n = p_1^{u_1} . . . p_r^{u_r} be the canonical decomposition of n. Recall Theorems 2B and 2T. We have

\[ \sigma(n) = \prod_{j=1}^{r} \frac{p_j^{u_j+1} - 1}{p_j - 1} = n \prod_{j=1}^{r} \frac{1 - p_j^{-u_j-1}}{1 - p_j^{-1}} \]

and

\[ \phi(n) = n \prod_{j=1}^{r} (1 - p_j^{-1}). \]

Hence

\[ \frac{\sigma(n)\phi(n)}{n^2} = \prod_{j=1}^{r} (1 - p_j^{-u_j-1}). \]

The upper bound follows at once. On the other hand,

\[ \prod_{j=1}^{r} (1 - p_j^{-u_j-1}) \ge \prod_{p \mid n} (1 - p^{-2}) \ge \prod_{m=2}^{n} \left(1 - \frac{1}{m^2}\right) = \frac{n+1}{2n} > \frac{1}{2} \]

as required. 
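The bounds in Theorem 2U are easy to test for small n. The following is a minimal sketch using exact rational arithmetic; sigma and phi are computed naively from their definitions, and the names are chosen here for illustration.

```python
from fractions import Fraction
from math import gcd

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(m for m in range(1, n + 1) if n % m == 0)

def phi(n):
    """Number of elements of {1, ..., n} coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

# Exact values of sigma(n) * phi(n) / n^2 for small n.
ratios = {n: Fraction(sigma(n) * phi(n), n * n) for n in range(1, 500)}

# Theorem 2U: every ratio lies in the interval (1/2, 1].
assert all(Fraction(1, 2) < r <= 1 for r in ratios.values())
```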

Combining Theorems 2J and 2U, we have the following result.

THEOREM 2V. We have φ(n) ≫ n/ log n as n → ∞.

We now consider some average version of the Euler function.

THEOREM 2W. (MERTENS) As X → ∞, we have

\[ \sum_{n \le X} \phi(n) = \frac{3}{\pi^2} X^2 + O(X \log X). \]

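Proof. As X → ∞, we have, by Theorem 2R, that

\[
\begin{aligned}
\sum_{n \le X} \phi(n)
&= \sum_{n \le X} \sum_{m \mid n} \mu(m) \frac{n}{m}
 = \sum_{m \le X} \mu(m) \sum_{\substack{n \le X \\ m \mid n}} \frac{n}{m}
 = \sum_{m \le X} \mu(m) \sum_{r \le X/m} r \\
&= \sum_{m \le X} \mu(m) \, \frac{1}{2} \left\lfloor \frac{X}{m} \right\rfloor \left(1 + \left\lfloor \frac{X}{m} \right\rfloor\right)
 = \frac{1}{2} \sum_{m \le X} \mu(m) \left(\frac{X}{m} + O(1)\right)^2 \\
&= \frac{X^2}{2} \sum_{m \le X} \frac{\mu(m)}{m^2} + O\left(X \sum_{m \le X} \frac{1}{m}\right) + O\left(\sum_{m \le X} 1\right) \\
&= \frac{X^2}{2} \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} + O\left(X^2 \sum_{m > X} \frac{1}{m^2}\right) + O(X \log X) \\
&= \frac{X^2}{2} \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} + O(X \log X).
\end{aligned}
\]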

It remains to show that

\[ \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} = \frac{6}{\pi^2}. \]

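But the series of reciprocals of squares has sum π²/6, and

\[
\left(\sum_{n=1}^{\infty} \frac{1}{n^2}\right) \left(\sum_{m=1}^{\infty} \frac{\mu(m)}{m^2}\right)
= \sum_{k=1}^{\infty} \left(\sum_{\substack{n, m \\ nm = k}} \mu(m)\right) \frac{1}{k^2}
= \sum_{k=1}^{\infty} \left(\sum_{m \mid k} \mu(m)\right) \frac{1}{k^2}
= 1,
\]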

in view of Theorem 2M. 
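Mertens's estimate can be observed numerically. The sketch below, an illustration with hypothetical function names, tabulates φ by a sieve and prints the summatory function next to the main term 3X²/π², together with the error scaled by X log X.

```python
from math import log, pi

def phi_sieve(limit):
    """Return a list phi with phi[n] = Euler's function of n for 0 <= n <= limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # p is prime: no smaller prime has touched it
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p  # multiply phi[k] by (1 - 1/p)
    return phi

def summatory_phi(X):
    """Sum of phi(n) over 1 <= n <= X."""
    return sum(phi_sieve(X)[1:])

for X in (100, 1000, 10000):
    total = summatory_phi(X)
    main = 3 * X * X / pi ** 2
    # The last column should stay bounded if the error is O(X log X).
    print(X, total, round(main), (total - main) / (X * log(X)))
```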

2.5. Dirichlet Convolution

We shall denote the class of all arithmetic functions by A, and the class of all multiplicative functions
by M.

Given arithmetic functions f, g ∈ A, we define the function f ∗ g : N → C by writing

\[ (f * g)(n) = \sum_{m \mid n} f(m) \, g\left(\frac{n}{m}\right) \]

for every n ∈ N. This function is called the Dirichlet convolution of f and g.

It is not difficult to show that Dirichlet convolution of arithmetic functions is commutative and
associative. In other words, for every f, g, h ∈ A, we have

\[ f * g = g * f \quad \text{and} \quad (f * g) * h = f * (g * h). \]

Furthermore, the arithmetic function I : N → C, defined by I(1) = 1 and I(n) = 0 for every n ∈ N
satisfying n > 1, is an identity element for Dirichlet convolution. It is easy to check that I ∗ f = f ∗ I = f
for every f ∈ A.
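The definition translates directly into code. The sketch below represents arithmetic functions as Python callables and checks commutativity, associativity and the identity property on a finite range; the names convolve, divisors and identity_map are assumptions made for this illustration.

```python
def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]

def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) is the sum of f(m) g(n/m) over m | n."""
    return lambda n: sum(f(m) * g(n // m) for m in divisors(n))

def I(n):
    # Identity element for Dirichlet convolution.
    return 1 if n == 1 else 0

def U(n):
    return 1

def identity_map(n):
    # Hypothetical helper: the arithmetic function n -> n.
    return n

d = convolve(U, U)                  # d(n), the number of divisors of n
sigma = convolve(U, identity_map)   # sigma(n), the sum of the divisors of n

for n in range(1, 100):
    assert convolve(d, sigma)(n) == convolve(sigma, d)(n)
    assert convolve(convolve(U, d), sigma)(n) == convolve(U, convolve(d, sigma))(n)
    assert convolve(I, sigma)(n) == convolve(sigma, I)(n) == sigma(n)
```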

On the other hand, an inverse may not exist under Dirichlet convolution. Consider, for example,
the function f ∈ A satisfying f (n) = 0 for every n ∈ N.

THEOREM 2X. For any f ∈ A, the following two statements are equivalent:
(i) We have f(1) ≠ 0.
(ii) There exists a unique g ∈ A such that f ∗ g = g ∗ f = I.

Proof. Suppose that (ii) holds. Then f(1)g(1) = 1, so that f(1) ≠ 0. Conversely, suppose that f(1) ≠ 0. We shall define g ∈ A iteratively by writing

\[ g(1) = \frac{1}{f(1)} \tag{5} \]

and

\[ g(n) = -\frac{1}{f(1)} \sum_{\substack{d \mid n \\ d > 1}} f(d) \, g\left(\frac{n}{d}\right) \tag{6} \]

for every n ∈ N satisfying n > 1. It is easy to check that this gives an inverse. Moreover, every inverse
must satisfy (5) and (6), and so the inverse must be unique. 
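The recursion (5) and (6) can be run as written. The following sketch assumes that f takes integer or rational values and uses exact fractions; the names dirichlet_inverse and check_inverse are hypothetical. Applied to the function U(n) = 1, it should reproduce the Möbius function.

```python
from fractions import Fraction
from functools import lru_cache

def dirichlet_inverse(f):
    """Return g with f * g = I, built from the recursion (5) and (6)."""
    if f(1) == 0:
        raise ValueError("no inverse exists when f(1) = 0")

    @lru_cache(maxsize=None)
    def g(n):
        if n == 1:
            return Fraction(1) / Fraction(f(1))          # equation (5)
        # Equation (6): sum over the divisors d > 1 of n.
        total = sum(f(d) * g(n // d) for d in range(2, n + 1) if n % d == 0)
        return -total / f(1)

    return g

def check_inverse(f, limit=60):
    """Verify (f * g)(n) = I(n) for 1 <= n < limit and return g."""
    g = dirichlet_inverse(f)
    for n in range(1, limit):
        value = sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
        assert value == (1 if n == 1 else 0)
    return g

mu = check_inverse(lambda n: 1)   # inverse of U
print([int(mu(n)) for n in range(1, 11)])
```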

We now describe Theorem 2M and Möbius inversion in terms of Dirichlet convolution. Recall that
the function U ∈ A is defined by U (n) = 1 for all n ∈ N.

THEOREM 2Y.
(i) We have µ ∗ U = I.
(ii) If f ∈ A and g = f ∗ U , then f = g ∗ µ.
(iii) If g ∈ A and f = g ∗ µ, then g = f ∗ U .

Proof. (i) follows from Theorem 2M. To prove (ii), note that

g ∗ µ = (f ∗ U ) ∗ µ = f ∗ (U ∗ µ) = f ∗ I = f.

To prove (iii), note that

f ∗ U = (g ∗ µ) ∗ U = g ∗ (µ ∗ U ) = g ∗ I = g.

This completes the proof of Theorem 2Y. 

We conclude this chapter by exhibiting some group structure within A and M.

THEOREM 2Z. The sets A′ = {f ∈ A : f(1) ≠ 0} and M′ = {f ∈ M : f(1) = 1} form abelian groups under Dirichlet convolution.

Remark. Note that if f ∈ M is not identically zero, then f(n) ≠ 0 for some n ∈ N. Clearly we have f(n) = f(1)f(n), and so f(1) = 1.

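Proof of Theorem 2Z. For A′, this is now trivial. We now consider M′. Clearly I ∈ M′. If f, g ∈ M′ and (m, n) = 1, then it follows from Theorem 1D that

\[
\begin{aligned}
(f * g)(mn) &= \sum_{d \mid mn} f(d) \, g\left(\frac{mn}{d}\right)
 = \sum_{d_1 \mid m} \sum_{d_2 \mid n} f(d_1 d_2) \, g\left(\frac{mn}{d_1 d_2}\right) \\
&= \left(\sum_{d_1 \mid m} f(d_1) \, g\left(\frac{m}{d_1}\right)\right) \left(\sum_{d_2 \mid n} f(d_2) \, g\left(\frac{n}{d_2}\right)\right)
 = (f * g)(m) (f * g)(n),
\end{aligned}
\]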

so that f ∗ g ∈ M. Since (f ∗ g)(1) = f(1)g(1) = 1, we have f ∗ g ∈ M′. It remains to show that if f ∈ M′, then f has an inverse in M′. Clearly f has an inverse in A′ under Dirichlet convolution. Let this inverse be h. We now define g ∈ A by writing g(1) = 1,

\[ g(p^k) = h(p^k) \]

for every prime p and k ∈ N, and

\[ g(n) = \prod_{p^k \parallel n} g(p^k) \]

for every n > 1. Then g ∈ M′. Furthermore, for every integer n > 1, we have

\[ (f * g)(n) = \prod_{p^k \parallel n} (f * g)(p^k) = \prod_{p^k \parallel n} (f * h)(p^k) = \prod_{p^k \parallel n} I(p^k) = I(n), \]

so that g is an inverse of f . 

Problems for Chapter 2

1. Prove that d(n) ≤ d(2^n − 1) for every n ∈ N.

2. Suppose that n ∈ N is composite. Prove that σ(n) > n + √n.

3. Prove that d(n) is odd if and only if n ∈ N is a square.


4. Prove that $\prod_{m \mid n} m = n^{\frac{1}{2} d(n)}$ for every n ∈ N.

5. Suppose that n ∈ N. Show that the number N of solutions of the equation x² − y² = n in natural numbers x and y satisfies

\[
2N = \begin{cases}
d(n) - e_n & \text{if } n \text{ is an odd number,} \\
0 & \text{if } n \text{ is twice an odd number,} \\
d(n/4) - e_n & \text{if } 4 \mid n,
\end{cases}
\]

where e_n = 1 if n is a perfect square, and e_n = 0 otherwise.

6. Prove that there are no squarefree perfect numbers apart from 6.

7. Prove that $\sum_{m \mid n} \frac{1}{m} = 2$ for every perfect number n ∈ N.

8. Prove that every odd perfect number must have at least two distinct prime factors, exactly one of
which has odd exponent.

9. Suppose that a ∈ N satisfies a > 1. Let d run over all the divisors of a that have no more than m prime divisors. Prove that

\[
\sum_{d} \mu(d) \; \begin{cases}
\ge 0 & \text{if } m \text{ is even,} \\
\le 0 & \text{if } m \text{ is odd.}
\end{cases}
\]

[Hint: Write down first the canonical decomposition of a.]



10. Suppose that k ∈ N is even, and the canonical decomposition of a ∈ N is of the form a = p_1 p_2 . . . p_k, where p_1, p_2, . . . , p_k are distinct primes. Let d run over all the divisors of a such that 0 < d < √a. Prove that $\sum_{d} \mu(d) = 0$.

11. Prove that $\sum_{d^2 \mid n} \mu(d) = \mu^2(n)$ for every n ∈ N.

[Hint: Distinguish between the cases when n is squarefree and when n is not squarefree.]

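12. By first showing that the function f(n) = (−1)^{n−1} is multiplicative, evaluate the sum

\[ h(n) = \sum_{m \mid n} (-1)^{m-1} \mu\left(\frac{n}{m}\right) \quad \text{for every } n \in \mathbb{N}. \]

13. Explain why $\sum_{m \mid n} \mu(m) \, \sigma\left(\frac{n}{m}\right) = n$ for every n ∈ N.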


14. Prove that

\[ \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m = \frac{n \phi(n)}{2} \]

for every n ∈ N satisfying n > 1.

15. Suppose that n ∈ N satisfies φ(n) | n. Prove that n = 2^a 3^b for some non-negative integers a and b.

16. Suppose that p1 , p2 , . . . , pk ∈ N are distinct primes, and that there are no other primes.
(i) Let a = p1 p2 . . . pk . Explain why we must have φ(a) = 1.
(ii) Obtain a contradiction.
[Remark: This is yet another proof that there are infinitely many primes.]

17. Prove that σ(n) + φ(n) = nd(n) if and only if n ∈ N is prime.

18. Suppose that n = p_1^{u_1} . . . p_r^{u_r}, where p_1 < . . . < p_r are primes and u_1, . . . , u_r ∈ N.
(i) Write

\[ s(n) = \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m^2. \]

Prove that

\[ n^2 \sum_{d \mid n} \frac{s(d)}{d^2} = \frac{n(n+1)(2n+1)}{6}. \]

(ii) Apply the Möbius inversion formula to deduce that

\[ \sum_{\substack{m=1 \\ (m,n)=1}}^{n} m^2 = \frac{1}{3} \phi(n) n^2 + \frac{1}{6} (-1)^r \phi(n) p_1 \ldots p_r. \]

19. For every n ∈ N, let Q(n) denote the number of squarefree numbers not exceeding n.
(i) Prove that $n - Q(n) \le \frac{n}{4} + \sum_{m=1}^{\infty} \frac{n}{(2m+1)^2}$, and deduce that Q(n) > n/2.
(ii) Hence show that every natural number is a sum of two squarefree numbers.

20. An arithmetic function f : N → C is said to be completely multiplicative if f is not identically zero and f(mn) = f(m)f(n) for all m, n ∈ N.
(i) Show that the Möbius function µ is not completely multiplicative.
(ii) Show that the Euler function φ is not completely multiplicative.
(iii) Suppose that f : N → C is multiplicative. Show that f is completely multiplicative if and only if its Dirichlet inverse f^{−1} satisfies f^{−1}(n) = µ(n)f(n) for all n ∈ N.
(iv) Prove that the Liouville function λ : N → C, defined by λ(1) = 1 and λ(n) = (−1)^{u_1+...+u_r} if n = p_1^{u_1} . . . p_r^{u_r}, is completely multiplicative. Prove also that for every n ∈ N,

\[
\sum_{m \mid n} \lambda(m) = \begin{cases}
1 & \text{if } n \text{ is a square,} \\
0 & \text{otherwise,}
\end{cases}
\]

and λ^{−1}(n) = |µ(n)|.

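21. Suppose that F : R+ → C, where R+ denotes the set of all positive real numbers. For any real number X ≥ 1, let

\[ G(X) = \sum_{n \le X} F\left(\frac{X}{n}\right). \]

Prove that

\[ F(X) = \sum_{n \le X} \mu(n) \, G\left(\frac{X}{n}\right) \quad \text{for every real number } X \ge 1. \]

22. Suppose that G : R+ → C. For any real number X ≥ 1, let

\[ F(X) = \sum_{n \le X} \mu(n) \, G\left(\frac{X}{n}\right). \]

Prove that

\[ G(X) = \sum_{n \le X} F\left(\frac{X}{n}\right) \quad \text{for every real number } X \ge 1. \]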

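23. Prove that each of the following identities is valid for every real number X ≥ 1:
(i) $\sum_{n \le X} \mu(n) \left\lfloor \frac{X}{n} \right\rfloor = 1$.
(ii) $\sum_{n \le X} \phi(n) = \frac{1}{2} \sum_{n \le X} \mu(n) \left\lfloor \frac{X}{n} \right\rfloor^2 + \frac{1}{2}$.
(iii) $\sum_{n \le X} \frac{\phi(n)}{n} = \sum_{n \le X} \frac{\mu(n)}{n} \left\lfloor \frac{X}{n} \right\rfloor$.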

24. Suppose that the function F : R+ → C satisfies F(X) = 0 whenever 0 < X < 1. For any arithmetic function α, we define the function α ◦ F : R+ → C by writing

\[ (\alpha \circ F)(X) = \sum_{n \le X} \alpha(n) \, F\left(\frac{X}{n}\right) \quad \text{for every } X \in \mathbb{R}^+. \]

(i) Prove that for any arithmetic functions α and β, we have α ◦ (β ◦ F) = (α ∗ β) ◦ F.



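(ii) Suppose that the arithmetic function α has inverse α^{−1} under Dirichlet convolution. Prove that if

\[ G(X) = \sum_{n \le X} \alpha(n) \, F\left(\frac{X}{n}\right) \quad \text{for every real number } X \in \mathbb{R}^+, \]

then

\[ F(X) = \sum_{n \le X} \alpha^{-1}(n) \, G\left(\frac{X}{n}\right) \quad \text{for every real number } X \in \mathbb{R}^+. \]

[Hint: Note that the identity function I under Dirichlet convolution satisfies I ◦ F = F.]
[Remark: If α is completely multiplicative, then α^{−1}(n) = µ(n)α(n) for every n ∈ N by Problem 20(iii). Hence

\[ G(X) = \sum_{n \le X} \alpha(n) \, F\left(\frac{X}{n}\right) \quad \text{if and only if} \quad F(X) = \sum_{n \le X} \mu(n) \alpha(n) \, G\left(\frac{X}{n}\right). \]

This is a generalization of Problems 21 and 22.]

25. For every n ∈ N, let $f(n) = \sum_{m \mid n} \frac{\mu^2(m)}{\phi(m)}$.
(i) Prove that f(n) = n/φ(n) for every n ∈ N.
(ii) Deduce that for every real number X ≥ 1, we have

\[ \sum_{n \le X} \frac{1}{\phi(n)} = \sum_{m \le X} \frac{\mu^2(m)}{m \phi(m)} \sum_{t \le X/m} \frac{1}{t}. \]

(iii) Show that the series $\sum_{m=1}^{\infty} \frac{\mu^2(m)}{m \phi(m)}$ and $\sum_{m=1}^{\infty} \frac{\mu^2(m) \log m}{m \phi(m)}$ both converge.
(iv) Deduce that as X → ∞, we have

\[ \sum_{n \le X} \frac{1}{\phi(n)} \sim C \log X, \quad \text{where} \quad C = \sum_{m=1}^{\infty} \frac{\mu^2(m)}{m \phi(m)}. \]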

26. Consider a square lattice consisting of all points (a, b), where a, b ∈ Z. Two lattice points P and Q
are said to be mutually visible if the line segment which joins them contains no lattice points other
than the endpoints P and Q.
(i) Prove that (a, b) and (0, 0) are mutually visible if and only if a and b are relatively prime.
(ii) We shall prove that the set of lattice points visible from the origin has density 6/π². Consider a large square region on the xy-plane defined by the inequalities |x| ≤ r and |y| ≤ r. Let N(r) denote the number of lattice points in this square, and let N′(r) denote the number of these which are visible from the origin. The eight lattice points nearest the origin are all visible from the origin. By symmetry, N′(r) is equal to 8 plus 8 times the number of visible points in the region {(x, y) : 2 ≤ x ≤ r and 1 ≤ y ≤ x}. Prove that

\[ N'(r) = 8 \sum_{n=1}^{r} \phi(n). \]

Obtain an asymptotic formula for N(r), and show that

\[ \frac{N'(r)}{N(r)} \to \frac{6}{\pi^2} \quad \text{as } r \to \infty. \]

Chapter 3
CONGRUENCES

3.1. Introduction

Suppose that m ∈ N and a, b ∈ Z. Then we say that a is congruent to b modulo m, denoted by a ≡ b (mod m), if m | (a − b).

Suppose that m ∈ N and c ∈ Z. Then by Theorem 1A, there exist unique q, r ∈ Z such that
c = mq + r and 0 ≤ r < m. The number r is called the residue of c modulo m, and c is said to belong
to the residue class r modulo m.

We make no notational distinction between numbers r ∈ Z and the residue classes r. We shall use
the convention that whenever r denotes a residue class, this will be explicitly stated in the text.

The following three results are simple consequences of our definition.

THEOREM 3A. Suppose that m ∈ N and a, b ∈ Z. Then a ≡ b (mod m) if and only if a and b
belong to the same residue class modulo m.

Proof. Suppose that a ≡ b (mod m). If a belongs to the residue class r modulo m, where r ∈ Z and
0 ≤ r < m, then there exists q1 ∈ Z such that a = mq1 + r. Since a ≡ b (mod m), there exists q ∈ Z
such that b = a + mq. It follows that b = m(q1 + q) + r, and so b also belongs to the residue class r
modulo m.

Conversely, suppose that a and b belong to the same residue class r modulo m, where 0 ≤ r < m.
Then there exist q1 , q2 ∈ Z such that a = mq1 + r and b = mq2 + r. It follows that a − b = m(q1 − q2 ),
and so a ≡ b (mod m). 

THEOREM 3B. Suppose that m ∈ N, and a_1, a_2, b_1, b_2 ∈ Z. Suppose further that a_1 ≡ b_1 (mod m) and a_2 ≡ b_2 (mod m). Then
(i) a_1 + a_2 ≡ b_1 + b_2 (mod m); and
(ii) a_1 a_2 ≡ b_1 b_2 (mod m).

Proof. (i) is trivial. (ii) follows from a_1 a_2 − b_1 b_2 = (a_1 − b_1)a_2 + b_1(a_2 − b_2).

THEOREM 3C. Suppose that m ∈ N, and a, b, c ∈ Z with c ≠ 0.
(i) If ac ≡ bc (mod m), then a ≡ b (mod m/(c, m)).
(ii) If furthermore (c, m) = 1, then a ≡ b (mod m).

The proof is left as an exercise.

3.2. Sets of Residues

Suppose that m ∈ N. Consider the set M = {0, 1, 2, . . . , m − 1}. A set S of m integers is said to be a complete set of residues modulo m if for every integer a ∈ M, there exists a unique element x ∈ S such that x ≡ a (mod m). It is easy to see that S is a complete set of residues modulo m if and only if S contains exactly m elements and x ≢ y (mod m) for any distinct x, y ∈ S.

On the other hand, the subset M* = {a ∈ M : (a, m) = 1} has φ(m) elements. A set T of φ(m) integers is said to be a reduced set of residues modulo m if for every integer a ∈ M*, there exists a unique element x ∈ T such that x ≡ a (mod m). It is easy to see that T is a reduced set of residues modulo m if and only if T contains exactly φ(m) elements, all coprime to m, and x ≢ y (mod m) for any distinct x, y ∈ T.

Examples. (i) The set {2, 4, 6} is a complete set of residues modulo 3. The subset {2, 4} is a reduced
set of residues modulo 3.

(ii) Suppose that p is prime. The set {1, 2, . . . , p} is a complete set of residues modulo p. The subset
{1, 2, . . . , p − 1} is a reduced set of residues modulo p.

THEOREM 3D. Suppose that m ∈ N and k ∈ Z \ {0}, where (k, m) = 1.
(i) As x runs through a complete set of residues modulo m, kx runs through a complete set of residues modulo m.
(ii) As x runs through a reduced set of residues modulo m, kx runs through a reduced set of residues modulo m.

Proof. (i) Suppose that S is a complete set of residues modulo m. If x, y ∈ S and x ≢ y (mod m), then it follows from Theorem 3C(ii) that kx ≢ ky (mod m). Hence the set {kx : x ∈ S} is a set of m integers that are pairwise incongruent modulo m, and so is a complete set of residues modulo m.

(ii) Suppose that T is a reduced set of residues modulo m. A similar argument shows that the set
{kx : x ∈ T } is a set of φ(m) integers that are pairwise incongruent modulo m. On the other hand, we
know that (kx, m) = 1 whenever (x, m) = 1. It follows that the elements in the set {kx : x ∈ T } are
coprime to m, and so the set is a reduced set of residues modulo m. 

THEOREM 3E. Suppose that a, b ∈ N, and (a, b) = 1.


(i) As x runs through a complete set of residues modulo a and y runs through a complete set of residues
modulo b, bx + ay runs through a complete set of residues modulo ab.
(ii) As x runs through a reduced set of residues modulo a and y runs through a reduced set of residues
modulo b, bx + ay runs through a reduced set of residues modulo ab.

Proof. (i) If bx1 + ay1 ≡ bx2 + ay2 (mod ab), then bx1 ≡ bx2 (mod a). It follows from Theorem 3C(ii)
that x1 ≡ x2 (mod a). Similarly y1 ≡ y2 (mod b).

(ii) Since (a, b) = 1, we have φ(ab) = φ(a)φ(b). Suppose that (x, a) = 1 and (y, b) = 1. Then it is
easy to check that

(bx + ay, a) = (bx, a) = (x, a) = 1.

Similarly,

(bx + ay, b) = (ay, b) = (y, b) = 1.

It follows easily that (bx + ay, ab) = 1. 
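Theorem 3E can be tested on a small pair of coprime moduli. The sketch below uses a = 4 and b = 9 as an arbitrary example; the helper names is_complete and is_reduced are assumptions made for this illustration.

```python
from math import gcd

def is_complete(values, m):
    """True if the values form a complete set of residues modulo m."""
    return sorted(v % m for v in values) == list(range(m))

def is_reduced(values, m):
    """True if the values form a reduced set of residues modulo m."""
    expected = [r for r in range(m) if gcd(r, m) == 1]
    return sorted(v % m for v in values) == expected

a, b = 4, 9   # any pair with gcd(a, b) = 1 would do

# (i) x over a complete set modulo a, y over a complete set modulo b.
complete = [b * x + a * y for x in range(a) for y in range(b)]
assert is_complete(complete, a * b)

# (ii) x over a reduced set modulo a, y over a reduced set modulo b.
X = [x for x in range(1, a + 1) if gcd(x, a) == 1]
Y = [y for y in range(1, b + 1) if gcd(y, b) == 1]
reduced = [b * x + a * y for x in X for y in Y]
assert is_reduced(reduced, a * b)
```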

3.3. Some Interesting Congruences

As an application of Theorem 3D, we prove the following famous result.

THEOREM 3F. (FERMAT-EULER) Suppose that m ∈ N and a ∈ Z \ {0}, where (a, m) = 1. Then a^{φ(m)} ≡ 1 (mod m).

Proof. Suppose that r_1, . . . , r_{φ(m)} form a reduced set of residues modulo m. Then it follows from Theorem 3D that ar_1, . . . , ar_{φ(m)} also form a reduced set of residues modulo m. Thus

\[ r_1 \ldots r_{\phi(m)} \equiv (a r_1) \ldots (a r_{\phi(m)}) = a^{\phi(m)} r_1 \ldots r_{\phi(m)} \pmod{m}. \]

Clearly (r_1 . . . r_{φ(m)}, m) = 1. Hence a^{φ(m)} ≡ 1 (mod m), in view of Theorem 3C(ii).

A special case of Theorem 3F is the following.

THEOREM 3G. (FERMAT’S LITTLE THEOREM) Suppose that p is a prime and a ∈ Z, where p ∤ a. Then a^{p−1} ≡ 1 (mod p).
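Both the permutation step in the proof of Theorem 3F and its conclusion can be checked by machine. The following is a minimal sketch; check_fermat_euler is a hypothetical name, and the built-in three-argument pow is used for modular exponentiation.

```python
from math import gcd

def reduced_residues(m):
    """The reduced set of residues {r : 1 <= r <= m, gcd(r, m) = 1}."""
    return [r for r in range(1, m + 1) if gcd(r, m) == 1]

def check_fermat_euler(a, m):
    """Check that a*T is again a reduced set and that a^phi(m) ≡ 1 (mod m)."""
    T = reduced_residues(m)
    scaled = sorted((a * r) % m for r in T)
    permuted = scaled == sorted(r % m for r in T)   # Theorem 3D(ii)
    # len(T) equals phi(m); compare with 1 % m so that m = 1 is handled too.
    return permuted and pow(a, len(T), m) == 1 % m

for m in range(1, 60):
    for a in range(-30, 31):
        if a != 0 and gcd(a, m) == 1:
            assert check_fermat_euler(a, m)
```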

3.4. Some Linear Congruences

Suppose that f : Z → Z is a given polynomial with integer coefficients, and m ∈ N. By the number of
solutions of the congruence f (x) ≡ 0 (mod m), we mean the number of elements x in a complete set of
residues modulo m for which the congruence holds; in other words, the number of incongruent numbers
x modulo m for which the congruence holds.

Our first result concerns the simplest of congruences.

THEOREM 3H. Suppose that m ∈ N and a, b ∈ Z. Then the congruence

(1) ax ≡ b (mod m)

is soluble if and only if (a, m) | b. In this case, the number of solutions is equal to (a, m), and the
congruence is satisfied by precisely all the numbers in a certain residue class modulo m/(a, m).

Proof. The result is trivial if a = 0, so suppose that a ≠ 0. If (1) is soluble, then there exist x_0, y_0 ∈ Z such that ax_0 + my_0 = b, and so (a, m) | b. Conversely, suppose that (a, m) | b. Since

\[ \left(\frac{a}{(a, m)}, \frac{m}{(a, m)}\right) = 1, \]

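it follows from Theorem 3D that the integers

\[ 0, \ \frac{a}{(a, m)}, \ \frac{2a}{(a, m)}, \ \ldots, \ \left(\frac{m}{(a, m)} - 1\right) \frac{a}{(a, m)} \]

form a complete set of residues modulo m/(a, m). Hence one of the numbers x_0 in the set

\[ \left\{0, 1, \ldots, \frac{m}{(a, m)} - 1\right\} \]

must satisfy

\[ \frac{a}{(a, m)} x_0 \equiv \frac{b}{(a, m)} \pmod{\frac{m}{(a, m)}}, \tag{2} \]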

whence

(3) ax0 ≡ b (mod m),

and so (1) is soluble.

Furthermore, if x ≡ x0 (mod m/(a, m)), then (2) and hence also (3) hold with x0 replaced by x.
To show that the residue class x0 modulo m/(a, m) gives all the solutions, let x be any solution of (1).
Then a(x − x0 ) ≡ 0 (mod m). It follows from Theorem 3C(i) that x − x0 ≡ 0 (mod m/(a, m)). 
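Theorem 3H gives a recipe for solving ax ≡ b (mod m). The sketch below follows it: divide through by (a, m), solve the reduced congruence (2), and list the (a, m) solutions modulo m. The function name is an assumption, and the modular inverse via pow(x, -1, n) requires Python 3.8 or later; the proof in the text finds x_0 by a different (existence) argument.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All x in {0, ..., m-1} with a*x ≡ b (mod m), as a sorted list."""
    d = gcd(a, m)
    if b % d != 0:
        return []                       # soluble only if (a, m) | b
    a1, b1, m1 = a // d, b // d, m // d
    if m1 == 1:
        x0 = 0                          # every integer solves the reduced congruence
    else:
        # a1 is coprime to m1, so it has an inverse modulo m1.
        x0 = (b1 * pow(a1 % m1, -1, m1)) % m1
    # The solutions form one residue class modulo m/(a, m).
    return [x0 + j * m1 for j in range(d)]

print(solve_linear_congruence(6, 4, 10))   # two solutions, since (6, 10) = 2 divides 4
print(solve_linear_congruence(6, 3, 10))   # no solutions, since 2 does not divide 3
```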

Our next result concerns simultaneous linear congruences.

THEOREM 3J. (CHINESE REMAINDER THEOREM) Suppose that n > 1, and that the natural
numbers m1 , . . . , mn ∈ N are pairwise coprime; in other words, (mi , mj ) = 1 whenever 1 ≤ i < j ≤ n.
Then for any a1 , . . . , an ∈ Z, the simultaneous congruences

\[
\begin{aligned}
x &\equiv a_1 \pmod{m_1} \\
&\;\;\vdots \\
x &\equiv a_n \pmod{m_n}
\end{aligned}
\]

are satisfied by precisely the members of a unique residue class modulo m1 . . . mn .

Proof. For every j = 1, . . . , n, write q_j = m_1 . . . m_{j−1} m_{j+1} . . . m_n. Then (q_j, m_j) = 1. It follows from Theorem 3H that there exists k_j ∈ Z such that q_j k_j ≡ a_j (mod m_j). Now let

\[ x_0 = \sum_{j=1}^{n} q_j k_j. \]

If x ≡ x_0 (mod m_1 . . . m_n), then x ≡ x_0 ≡ q_i k_i ≡ a_i (mod m_i) for every i = 1, . . . , n. On the other hand, if x is a solution to the simultaneous congruences, then x ≡ a_i ≡ x_0 (mod m_i) for every i = 1, . . . , n. Hence x ≡ x_0 (mod m_1 . . . m_n).
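The proof is constructive and can be turned into a short program. The sketch below builds x_0 exactly as above, with q_j the product of the other moduli and k_j a solution of q_j k_j ≡ a_j (mod m_j). The function name crt is hypothetical, and math.prod and pow(x, -1, n) assume Python 3.8 or later.

```python
from math import prod

def crt(residues, moduli):
    """Return (x0, M) with x0 ≡ a_j (mod m_j) for every j and M = m_1 ... m_n.

    The moduli are assumed to be pairwise coprime, as in Theorem 3J.
    """
    M = prod(moduli)
    x0 = 0
    for a_j, m_j in zip(residues, moduli):
        if m_j == 1:
            continue                       # the congruence modulo 1 is vacuous
        q_j = M // m_j                     # product of the other moduli
        k_j = (a_j * pow(q_j % m_j, -1, m_j)) % m_j   # q_j k_j ≡ a_j (mod m_j)
        x0 += q_j * k_j
    return x0 % M, M

# The system x ≡ 1 (mod 2), x ≡ 2 (mod 3), x ≡ 3 (mod 5).
x0, M = crt([1, 2, 3], [2, 3, 5])
assert all(x0 % m == a % m for a, m in zip([1, 2, 3], [2, 3, 5]))
print(x0, M)
```

Every solution is then of the form x0 + tM with t ∈ Z.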

3.5. Some Polynomial Congruences

Our first result follows from Fermat’s little theorem.

THEOREM 3K. Suppose that p is prime. Then for any polynomial f : Z → Z with integer coefficients, there exists a polynomial g : Z → Z with integer coefficients and of degree less than p such that f(x) ≡ g(x) (mod p) for every x ∈ Z.

Proof. In view of Theorem 3B, it suffices to prove Theorem 3K for the polynomial f(x) = x^n, where n is a fixed positive integer. It is not difficult to show that there exist q, r ∈ Z such that n = (p − 1)q + r and 1 ≤ r ≤ p − 1. If p ∤ x, then it follows from Theorem 3G that

\[ x^n = (x^{p-1})^q x^r \equiv 1^q x^r \equiv x^r \pmod{p}, \]

whence the result. If p | x, then x ≡ 0 (mod p), so that x^n ≡ 0 ≡ x^r (mod p).

Having reduced the degree of the polynomial, we now show that in many cases, we cannot have too
many solutions.

THEOREM 3L. (LAGRANGE) Suppose that f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_0 is a polynomial with integer coefficients. Suppose further that p is prime, and p ∤ a_n. Then the congruence

(4) f(x) ≡ 0 (mod p)

has at most n solutions.

Proof. The case n = 0 is trivial. The case n = 1 follows from Theorem 3H. Let n > 1 and assume that the result is true for all polynomials of degree n − 1. Suppose on the contrary that (4) has at least n + 1 incongruent solutions x_0, x_1, . . . , x_n. Then

\[ f(x) - f(x_0) = \sum_{k=1}^{n} a_k (x^k - x_0^k) = (x - x_0) \sum_{k=1}^{n} a_k (x^{k-1} + x^{k-2} x_0 + \ldots + x_0^{k-1}) = (x - x_0) g(x), \]

where g(x) = a_n x^{n−1} + . . . . It follows that (x_j − x_0)g(x_j) ≡ 0 (mod p) for every j = 1, . . . , n. Since p is prime and x_j ≢ x_0 (mod p), we have g(x_j) ≡ 0 (mod p) for every j = 1, . . . , n, contradicting the inductive hypothesis.

On the other hand, if a polynomial has many solutions, then we can say quite a lot about its
coefficients.

THEOREM 3M. Suppose that f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_0 is a polynomial with integer coefficients. Suppose further that p is prime, and the congruence f(x) ≡ 0 (mod p) has more than n solutions. Then p | a_j for every j = 0, 1, . . . , n.

Proof. Suppose on the contrary that some coefficient is not divisible by p. Let k be the largest index such that p ∤ a_k. Then k ≤ n. On the other hand, since

\[ a_n x^n + a_{n-1} x^{n-1} + \ldots + a_{k+1} x^{k+1} \equiv 0 \pmod{p} \]

for every x ∈ Z, it follows that the congruence

\[ a_k x^k + a_{k-1} x^{k-1} + \ldots + a_0 \equiv 0 \pmod{p} \]

has more than k solutions, contradicting Theorem 3L.

We conclude this section by using polynomial congruences to prove an interesting congruence result.

THEOREM 3N. (WILSON) For every prime p, we have

(p − 1)! ≡ −1 (mod p).

Proof. The polynomial

\[ f(x) = (x^{p-1} - 1) - \prod_{m=1}^{p-1} (x - m) \]

has degree at most p − 2, but has p − 1 roots modulo p, in view of Theorem 3G. It follows from Theorem 3M that all the coefficients are divisible by p. Note that the coefficient of x^0 is −1 − (−1)^{p−1}(p − 1)!.

Remark. We can also prove Wilson’s theorem in the following way. The theorem is obvious if p ≤ 3, so we assume that p > 3. Suppose that x ≢ 0 (mod p). Then it follows from Theorem 3H that there exists a unique x′ modulo p such that xx′ ≡ 1 (mod p). Moreover, if x ≡ x′ (mod p), then x ≡ 1 (mod p) or x ≡ −1 (mod p). It follows that the numbers 2, 3, . . . , p − 2 can be paired off into (p − 3)/2 mutually reciprocal pairs modulo p, so that (p − 2)! ≡ 1 (mod p). The result follows easily.
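The pairing argument of the Remark can be made concrete. The sketch below lists the mutually reciprocal pairs among 2, . . . , p − 2 and checks Wilson's congruence for small primes; reciprocal_pairs and is_prime are hypothetical helper names, and pow(x, -1, p) assumes Python 3.8 or later.

```python
from math import factorial

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def reciprocal_pairs(p):
    """Pair off 2, ..., p-2 into couples (x, x') with x * x' ≡ 1 (mod p)."""
    pairs = set()
    for x in range(2, p - 1):
        inverse = pow(x, -1, p)
        pairs.add((min(x, inverse), max(x, inverse)))
    return sorted(pairs)

p = 11
pairs = reciprocal_pairs(p)
assert len(pairs) == (p - 3) // 2                 # (p - 3)/2 pairs
assert all(x != y and (x * y) % p == 1 for x, y in pairs)

# Wilson's theorem: (p - 1)! ≡ -1 (mod p) for every prime p.
for q in range(2, 60):
    if is_prime(q):
        assert factorial(q - 1) % q == q - 1
```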

3.6. Primitive Roots

Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. Then there exist numbers n ∈ N such that

(5) a^n ≡ 1 (mod m).

For example, as shown in Theorem 3F, the number n = φ(m) satisfies the requirement. The smallest n ∈ N for which the congruence (5) holds is called the exponent to which a belongs modulo m.

THEOREM 3P. Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. If a belongs to the exponent n modulo m, then the numbers 1, a, a², . . . , a^{n−1} are incongruent modulo m.

Proof. Suppose on the contrary that there exist ℓ, k ∈ Z such that 0 ≤ ℓ < k ≤ n − 1 and a^ℓ ≡ a^k (mod m). Then a^{k−ℓ} ≡ 1 (mod m). But k − ℓ < n, and this contradicts the minimality of n.

THEOREM 3Q. Suppose that a ∈ Z \ {0} and m ∈ N, where (a, m) = 1. Suppose further that a belongs to the exponent n modulo m, and ℓ, k ∈ N ∪ {0}. Then a^ℓ ≡ a^k (mod m) if and only if ℓ ≡ k (mod n). In particular, a^ℓ ≡ 1 (mod m) if and only if n | ℓ.

Proof. There exist u, v, r, s ∈ Z with 0 ≤ r, s < n such that ℓ = nu + r and k = nv + s. Since ℓ, k ≥ 0, it follows that u, v ≥ 0. By Theorem 3A, we have ℓ ≡ k (mod n) if and only if r = s. On the other hand, we have

\[ a^{\ell} = (a^n)^u a^r \equiv a^r \pmod{m} \]

and

\[ a^k = (a^n)^v a^s \equiv a^s \pmod{m}. \]

By Theorem 3P, we have a^r ≡ a^s (mod m) if and only if r = s. The result follows.

An immediate consequence of Theorems 3F and 3Q is that the exponent to which a belongs modulo
m is a divisor of φ(m). However, if the exponent to which a belongs modulo m is actually φ(m), then
we say that a is a primitive root modulo m.
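The exponent to which a belongs modulo m can be found by repeated multiplication. The following sketch is a naive illustration, not an efficient algorithm; the names order and primitive_roots are assumptions made here.

```python
from math import gcd

def phi(m):
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def order(a, m):
    """The exponent to which a belongs modulo m: the least n >= 1 with a^n ≡ 1 (mod m)."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    n, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        n += 1
    return n

def primitive_roots(m):
    """Residues a in {1, ..., m} that belong to the exponent phi(m) modulo m."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1 and order(a, m) == phi(m)]

# The exponent always divides phi(m), as noted after Theorem 3Q.
for m in range(2, 40):
    for a in range(1, m):
        if gcd(a, m) == 1:
            assert phi(m) % order(a, m) == 0

print(primitive_roots(7), primitive_roots(8))
```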

A natural question is then to determine those values of m ∈ N for which primitive roots modulo m
exist. Thanks to Gauss, we have a complete answer to this interesting question.

3.7. A Theorem of Gauss

Our first task is to show that there are certain values of m ∈ N for which primitive roots modulo m
exist. We have the following three results.

THEOREM 3R. Suppose that p is prime. Then for every n ∈ N satisfying n | (p − 1), there are
exactly φ(n) incongruent numbers modulo p which belong to the exponent n modulo p. In particular,
there are φ(p − 1) = φ(φ(p)) primitive roots modulo p.

Proof. Suppose that n | (p − 1). Let ψ(n) denote the number of incongruent numbers modulo p which belong to the exponent n modulo p. We shall show that ψ(n) = φ(n). To see this, let θ(n) denote the number of solutions of the congruence

(6) x^n ≡ 1 (mod p).

By Theorem 3Q, an integer x is a solution of (6) if and only if the exponent k to which x belongs modulo p satisfies k | n. Hence

\[ \theta(n) = \sum_{k \mid n} \psi(k). \]

Note next that

\[ x^{p-1} - 1 = (x^n - 1)(x^{p-1-n} + x^{p-1-2n} + \ldots + x^n + 1). \]

By Fermat’s little theorem, the congruence

\[ x^{p-1} - 1 \equiv 0 \pmod{p} \]

has exactly p − 1 solutions. On the other hand, by Lagrange’s theorem, the congruence (6) has at most n solutions and the congruence

\[ x^{p-1-n} + x^{p-1-2n} + \ldots + x^n + 1 \equiv 0 \pmod{p} \]

has at most p − 1 − n solutions. It follows that (6) must have exactly n solutions, and so

\[ \sum_{k \mid n} \psi(k) = n. \]

It now follows from the Möbius inversion formula and Theorem 2R that

\[ \psi(n) = \sum_{k \mid n} \mu(k) \frac{n}{k} = \phi(n). \]

This completes the proof. 
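Theorem 3R predicts how the residues 1, . . . , p − 1 are distributed among the possible exponents. The sketch below tabulates this for a small prime; order_counts is a hypothetical name and the order computation is the naive one.

```python
from collections import Counter
from math import gcd

def phi(n):
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

def order(a, p):
    """Least n >= 1 with a^n ≡ 1 (mod p), for a not divisible by the prime p."""
    n, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        n += 1
    return n

def order_counts(p):
    """Map each exponent n to the number psi(n) of residues belonging to it modulo p."""
    return Counter(order(a, p) for a in range(1, p))

p = 13
counts = order_counts(p)
for n in range(1, p):
    if (p - 1) % n == 0:
        assert counts[n] == phi(n)      # psi(n) = phi(n) for every n | (p - 1)
assert sum(counts.values()) == p - 1
```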

THEOREM 3S. Suppose that p is an odd prime, and g is a primitive root modulo p. Then there exists t ∈ Z such that the integer u, defined by the equation

\[ (g + pt)^{p-1} = 1 + pu, \]

is not divisible by p. In this case, g + pt is a primitive root modulo p^r for every r ∈ N.

Proof. Since g^{p−1} = 1 + pq for some q ∈ Z, it follows that there exist r, s ∈ Z such that

\[
\begin{aligned}
(g + px)^{p-1} &= 1 + pq + (p - 1) g^{p-2} px + p^2 r \\
&= 1 + p(q - x g^{p-2} + ps) \\
&= 1 + py,
\end{aligned}
\tag{7}
\]

where

\[ y = q - x g^{p-2} + ps \equiv q - x g^{p-2} \pmod{p}. \]

As x runs through a complete set of residues modulo p, so does y, in view of Theorem 3D. Hence there exists a value of x, say t, for which p ∤ y, and let u be the corresponding value of y. It follows from (7) that for this value of t, we have

\[ (g + pt)^{(p-1)p} = (1 + pu)^p = 1 + p^2 u + p^3 u' = 1 + p^2 u_2, \]

where p ∤ u_2. Similarly,

\[ (g + pt)^{(p-1)p^2} = 1 + p^3 u_3, \]

where p ∤ u_3, and so on. Suppose that g + pt belongs to the exponent n modulo p^r, so that (g + pt)^n ≡ 1 (mod p^r). Then clearly (g + pt)^n ≡ 1 (mod p), and so g^n ≡ 1 (mod p). Since g is a primitive root modulo p, we must have (p − 1) | n. On the other hand, n | φ(p^r) = p^{r−1}(p − 1). Hence n = p^{s−1}(p − 1) for some integer s satisfying 1 ≤ s ≤ r. Recall now that

\[ (g + pt)^n = (g + pt)^{(p-1)p^{s-1}} = 1 + p^s u_s, \]

where p ∤ u_s. It follows that

\[ 1 + p^s u_s \equiv 1 \pmod{p^r}, \]

so that p^s u_s ≡ 0 (mod p^r). We therefore must have s = r, and so n = φ(p^r).
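The choice of t in Theorem 3S can be found by a short search, since p ∤ u is equivalent to (g + pt)^{p−1} ≢ 1 (mod p²). The sketch below is an illustration with hypothetical names (lift, order); it takes p = 5 and g = 7, a primitive root modulo 5 that fails to be one modulo 25, and verifies the conclusion for r = 1, 2, 3.

```python
def order(a, m):
    """Least n >= 1 with a^n ≡ 1 (mod m), assuming gcd(a, m) = 1 and m > 1."""
    n, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        n += 1
    return n

def lift(g, p):
    """Return t in {0, ..., p-1} with (g + p*t)^(p-1) = 1 + p*u and p not dividing u."""
    for t in range(p):
        # p divides u exactly when (g + p*t)^(p-1) ≡ 1 (mod p^2).
        if pow(g + p * t, p - 1, p * p) != 1:
            return t
    raise ValueError("no suitable t found; is g a primitive root modulo p?")

p, g = 5, 7
assert order(g, p) == p - 1            # 7 is a primitive root modulo 5
assert order(g, p * p) != p * (p - 1)  # but not modulo 25
t = lift(g, p)
h = g + p * t
for r in (1, 2, 3):
    # h belongs to the exponent phi(p^r) = p^(r-1) (p - 1) modulo p^r.
    assert order(h, p ** r) == p ** (r - 1) * (p - 1)
```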

THEOREM 3T. Suppose that p is an odd prime, and g is an odd primitive root modulo p^r, where r ∈ N. Then g is a primitive root modulo 2p^r.

Remark. Note that since there exist primitive roots modulo p^r, there must exist odd primitive roots modulo p^r. To see this, note that if h is an even primitive root modulo p^r, then g = h + p^r is an odd primitive root modulo p^r.

Proof of Theorem 3T. Note first of all that every odd integer x which satisfies x^n ≡ 1 (mod p^r) clearly satisfies x^n ≡ 1 (mod 2p^r), and vice versa. It follows that if g is an odd primitive root modulo p^r, then it belongs to the exponent φ(p^r) modulo 2p^r. Note, however, that φ(p^r) = φ(2p^r).

We are now in a position to determine precisely those values of m ∈ N for which primitive roots
modulo m exist. We prove the following beautiful result.

THEOREM 3U. (GAUSS) Suppose that m ∈ N and m > 1. Then there exist primitive roots modulo m if and only if m = 2, 4, p^r, 2p^r, where p is an odd prime and r ∈ N.

Proof. For m = 4, it is easy to check that 3 is a primitive root. The existence of primitive roots to the other moduli follows from the previous three theorems.

Suppose now that m = p_1^{u_1} . . . p_r^{u_r}, where the natural numbers p_1 < . . . < p_r are primes and the integers u_i > 0 for i = 1, . . . , r. For every i = 1, . . . , r, write m_i = p_i^{u_i}, so that m = m_1 . . . m_r, and let ℓ = [φ(m_1), . . . , φ(m_r)] be the least common multiple of φ(m_1), . . . , φ(m_r). Suppose now that a ∈ Z \ {0} and (a, m) = 1. For every i = 1, . . . , r, we have by Theorem 3F that a^{φ(m_i)} ≡ 1 (mod m_i), so that a^ℓ ≡ 1 (mod m_i). It follows that a^ℓ ≡ 1 (mod m). We have to show that if m is not one of the stated values, then

\[ \ell < \phi(m) = \phi(m_1) \ldots \phi(m_r). \]

If p is a prime, then φ(p^u) = p^{u−1}(p − 1) is even if p > 2 or if p = 2 and u ≥ 2, and so φ(p^u) is even whenever p^u > 2. It follows that if two of the values m_1, . . . , m_r exceed 2, then ℓ < φ(m). It remains to show that there are no primitive roots modulo 2^u, where u ≥ 3. We shall do this by proving that for every odd integer a and every integer u ≥ 3, we have

\[ a^{\frac{1}{2} \phi(2^u)} \equiv 1 \pmod{2^u}. \tag{8} \]

For u = 3, we note that a² ≡ 1 (mod 8). Suppose now that (8) holds for u = k; in other words, suppose that

\[ a^{\frac{1}{2} \phi(2^k)} = 1 + 2^k t, \]

where t ∈ Z. Squaring both sides, we obtain

\[ a^{\phi(2^k)} = 1 + 2^{k+1} t + 2^{2k} t^2 \equiv 1 \pmod{2^{k+1}}. \]

This completes the proof, since φ(2^k) = ½ φ(2^{k+1}).
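Gauss's criterion can be compared with a brute-force search for small moduli. The sketch below is only an illustration; has_primitive_root and is_gauss_form are hypothetical names, and the search is deliberately naive.

```python
from math import gcd

def phi(m):
    return sum(1 for x in range(1, m + 1) if gcd(x, m) == 1)

def order(a, m):
    """Least n >= 1 with a^n ≡ 1 (mod m), assuming gcd(a, m) = 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        n += 1
    return n

def has_primitive_root(m):
    target = phi(m)
    return any(order(a, m) == target for a in range(1, m) if gcd(a, m) == 1)

def is_gauss_form(m):
    """True if m is 2, 4, p^r or 2 p^r for an odd prime p and r >= 1."""
    if m in (2, 4):
        return True
    if m % 2 == 0:
        m //= 2
    if m % 2 == 0 or m == 1:
        return False
    p = 3
    while m % p != 0:      # find the smallest (odd) prime factor of m
        p += 2
    while m % p == 0:      # strip every power of that prime
        m //= p
    return m == 1

for m in range(2, 100):
    assert has_primitive_root(m) == is_gauss_form(m)
```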

Problems for Chapter 3

1. Prove that 7 | (32n+1 + 2n+2 ) for every n ∈ N.

2. Prove that every year, including a leap year, has a Friday 13th.
[Hint: For those who are superstitious, prove instead that every year, including a leap year, has a
Sunday 1st. The two statements are the same!]

3. Prove that 5n^3 + 7n^5 ≡ 0 (mod 12) for every n ∈ Z.

4. Show that 1^2, 2^2, . . . , m^2 is not a complete set of residues modulo m if m > 2.

5. Suppose that a, b, p ∈ N and p is prime. Show that (a + b)^p ≡ a^p + b^p (mod p).

6. Suppose that a, b, p ∈ N and p > 2 is prime. Show that if a^p + b^p ≡ 0 (mod p), then we must have
a^p + b^p ≡ 0 (mod p^2).

7. Suppose that p > 2 is a prime. Show that 1^p + 2^p + . . . + (p − 1)^p ≡ 0 (mod p).

8. Find all x ∈ Z such that simultaneously x ≡ 1 (mod 2), x ≡ 2 (mod 3), x ≡ 3 (mod 5).

9. Suppose that m ∈ N and a, b ∈ Z such that (a, m) = 1. Prove that

Σ_{x=1}^{m} {(ax + b)/m} = (m − 1)/2.

10. Suppose that m_1, . . . , m_k are integers greater than 1 and which are pairwise coprime, and write
m = m_1 . . . m_k. Suppose further that x_1, . . . , x_k, x run through complete sets of residues and
y_1, . . . , y_k, y run through reduced sets of residues modulo m_1, . . . , m_k, m respectively. Prove that
the fractions

{x_1/m_1 + . . . + x_k/m_k} and {x/m}

coincide, as do the fractions

{y_1/m_1 + . . . + y_k/m_k} and {y/m}.

11. Suppose that a, m ∈ N satisfy (a, m) = 1 and m > 1. Prove that

Σ_{y=1, (y,m)=1}^{m} {ay/m} = φ(m)/2.

[Hint: Denote the above sum by S(m), and show that for every d ∈ N, we have

S(d) = Σ_{x=1, (x,d)=1}^{d} x/d.

Then consider the sum Σ_{d|m} S(d), and use the Möbius inversion formula.]

12. The number g = 10^{100} is called a googol. Show that there exist a googol consecutive integers each
of which is divisible by the square of a prime.
[Hint: Use the Chinese remainder theorem.]
[Remark: Can you prove that there are arbitrarily long gaps between consecutive primes?]

13. Suppose that p is a prime. Suppose further that h and k are non-negative integers such that
h + k = p − 1. Prove that h!k! + (−1)^h ≡ 0 (mod p).

14. Suppose that p is an odd prime.

(i) Prove that 1^2 3^2 5^2 . . . (p − 2)^2 ≡ 2^2 4^2 6^2 . . . (p − 1)^2 ≡ (−1)^{(p+1)/2} (mod p).

(ii) Deduce that (((p − 1)/2)!)^2 ≡ (−1)^{(p+1)/2} (mod p).

15. Suppose that p is a prime, and that n ∈ Z. Write C(n, p) for the binomial coefficient "n choose p".

(i) Prove that C(n, p) ≡ [n/p] (mod p).

(ii) Suppose that α ∈ N and p^α divides [n/p]. Prove that p^α also divides C(n, p).

16. Suppose that n ∈ N, and that there exists a ∈ Z such that n | (a^{n−1} − 1). Suppose further that
n ∤ (a^x − 1) whenever 1 ≤ x ≤ n − 2. Show that n is prime.

17. Let

S_n(p) = Σ_{k=1}^{p−1} k^n,

where p is an odd prime and n is an integer greater than 1. Prove that S_n(p) ≡ −1 (mod p) if
(p − 1) | n, and that S_n(p) ≡ 0 (mod p) if (p − 1) ∤ n.

[Hint: Let g be a primitive root modulo p. Show that S_n(p) ≡ Σ_{j=0}^{p−2} g^{jn} (mod p).]

18. Suppose that p is a prime. Prove that the sum of the primitive roots modulo p is congruent to
µ(p − 1) modulo p.

19. Suppose that p > 3 is a prime. Prove that the product of the primitive roots modulo p is congruent
to 1 modulo p.

20. Suppose that p > 2 is a prime.


(i) Prove that for any integer a > 1, any odd prime divisor of a^p − 1 either divides a − 1 or is of
the form 2px + 1.
(ii) Prove that for any integer a > 1, any odd prime divisor of a^p + 1 either divides a + 1 or is of
the form 2px + 1.
(iii) Prove that there are infinitely many primes of the form 2px + 1.
(iv) Prove that any prime divisor of 2^{2^n} + 1, where n ∈ N, is of the form 2^{n+1} x + 1.

21. Suppose that a, n ∈ N and a > 1. Prove that n divides φ(a^n − 1).

Chapter 4
QUADRATIC RESIDUES

4.1. Introduction

In this chapter, we are concerned with quadratic congruences of the form

(1) x^2 ≡ a (mod p),

where p is an odd prime and a ∈ Z. We are interested in determining whether for given p and a, the
congruence (1) has a solution x ∈ Z.

If a ≡ 0 (mod p), then clearly (1) is soluble, with x ≡ 0 (mod p) being the only solution. We
therefore make the assumption that a ≢ 0 (mod p). If (1) is soluble, then we say that a is a quadratic
residue modulo p. If (1) is not soluble, then we say that a is a quadratic non-residue modulo p.

THEOREM 4A. Suppose that p is an odd prime. Then there are precisely (p−1)/2 quadratic residues
modulo p, and these are represented by the numbers

(2) 1^2, 2^2, . . . , ((p − 1)/2)^2.

Proof. Suppose that p ∤ a. Then it follows from Lagrange’s theorem that the congruence (1) has at
most two solutions. On the other hand, if x ≡ b (mod p) is a solution, then it is easy to check that
x ≡ p − b (mod p) represents another solution. It follows that the congruence (1) has either two solutions
or no solutions. Note next that any solution of the congruence (1) must be of the form x ≡ b (mod p),
with 1 ≤ b ≤ p − 1. It follows that there can be at most (p − 1)/2 quadratic residues modulo p. It
remains to show that there are at least (p − 1)/2 quadratic residues modulo p. To do so, note that the
(p − 1)/2 numbers in (2) are clearly quadratic residues modulo p. It therefore suffices to show that they
are incongruent modulo p. Suppose on the contrary that x^2 ≡ y^2 (mod p), with 1 ≤ x < y ≤ (p − 1)/2.
Then p | (y − x)(y + x), a contradiction since p is prime and 0 < y − x < y + x < p.
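As an illustration of Theorem 4A, the squares of 1, 2, . . . , (p − 1)/2 can simply be listed for a small prime. This is a sketch under the assumption that p is an odd prime; the function name is hypothetical.

```python
def quadratic_residues(p):
    """Nonzero quadratic residues modulo an odd prime p, from 1^2, ..., ((p-1)/2)^2."""
    half = (p - 1) // 2
    return sorted({(x * x) % p for x in range(1, half + 1)})

# For p = 11 the squares 1, 4, 9, 16, 25 reduce to five distinct residues.
print(quadratic_residues(11))  # [1, 3, 4, 5, 9]
```

The set has exactly (p − 1)/2 elements, and squaring the remaining numbers (p + 1)/2, . . . , p − 1 produces no new residues.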

It is convenient to introduce the Legendre symbol, defined as follows. Suppose that p is an odd
prime. Then we write

(a/p)_L = 1 if p ∤ a and a is a quadratic residue modulo p,
(a/p)_L = −1 if p ∤ a and a is a quadratic non-residue modulo p,
(a/p)_L = 0 if p | a.

4.2. The Legendre Symbol

In this section, we analyze the Legendre symbol in a systematic way to provide some practical means of
evaluating its value. Our first step is not particularly useful in itself, but provides a path towards results
of a more practical nature.

THEOREM 4B. (EULER’S CRITERION) Suppose that p is an odd prime. For every a ∈ Z, we
have

(a/p)_L ≡ a^{(p−1)/2} (mod p).

Proof. The result clearly holds if p | a, so we assume now that p ∤ a. If a is a quadratic residue
modulo p, then there exists x ∈ Z such that p ∤ x and x^2 ≡ a (mod p). It follows from Fermat’s little
theorem that

a^{(p−1)/2} ≡ x^{p−1} ≡ 1 = (a/p)_L (mod p).

Consider next the congruence

(a^{(p−1)/2} − 1)(a^{(p−1)/2} + 1) ≡ 0 (mod p).

By Fermat’s little theorem, this has p − 1 solutions. On the other hand, by Lagrange’s theorem, neither

(3) a^{(p−1)/2} − 1 ≡ 0 (mod p)

nor

(4) a^{(p−1)/2} + 1 ≡ 0 (mod p)

has more than (p − 1)/2 solutions. It follows that each of (3) and (4) has exactly (p − 1)/2 solutions.
The (p − 1)/2 quadratic residues a modulo p all satisfy (3). It follows that all the quadratic non-residues
a must satisfy (4).
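Euler's criterion translates directly into a short computation by modular exponentiation. The sketch below assumes that p is an odd prime; the name legendre is hypothetical and is not taken from the text.

```python
def legendre(a, p):
    """Legendre symbol (a/p)_L for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)      # r is one of 0, 1, p - 1
    return r - p if r == p - 1 else r

def legendre_by_search(a, p):
    """The same symbol straight from the definition, for comparison."""
    if a % p == 0:
        return 0
    return 1 if any((x * x - a) % p == 0 for x in range(1, p)) else -1
```

For example, legendre(2, 7) returns 1 because 3^2 ≡ 2 (mod 7), while legendre(3, 7) returns −1.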

We have immediately the following two consequences.

THEOREM 4C. Suppose that p is an odd prime. Then

(−1/p)_L = (−1)^{(p−1)/2}.

Proof. Taking a = −1 in Theorem 4B, we obtain

(−1/p)_L ≡ (−1)^{(p−1)/2} (mod p).

Note, however, that

(−1/p)_L − (−1)^{(p−1)/2} ∈ {−2, 0, 2}.

The result follows, since p > 2 divides this difference.

THEOREM 4D. Suppose that p is an odd prime. Then for every a, b ∈ Z, we have

(ab/p)_L = (a/p)_L (b/p)_L.

Proof. The result is trivial if p | a or p | b, so we assume now that p ∤ a and p ∤ b. It follows from
Theorem 4B that

(ab/p)_L ≡ (ab)^{(p−1)/2} ≡ a^{(p−1)/2} b^{(p−1)/2} ≡ (a/p)_L (b/p)_L (mod p).

Note, however, that

(ab/p)_L − (a/p)_L (b/p)_L ∈ {−2, 0, 2}.

The result follows.

In practice, Euler’s criterion is not very useful when p is a rather large prime. The following
represents a result of a more practical nature.

THEOREM 4E. (GAUSS’S LEMMA) Suppose that p is an odd prime, and the integer a ∈ Z
satisfies p ∤ a. Let

m = #{x ∈ N : 1 ≤ x < p/2 and p/2 < ax − p[ax/p] < p};

in other words, m is the number of integers x satisfying 1 ≤ x < p/2 for which the residue r_x of ax
satisfies p/2 < r_x < p. Then

(a/p)_L = (−1)^m.

Proof. By Euler’s criterion, we have

(5) Π_{x=1}^{(p−1)/2} r_x ≡ Π_{x=1}^{(p−1)/2} ax = a^{(p−1)/2} ((p − 1)/2)! ≡ (a/p)_L ((p − 1)/2)! (mod p).

Let α_1, . . . , α_m denote the m values of r_x for which p/2 < r_x < p, and let β_1, . . . , β_ℓ, where ℓ + m =
(p − 1)/2, denote the ℓ values of r_x for which 0 < r_x < p/2. Then

(6) Π_{x=1}^{(p−1)/2} r_x = (Π_{i=1}^{m} α_i)(Π_{j=1}^{ℓ} β_j) ≡ (−1)^m (Π_{i=1}^{m} (p − α_i))(Π_{j=1}^{ℓ} β_j) (mod p).

Clearly, for every i = 1, . . . , m, we have 0 < p − α_i < p/2. Also, for every j = 1, . . . , ℓ, we have
0 < β_j < p/2. Note also that the numbers α_1, . . . , α_m are distinct, and the numbers β_1, . . . , β_ℓ are also
distinct. Furthermore, for every i = 1, . . . , m and every j = 1, . . . , ℓ, the numbers p − α_i and β_j are
different, for p − α_i = β_j would give ax ≡ −ay (mod p), and hence x + y ≡ 0 (mod p), for some x, y ∈ Z
satisfying 1 ≤ x < y ≤ (p − 1)/2, clearly impossible. Hence

(7) (Π_{i=1}^{m} (p − α_i))(Π_{j=1}^{ℓ} β_j) = ((p − 1)/2)!.

The result now follows on combining (5)–(7).
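Gauss's lemma can be illustrated by counting the residues r_x that exceed p/2. The sketch below assumes that p is an odd prime with p ∤ a; the function name is hypothetical.

```python
def gauss_lemma_symbol(a, p):
    """(a/p)_L computed as (-1)^m, where m counts x in 1..(p-1)/2 whose
    least non-negative residue of a*x modulo p exceeds p/2."""
    half = (p - 1) // 2
    # For odd p, a residue r satisfies r > p/2 exactly when r > (p - 1)/2.
    m = sum(1 for x in range(1, half + 1) if (a * x) % p > half)
    return -1 if m % 2 else 1
```

With a = 2 and p = 7, the residues of 2, 4, 6 are 2, 4, 6, of which two exceed 7/2, so the symbol is +1.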

THEOREM 4F. Suppose that p is an odd prime. Then

(2/p)_L = (−1)^{[p/2]−[p/4]} = (−1)^{(p^2−1)/8}.

Proof. The numbers 2, 4, 6, . . . , p − 1 all lie between 0 and p, and so are their own residues modulo
p. Moreover, p/2 < 2x < p if and only if p/4 < x < p/2. Hence we must have m = [p/2] − [p/4]. The
second equality is obtained by checking each of the residue classes of p modulo 8.

4.3. Quadratic Reciprocity

Suppose that p, q ∈ N are distinct odd primes. There is a beautiful result which links the solubility of
the two quadratic congruences

x^2 ≡ q (mod p) and x^2 ≡ p (mod q),

in the sense that if we know whether one of these two congruences is soluble, then the determination of
whether the other congruence is soluble involves only a simple calculation.

THEOREM 4G. (LAW OF QUADRATIC RECIPROCITY) Suppose that p, q ∈ N are distinct odd
primes. Then

(q/p)_L (p/q)_L = (−1)^{((p−1)/2)((q−1)/2)}.

Theorem 4G will follow from the following three results.

THEOREM 4H. Suppose that p is an odd prime, and the integer a ∈ Z satisfies p ∤ a. Then

(a/p)_L = (−1)^n,

where

n = Σ_{y=1}^{(p−1)/2} [2ay/p].

Proof. We shall use Gauss’s lemma. In the notation of Theorem 4E, we have

m = #{x ∈ N : 1 ≤ x ≤ (p − 1)/2 and 1 < 2ax/p − 2[ax/p] < 2}.

Also, for any x ∈ Z satisfying 1 ≤ x ≤ (p − 1)/2, we must have

0 < 2ax/p − 2[ax/p] < 2.

Hence

m = Σ_{x=1}^{(p−1)/2} ([2ax/p] − 2[ax/p]) ≡ n (mod 2).

This completes the proof.

THEOREM 4J. Suppose that p, q ∈ N are distinct odd primes. Then

(q/p)_L = (−1)^{λ(p,q)},

where

λ(p, q) = Σ_{x=1}^{(p−1)/2} [qx/p].

Proof. Suppose that a ∈ Z is odd. Then 2 | (a + p), and

((1/2)(a + p)/p)_L = (4 · (1/2)(a + p)/p)_L = (2(a + p)/p)_L = (2a/p)_L = (2/p)_L (a/p)_L,

in view of Theorem 4D. It follows from Theorem 4H that

(2/p)_L (a/p)_L = ((1/2)(a + p)/p)_L = (−1)^r,

where

r = Σ_{y=1}^{(p−1)/2} [(a + p)y/p].

Now

Σ_{y=1}^{(p−1)/2} [(a + p)y/p] = Σ_{y=1}^{(p−1)/2} ([ay/p] + y) = Σ_{y=1}^{(p−1)/2} [ay/p] + (p^2 − 1)/8.

Putting a = 1, we deduce (again) that

(2/p)_L = (−1)^{(p^2−1)/8}.

It now follows that for odd prime q, we must have

(q/p)_L = (−1)^s, where s = Σ_{y=1}^{(p−1)/2} [qy/p].

This completes the proof.


THEOREM 4K. Suppose that p, q ∈ N are distinct odd primes. Then in the notation of Theorem
4J, we have

λ(p, q) + λ(q, p) = ((p − 1)/2)((q − 1)/2).

Proof. We have

λ(p, q) = Σ_{1≤x<p/2} [qx/p] = Σ_{1≤x<p/2} Σ_{1≤y<qx/p} 1 = Σ_{1≤y<q/2} Σ_{py/q<x<p/2} 1,

since qx/p ∉ Z when x < p. Also,

λ(q, p) = Σ_{1≤y<q/2} Σ_{1≤x<py/q} 1.

It follows that

λ(p, q) + λ(q, p) = Σ_{1≤y<q/2} Σ_{1≤x<p/2} 1 = ((p − 1)/2)((q − 1)/2),

since both p and q are odd.
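The counting identity of Theorem 4K, and with it the law of quadratic reciprocity via Theorem 4J, can be checked numerically for small primes. This is only a sketch; the name lam stands for λ and is a hypothetical choice.

```python
def lam(p, q):
    """lambda(p, q): the sum of [q*x/p] over x = 1, ..., (p-1)/2."""
    return sum((q * x) // p for x in range(1, (p - 1) // 2 + 1))

def symbol_from_lambda(q, p):
    """(q/p)_L for distinct odd primes p and q, as (-1)^lambda(p, q) (Theorem 4J)."""
    return -1 if lam(p, q) % 2 else 1

# For p = 5 and q = 7: lam(5, 7) = [7/5] + [14/5] = 1 + 2 = 3,
# lam(7, 5) = [5/7] + [10/7] + [15/7] = 0 + 1 + 2 = 3, and 3 + 3 = 2 * 3.
```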

Example. The numbers 8783 and 15671 are prime. We want to determine the number of solutions of
the congruence x^2 ≡ 8783 (mod 15671). We have

(8783/15671)_L = (−1)^{((8783−1)/2)((15671−1)/2)} (15671/8783)_L = −(15671/8783)_L = −(6888/8783)_L
= −(2/8783)_L^3 (3/8783)_L (7/8783)_L (41/8783)_L = −(2/8783)_L (3/8783)_L (7/8783)_L (41/8783)_L.

Next, note that

(2/8783)_L = (−1)^{[8783/2]−[8783/4]},
(3/8783)_L = (−1)^{((3−1)/2)((8783−1)/2)} (8783/3)_L,
(7/8783)_L = (−1)^{((7−1)/2)((8783−1)/2)} (8783/7)_L,
(41/8783)_L = (−1)^{((41−1)/2)((8783−1)/2)} (8783/41)_L.

It follows that

(8783/15671)_L = −(8783/3)_L (8783/7)_L (8783/41)_L = −(2/3)_L (5/7)_L (9/41)_L = −(−1)^{(9−1)/8} (5/7)_L (3/41)_L^2
= (5/7)_L = (−1)^{((5−1)/2)((7−1)/2)} (7/5)_L = (7/5)_L = (2/5)_L = (−1)^{(25−1)/8} = −1.

Hence the congruence has no solutions.


4.4. The Jacobi Symbol

To shorten many calculations involving the Legendre symbol, we introduce the Jacobi symbol which can
be considered in some way to be a generalization of the Legendre symbol. For every n ∈ Z, we write

(n/1)_J = 1.

If m is a positive odd integer with canonical decomposition m = p_1^{u_1} . . . p_r^{u_r}, where p_1, . . . , p_r are distinct
odd primes, then we write

(n/m)_J = Π_{i=1}^{r} (n/p_i)_L^{u_i}.

Remark. We emphasize immediately that the Jacobi symbol is for calculation only. In particular,
note that

(n/m)_J = 1

does not necessarily imply that the congruence x^2 ≡ n (mod m) is soluble. Consider, for example, the
case when n = 2 and m = 15. Here (2/15)_J = (2/3)_L (2/5)_L = (−1)(−1) = 1, but the congruence
x^2 ≡ 2 (mod 3) is not soluble, and so neither is x^2 ≡ 2 (mod 15).

The following observations can be deduced from the properties of the Legendre symbol. We leave
the proof as an exercise for the reader.

THEOREM 4L. Suppose that m and m′ are odd positive integers. Then for every n, n′ ∈ Z, we have
(i) (n/m)_J (n′/m)_J = (nn′/m)_J;
(ii) (n/m)_J (n/m′)_J = (n/mm′)_J;
(iii) (n/m)_J = (n′/m)_J whenever n ≡ n′ (mod m); and
(iv) (a^2 n/m)_J = (n/m)_J whenever (a, m) = 1.

THEOREM 4M. Suppose that m is an odd positive integer. Then

(−1/m)_J = (−1)^{(m−1)/2} and (2/m)_J = (−1)^{(m^2−1)/8}.

Proof. It is convenient to write m = p_1 . . . p_s, where the prime factors are not necessarily distinct.
Then

m = Π_{j=1}^{s} (1 + p_j − 1) = 1 + Σ_{j=1}^{s} (p_j − 1) + Σ_{j=1}^{s} Σ_{k=1, k≠j}^{s} (p_j − 1)(p_k − 1) + . . . ≡ 1 + Σ_{j=1}^{s} (p_j − 1) (mod 4),

and so

(m − 1)/2 ≡ Σ_{j=1}^{s} (p_j − 1)/2 (mod 2).

Thus

(−1/m)_J = Π_{j=1}^{s} (−1/p_j)_L = Π_{j=1}^{s} (−1)^{(p_j−1)/2} = (−1)^{(m−1)/2},

proving the first assertion. Similarly, we can write

m^2 = Π_{j=1}^{s} (1 + p_j^2 − 1) = 1 + Σ_{j=1}^{s} (p_j^2 − 1) + Σ_{j=1}^{s} Σ_{k=1, k≠j}^{s} (p_j^2 − 1)(p_k^2 − 1) + . . . ≡ 1 + Σ_{j=1}^{s} (p_j^2 − 1) (mod 16),

and so

(m^2 − 1)/8 ≡ Σ_{j=1}^{s} (p_j^2 − 1)/8 (mod 2).

Thus

(2/m)_J = Π_{j=1}^{s} (2/p_j)_L = Π_{j=1}^{s} (−1)^{(p_j^2−1)/8} = (−1)^{(m^2−1)/8},

proving the second assertion.

We leave it as an exercise for the reader to prove the following reciprocity result.

THEOREM 4N. Suppose that m and n are odd positive integers and (m, n) = 1. Then

(m/n)_J (n/m)_J = (−1)^{((m−1)/2)((n−1)/2)}.

Example. Let us consider our earlier example again. Recall that we want to determine the number
of solutions of the congruence x^2 ≡ 8783 (mod 15671), where the numbers 8783 and 15671 are prime.
Omitting the details of a few steps from earlier, we have

(8783/15671)_L = −(6888/8783)_L = −(2/8783)_L^3 (861/8783)_L = −(861/8783)_L = −(861/8783)_J
= −(−1)^{((861−1)/2)((8783−1)/2)} (8783/861)_J = −(8783/861)_J = −(173/861)_J
= −(−1)^{((173−1)/2)((861−1)/2)} (861/173)_J = −(861/173)_L = −(−4/173)_L
= −(−1/173)_L (2/173)_L^2 = −(−1/173)_L = −(−1)^{(173−1)/2} = −1.

Alternatively, try to fill in the missing details in the argument below. We have

(8783/15671)_L = −(15671/8783)_L = −(−1895/8783)_L = −(8783/1895)_J
= −(8783/5)_J (8783/379)_J = −(379/33)_J = −(16/33)_J = −1.
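The rules in Theorems 4L, 4M and 4N suggest the usual reduction procedure for evaluating a Jacobi symbol without factorizing anything except powers of 2. The sketch below is one standard way to organize that procedure; the function name jacobi is hypothetical and the code is not taken from the text.

```python
def jacobi(n, m):
    """Jacobi symbol (n/m)_J for an odd positive integer m."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be a positive odd integer")
    n %= m                                   # Theorem 4L(iii)
    result = 1
    while n != 0:
        while n % 2 == 0:                    # remove factors of 2 (Theorem 4M)
            n //= 2
            if m % 8 in (3, 5):
                result = -result
        n, m = m, n                          # reciprocity (Theorem 4N)
        if n % 4 == 3 and m % 4 == 3:
            result = -result
        n %= m
    return result if m == 1 else 0           # 0 when (n, m) > 1

# The example above: 8783 is a quadratic non-residue modulo the prime 15671.
print(jacobi(8783, 15671))  # -1
# The Remark: the symbol is 1 although x^2 ≡ 2 (mod 15) is not soluble.
print(jacobi(2, 15))        # 1
```

The second call illustrates the Remark: a search over x = 0, . . . , 14 finds no square congruent to 2 modulo 15.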

4.5. The Distribution of Quadratic Residues

Suppose that the prime p satisfies p ≡ 1 (mod 8(k!)), where k ∈ N. Then it is not difficult to see that 2
is a quadratic residue modulo p. Furthermore, for any odd prime q ∈ N such that q ≤ k and q ≠ p, we
have

(q/p)_L = (−1)^{((p−1)/2)((q−1)/2)} (p/q)_L = (1/q)_L = 1,

so that q is a quadratic residue modulo p. Suppose now that n ∈ N satisfies n ≤ k. Then all the prime
factors of n do not exceed k. It follows from Theorem 4D that n is a quadratic residue modulo p.

Now let n_p denote the least positive quadratic non-residue modulo p. For the prime p above, we
have n_p > k. It follows that

lim sup_{p→∞} n_p = ∞.

In 1919, Vinogradov conjectured that for any ε > 0, we have n_p ≪_ε p^ε as p → ∞. Here we prove the
following weaker result.

THEOREM 4P. For every odd prime p, we have

(8) n_p ≤ 1/2 + (p + 1/4)^{1/2}.

Proof. Let h = [p/n_p] + 1. Then p < hn_p < p + n_p, so that (hn_p/p)_L = 1. Since (n_p/p)_L = −1, it
follows from Theorem 4D that (h/p)_L = −1. Note now that since 0 < h < p/2 + 1 < p, we must have
1 ≤ h < p, so that h ≥ n_p. We therefore conclude that

n_p ≤ [p/n_p] + 1 ≤ p/n_p + 1.

The inequality (8) follows.
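The bound (8) is equivalent to n_p(n_p − 1) ≤ p, which avoids square roots and is easy to test. The following sketch finds the least non-residue by direct search for an odd prime p; the function name is hypothetical.

```python
def least_nonresidue(p):
    """Least positive quadratic non-residue modulo an odd prime p, by direct search."""
    squares = {(x * x) % p for x in range(1, p)}
    return next(n for n in range(2, p) if n not in squares)

# For p = 23 the residues are 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18, so n_p = 5,
# and indeed 5 * 4 = 20 <= 23.
print(least_nonresidue(23))  # 5
```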

Problems for Chapter 4

1. How many solutions does the congruence x^2 ≡ 3 (mod 71) have?

2. (i) Show that 3 is a quadratic residue for primes of the form 12k ± 1 and a quadratic non-residue
for primes of the form 12k ± 5.
(ii) Deduce that −3 is a quadratic residue for primes of the form 6k + 1 and a quadratic non-residue
for primes of the form 6k − 1.
(iii) By considering x^2 + 3, show that there are infinitely many primes of the form 6k + 1.

3. Suppose that p is an odd prime. Suppose further that the set {1, 2, . . . , p − 1} can be expressed as
the union of two non-empty subsets S and T such that
• S ≠ T;
• the product modulo p of any two elements in the same set lies in S; and
• the product modulo p of any element in S with any element in T lies in T .
Prove that S consists of the quadratic residues modulo p, and that T consists of the quadratic
non-residues modulo p.

4. (i) Prove that 3 is a primitive root of any prime of the form 2^n + 1, where n > 1 is an integer.
(ii) Prove that 2 is a primitive root of any prime of the form 2p + 1, where p is a prime of the form
4n + 1.
(iii) Prove that −2 is a primitive root of any prime of the form 2p + 1, where p is a prime of the
form 4n + 3.
(iv) Prove that 2 is a primitive root of any prime of the form 4p + 1, where p is a prime.
(v) Prove that 3 is a primitive root of any prime of the form 2^n p + 1, where n > 1 is an integer
and the prime p > (3^{2^n} − 1)/2^n.

5. Suppose that q is an odd prime, and that p = 4q + 1 is also a prime.


(i) Prove that the congruence x^2 ≡ −1 (mod p) has exactly two solutions, each of which is a
quadratic non-residue modulo p.
(ii) Prove that every quadratic non-residue modulo p is a primitive root modulo p, with the excep-
tion of the two quadratic non-residues in part (i).
(iii) Find all the primitive roots of 29.

6. Suppose that n ∈ N and p is prime. Prove that if (n/p)_L = −1, then Σ_{d|n} d^{(p−1)/2} ≡ 0 (mod p).

7. Suppose that the prime p ≡ 3 (mod 4), and that a is a quadratic residue modulo p. Show that the
solutions of the congruence x^2 ≡ a (mod p) are given by x ≡ ±a^{(p+1)/4} (mod p).

8. Suppose that the prime p ≡ 5 (mod 8), and that a is a quadratic residue modulo p.
(i) Suppose that a^{(p−1)/4} ≡ 1 (mod p). Show that the solutions of the congruence x^2 ≡ a (mod p)
are given by x ≡ ±a^{(p+3)/8} (mod p).
(ii) Suppose that a^{(p−1)/4} ≡ −1 (mod p). Show that the solutions of the congruence x^2 ≡ a (mod p)
are given by x ≡ ±2^{(p−1)/4} a^{(p+3)/8} (mod p).

9. Prove Theorem 4L.

10. Prove Theorem 4N.

11. Suppose that p is an odd prime. Suppose further that a, b ∈ Z and p ∤ a. Prove that

Σ_{n=1}^{p} ((an + b)/p)_L = 0.

12. Suppose that p is an odd positive prime. Suppose further that k ∈ Z and p ∤ k.
(i) Prove that

Σ_{n=1}^{p−1} (n(n + k)/p)_L = Σ_{n=1}^{p−1} Σ_{n′=1, nn′≡1 (mod p)}^{p−1} (n(n + knn′)/p)_L = Σ_{n′=1}^{p−1} ((1 + kn′)/p)_L.

(ii) Deduce that

Σ_{n=1}^{p−2} (n(n + k)/p)_L = Σ_{n=1}^{p−2} Σ_{n′=1, nn′≡1 (mod p)}^{p−2} (n(n + knn′)/p)_L = Σ_{n′=1}^{p−2} ((1 + kn′)/p)_L.

13. Suppose that p is an odd prime.


(i) Let A(R, R) denote the number of integers n satisfying 1 ≤ n ≤ p − 2 such that both n and
n + 1 are quadratic residues modulo p. Show that

A(R, R) = (1/4) Σ_{n=1}^{p−2} (1 + (n/p)_L)(1 + ((n + 1)/p)_L) = (1/4)(p − 4 − (−1/p)_L).

(ii) Let A(R, N ) denote the number of integers n satisfying 1 ≤ n ≤ p − 2 such that n is a
quadratic residue and n + 1 is a quadratic non-residue modulo p. By first considering the sum
A(R, R) + A(R, N ), find A(R, N ).
(iii) Let A(N, R) denote the number of integers n satisfying 1 ≤ n ≤ p − 2 such that n is a
quadratic non-residue and n + 1 is a quadratic residue modulo p. By first considering the sum
A(R, R) + A(N, R), find A(N, R).
(iv) Hence determine A(N, N ), the number of integers n satisfying 1 ≤ n ≤ p − 2 such that both n
and n + 1 are quadratic non-residues modulo p.

14. Consider the sum

S(k) = Σ_{n=1}^{p} ((n^2 − k)/p)_L,

where p is an odd prime, and k ∈ Z.

(i) Show that if p | k, then S(k) = p − 1.
(ii) Show that if p ∤ k, then

S(k) = Σ_{m=1}^{p} ((m − k)/p)_L (1 + (m/p)_L) = −1.

15. Consider the sum

T(a, b, c) = Σ_{n=1}^{p} ((an^2 + bn + c)/p)_L,

where p is an odd prime, and a, b, c ∈ Z satisfy p ∤ a. By considering the sum

(4a/p)_L T(a, b, c),

show that

T(a, b, c) = −(a/p)_L if p ∤ (b^2 − 4ac),
T(a, b, c) = (p − 1)(a/p)_L if p | (b^2 − 4ac).

16. Suppose that f(x) is a polynomial with integer coefficients. Suppose further that a, b ∈ Z and p is
a prime, and R denotes a complete set of residues modulo p.
(i) Suppose that (a, p) = 1. Prove that

Σ_{x∈R} (f(ax + b)/p)_L = Σ_{x∈R} (f(x)/p)_L.

(ii) Prove that

Σ_{x∈R} (af(x)/p)_L = (a/p)_L Σ_{x∈R} (f(x)/p)_L.

(iii) Suppose that (a, p) = 1. Prove that

Σ_{x∈R} ((ax + b)/p)_L = 0.

(iv) Suppose that (a, p) = (b, p) = 1. Prove that

Σ_{x=1}^{p−1} (x(ax + b)/p)_L = −(a/p)_L.

[Hint: Begin by multiplying the sum by (4a/p)_L.]

17. Suppose that the primes p ≡ 1 (mod 4) and q ≡ 3 (mod 4). Prove each of the following:

(i) Σ_{n=1}^{p−1} n (n/p)_L = 0.

(ii) Σ_{n=1, (n/p)_L=1}^{p−1} n = p(p − 1)/4.

(iii) Σ_{n=1}^{q−1} n^2 (n/q)_L = q Σ_{n=1}^{q−1} n (n/q)_L.

(iv) Σ_{n=1}^{p−1} n^3 (n/p)_L = (3p/2) Σ_{n=1}^{p−1} n^2 (n/p)_L.

(v) Σ_{n=1}^{q−1} n^4 (n/q)_L = 2q Σ_{n=1}^{q−1} n^3 (n/q)_L − q^2 Σ_{n=1}^{q−1} n^2 (n/q)_L.

Chapter 5
SUMS OF INTEGER SQUARES

5.1. Sums of Two Squares

In this section, we shall characterize all natural numbers which are representable as the sum of two
integer squares. In other words, we shall determine all numbers n ∈ N such that the equation

n = x_1^2 + x_2^2

is soluble in x_1, x_2 ∈ Z.

The first step in our argument is provided by the following result on the special case when n is a
prime congruent to 1 modulo 4.

THEOREM 5A. (FERMAT) Suppose that p is prime and p ≡ 1 (mod 4). Then p is representable
as the sum of two integer squares; in other words, there exist x_1, x_2 ∈ Z such that p = x_1^2 + x_2^2.

We shall first give the original proof by Fermat using his method of descent. In the next section, we
shall give an alternative proof by Thue which contains ideas that we can develop further to study the
number of representations of a natural number as a sum of two integer squares.

Proof of Theorem 5A. Since p ≡ 1 (mod 4), it follows from Theorem 4C that (−1/p)_L = 1, and so
−1 is a quadratic residue modulo p. The numbers

−(p − 1)/2, . . . , −1, 0, 1, . . . , (p − 1)/2

form a complete set of residues modulo p. It follows that one of the elements, x_0 say, satisfies x_0^2 + 1 ≡ 0
(mod p). Since |x_0| < p/2, we must have

p ≤ x_0^2 + 1 < (p/2)^2 + 1 < p^2.

In particular, there exists m ∈ N satisfying 1 ≤ m < p such that mp can be expressed as a sum of two
integer squares. It now suffices to show that the least positive multiple of p which can be expressed as
a sum of two integer squares must be p itself.

We shall prove this by showing that if mp, where 1 < m < p, is a sum of two integer squares, then
there exists m_0 ∈ N satisfying 1 ≤ m_0 < m such that m_0 p is also a sum of two integer squares. Suppose
now that 1 < m < p and

(1) mp = x_1^2 + x_2^2,

where x_1, x_2 ∈ Z. We define y_1, y_2 ∈ Z by writing

(2) −m/2 < y_1 < m/2 and y_1 ≡ x_1 (mod m)

and

(3) −m/2 < y_2 < m/2 and y_2 ≡ x_2 (mod m).

In view of (1), we have y_1^2 + y_2^2 ≡ x_1^2 + x_2^2 ≡ 0 (mod m), so there exists m_0 ∈ Z such that

(4) y_1^2 + y_2^2 = mm_0.

Combining (2)–(4), we have

mm_0 ≤ (m/2)^2 + (m/2)^2 = m^2/2,

so that m_0 < m. On the other hand, we must have m_0 ≠ 0, for otherwise y_1 = y_2 = 0, so that
x_1 ≡ x_2 ≡ 0 (mod m), and so m^2 | (x_1^2 + x_2^2), whence m | p, contradicting that 1 < m < p. We therefore
must have 1 ≤ m_0 < m. Combining (1) and (4), we now have

(5) m_0 p m^2 = (x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1 y_1 + x_2 y_2)^2 + (x_1 y_2 − x_2 y_1)^2.

By (1)–(3), we have

x_1 y_1 + x_2 y_2 ≡ x_1^2 + x_2^2 ≡ 0 (mod m)

and

x_1 y_2 − x_2 y_1 ≡ x_1 x_2 − x_2 x_1 ≡ 0 (mod m).

It follows that each term on the right hand side of (5) is divisible by m^2, so that

m_0 p = ((x_1 y_1 + x_2 y_2)/m)^2 + ((x_1 y_2 − x_2 y_1)/m)^2,

and the proof is complete.
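The descent in the proof above is constructive, and can be sketched as a short program. The code assumes that p is a prime with p ≡ 1 (mod 4); the function names are hypothetical, and the initial solution of x^2 + 1 ≡ 0 (mod p) is found by plain search rather than by any method from the text. Residues are taken in the range −m/2 < y ≤ m/2, which still gives mm_0 ≤ m^2/2.

```python
def centered_residue(x, m):
    """The residue y of x modulo m with -m/2 < y <= m/2."""
    y = x % m
    return y - m if y > m // 2 else y

def two_squares_by_descent(p):
    """Write a prime p ≡ 1 (mod 4) as a^2 + b^2 by Fermat's descent."""
    # Starting point: x_0 with x_0^2 + 1 ≡ 0 (mod p); the first hit satisfies x_0 < p/2.
    x1 = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    x2 = 1
    m = (x1 * x1 + x2 * x2) // p             # mp = x1^2 + x2^2 with 1 <= m < p
    while m > 1:
        y1, y2 = centered_residue(x1, m), centered_residue(x2, m)
        # Identity (5): both terms are divisible by m, so the divisions are exact.
        x1, x2 = (x1 * y1 + x2 * y2) // m, (x1 * y2 - x2 * y1) // m
        m = (x1 * x1 + x2 * x2) // p         # the new multiplier m_0 < m
    return abs(x1), abs(x2)

# For p = 13: start from 5^2 + 1^2 = 2 * 13, and one descent step gives 3^2 + 2^2 = 13.
print(two_squares_by_descent(13))  # (3, 2)
```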

We now determine all the natural numbers which are sums of two integer squares.

THEOREM 5B. Suppose that n ∈ N and n > 1, and the canonical decomposition of n is given by

n = 2^r p_1^{r_1} . . . p_k^{r_k} q_1^{s_1} . . . q_ℓ^{s_ℓ},

where the integer r ≥ 0 and r_1, . . . , r_k, s_1, . . . , s_ℓ ∈ N, and p_1, . . . , p_k, q_1, . . . , q_ℓ ∈ N are primes satisfying
p_1 ≡ . . . ≡ p_k ≡ 1 (mod 4) and q_1 ≡ . . . ≡ q_ℓ ≡ 3 (mod 4). Then n is a sum of two integer squares if
and only if s_1, . . . , s_ℓ are all even.

Proof. Suppose first of all that n = x_1^2 + x_2^2, where x_1, x_2 ∈ Z. Then

(6) x_1^2 + x_2^2 ≡ 0 (mod q_1).

Suppose on the contrary that s_1 is odd. If q_1 ∤ x_2, then it follows from Theorem 3H that there exists
x ∈ Z such that x_2 x ≡ 1 (mod q_1). Multiplying (6) by x^2 gives (x_1 x)^2 ≡ −1 (mod q_1), impossible since
−1 is a quadratic non-residue modulo q_1. It follows that q_1 | x_2, and so q_1 | x_1 also. Writing x_1 = q_1 y_1
and x_2 = q_1 y_2, we have n = q_1^2 (y_1^2 + y_2^2). Hence s_1 ≥ 3. Repeating the argument on n/q_1^2 yields s_1 − 2 ≥ 3.
Repeating the argument a sufficient number of times leads eventually to a contradiction. It follows that
s_1 must be even. A similar argument shows that s_2, . . . , s_ℓ are all even.

The converse follows from the identity

(7) (x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1 y_1 + x_2 y_2)^2 + (x_1 y_2 − x_2 y_1)^2

on noting that we can apply Theorem 5A to each of the primes p_1, . . . , p_k, that 2 is a sum of two integer
squares, and that q_j^2 = q_j^2 + 0^2 is a sum of two integer squares for every j = 1, . . . , ℓ.
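The criterion of Theorem 5B can be compared with a direct search for representations. This is a sketch only; both function names are hypothetical and the factorization is by trial division, which is adequate for small n.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Direct search: is n = x^2 + y^2 for some integers x, y >= 0?"""
    return any(isqrt(n - x * x) ** 2 == n - x * x for x in range(isqrt(n) + 1))

def satisfies_criterion(n):
    """True if every prime q ≡ 3 (mod 4) divides n to an even power."""
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if d % 4 == 3 and exponent % 2 == 1:
            return False
        d += 1
    # What remains is 1 or a single prime factor occurring to the first power.
    return n % 4 != 3

# 45 = 3^2 * 5 = 6^2 + 3^2 is a sum of two squares, while 21 = 3 * 7 is not.
```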

5.2. Number of Representations

A natural question that arises concerns the number of ways any given n ∈ N can be represented as a
sum of two integer squares. Our starting point is the following alternative proof of Fermat’s theorem by
Thue.

Second Proof of Theorem 5A. Let x = ((p − 1)/2)!. Since 4 | (p − 1), it follows that (p − 1)/2 is an
even integer, and so

x^2 = (−1)^{(p−1)/2} Π_{r=1}^{(p−1)/2} r^2 = Π_{r=1}^{(p−1)/2} r(−r) ≡ Π_{r=1}^{(p−1)/2} r(p − r) = (p − 1)! ≡ −1 (mod p)

by Wilson’s theorem. Hence the congruence

(8) x^2 + 1 ≡ 0 (mod p)

is soluble. We shall now show that if x ∈ Z is a solution of (8), then there exist a, b ∈ Z such that

(9) |a| < p^{1/2}, |b| < p^{1/2}, ab ≠ 0 and ax ≡ b (mod p).

If (9) holds, then 0 < a^2 + b^2 < 2p and

a^2 + b^2 ≡ a^2 + (ax)^2 = a^2 (1 + x^2) ≡ 0 (mod p),

so that a^2 + b^2 = p. To prove (9), consider the numbers of the form ux − v, where u, v ∈ Z satisfy
0 ≤ u, v ≤ p^{1/2}. There are ([p^{1/2}] + 1)^2 > p choices of such numbers u and v, and only p residue classes
modulo p. It follows from Dirichlet’s box principle that there exist two distinct such pairs u′, v′ and u″, v″ such
that u′x − v′ and u″x − v″ belong to the same residue class modulo p and so are congruent to each other
modulo p. Now let a = u′ − u″ and b = v′ − v″. Then ax − b = (u′ − u″)x − (v′ − v″) ≡ 0 (mod p).
Clearly we have |a| < p^{1/2} and |b| < p^{1/2}. Finally, if b = 0, then we must have a ≡ 0 (mod p), and so
a = 0, a contradiction since the two pairs are distinct. Hence b ≠ 0. Similarly a ≠ 0.

Our first step towards finding a formula for the number of representations of a natural number as
a sum of two integer squares is the following generalization of the above proof of Fermat’s theorem.

THEOREM 5C. Suppose that n ∈ N and n > 1. For every solution x ∈ Z of the congruence
x^2 + 1 ≡ 0 (mod n), there exist unique positive integers a, b ∈ N such that

(a, b) = 1, a^2 + b^2 = n and ax ≡ b (mod n).

Proof. By considering numbers of the form ux − v, where u, v ∈ Z satisfy 0 ≤ u, v ≤ n^{1/2}, we can
show as before that there exist non-zero numbers α, β ∈ Z such that

α^2 + β^2 = n and αx ≡ β (mod n).

Clearly, we may assume without loss of generality that α > 0.

If β > 0, then we let a = α and b = β. Clearly a^2 + b^2 = n and ax ≡ b (mod n).

If β < 0, then we let a = −β and b = α. Again we have a^2 + b^2 = n. On the other hand, we have
bx ≡ −a (mod n), so that bx^2 ≡ −ax (mod n). It now follows from the assumption x^2 ≡ −1 (mod n)
that ax ≡ b (mod n).

To show that (a, b) = 1, note that there exist k, ℓ ∈ Z such that

x^2 + 1 = kn and b = ax + ℓn.

It follows that

n = a^2 + b^2 = a^2 + (ax + ℓn)^2 = a^2 (1 + x^2) + axℓn + (ax + ℓn)ℓn
= a^2 kn + axℓn + bℓn = (a(ak + xℓ) + bℓ)n,

and so a(ak + xℓ) + bℓ = 1, whence (a, b) = 1.

Finally, to show uniqueness, suppose that the conclusion holds also for the pair A, B ∈ N. Then

n^2 = (a^2 + b^2)(A^2 + B^2) = (aA + bB)^2 + (aB − bA)^2.

It follows that 0 < aA + bB ≤ n. On the other hand, note that

aA + bB ≡ aA + aAx^2 = aA(1 + x^2) ≡ 0 (mod n).

We therefore must have aA + bB = n, and so aB − bA = 0. Since (a, b) = (A, B) = 1, we must therefore
have a = A and b = B.

THEOREM 5D. Suppose that n ∈ N, and T(n) is equal to the number of solutions of the congruence
x^2 + 1 ≡ 0 (mod n). Then the number of solutions of the equation n = a^2 + b^2 with (a, b) = 1 is equal
to 4T(n).

Proof. Suppose first of all that n = 1. Clearly T(1) = 1 and the equation 1 = a^2 + b^2 has four
solutions, namely (a, b) = (±1, 0) and (a, b) = (0, ±1).

Suppose now that n > 1. We have already shown that for every solution x ∈ Z of the congruence
x^2 + 1 ≡ 0 (mod n), there exist unique positive integers a, b ∈ N such that

(a, b) = 1, a^2 + b^2 = n and ax ≡ b (mod n).

Conversely, suppose that a, b ∈ N satisfy (a, b) = 1 and n = a^2 + b^2. It is easy to see that (a, n) = 1, and
so the congruence ax ≡ b (mod n) has a unique solution.

The above establishes a one-to-one correspondence between the solutions of the congruence x^2 + 1 ≡ 0
(mod n) and numbers a, b ∈ N such that (a, b) = 1 and n = a^2 + b^2. The factor 4 occurs if we permit
negative values for a and b.
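Theorem 5D can be checked numerically by counting both sides directly. The sketch below uses exhaustive search and is only suitable for small n; the function names are hypothetical, and solutions of the congruence are counted among the residues 0, 1, . . . , n − 1.

```python
from math import gcd, isqrt

def T(n):
    """Number of residues x modulo n with x^2 + 1 ≡ 0 (mod n)."""
    return sum(1 for x in range(n) if (x * x + 1) % n == 0)

def primitive_representations(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n and (a, b) = 1."""
    r = isqrt(n)
    return sum(1 for a in range(-r, r + 1) for b in range(-r, r + 1)
               if a * a + b * b == n and gcd(a, b) == 1)

# For n = 25: x = 7 and x = 18 solve x^2 + 1 ≡ 0 (mod 25), so T(25) = 2, and the
# coprime representations are (±3, ±4) and (±4, ±3), eight in all; (±5, 0) is excluded.
print(T(25), primitive_representations(25))  # 2 8
```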

THEOREM 5E. Suppose that n ∈ N, and T (n) is equal to the number of solutions of the congruence
x2 + 1 ≡ 0 (mod n). Then T (n) = 0 if 4 | n or if n is divisible by a prime q ≡ 3 (mod 4). Otherwise we
have T (n) = 2k , where k is the number of distinct odd prime factors of n.

Proof. Clearly the result is valid if n = 1, so we shall assume that n > 1. Using the Chinese
remainder theorem, one can show that T (n) is a multiplicative function. It follows that if the canonical
decomposition of n is given by

n = 2r pr11 . . . prkk q1s1 . . . qs ,

where the integer r ≥ 0 and r1 , . . . , rk , s1 , . . . , s ∈ N, and p1 , . . . , pk , q1 , . . . , q ∈ N are primes satisfying


p1 ≡ . . . ≡ pk ≡ 1 (mod 4) and q1 ≡ . . . ≡ q ≡ 3 (mod 4), then

T (n) = T (2r )T (pr11 ) . . . T (prkk )T (q1s1 ) . . . T (qs ).

It is easy to check that T(2) = 1. On the other hand, the congruence x^2 ≡ −1 (mod 4) has no
solutions, and so the congruence x^2 ≡ −1 (mod 2^r) has no solutions for any r ≥ 2. Hence T(2^r) = 0 for
every r ≥ 2, and so T(n) = 0 if 4 | n.

Suppose next that q ∈ N is a prime satisfying q ≡ 3 (mod 4). Since −1 is a quadratic non-residue
modulo q, it follows that the congruence x^2 ≡ −1 (mod q) has no solutions, and so the congruence
x^2 ≡ −1 (mod q^s) has no solutions for any s ≥ 1. Hence T(q^s) = 0 for every s ≥ 1, and so T(n) = 0 if
q | n.

To complete the proof, it suffices to show that for every prime p satisfying p ≡ 1 (mod 4), we have
T(p^r) = 2 for every r ≥ 1. Suppose that r ∈ N is fixed. Then any solution of the congruence x^2 ≡ −1
(mod p^r) can be assumed to be an element in the set

R = {x ∈ N : 0 < x < p^r and p ∤ x}.

Now any x ∈ R must satisfy the congruence x^2 ≡ m (mod p^r) for some number m ∈ N satisfying
0 < m < p^r and (m/p)_L = 1. There are (1/2)(p − 1) numbers m ∈ N satisfying 0 < m < p and (m/p)_L = 1,
and so there are (1/2)(p − 1)p^{r−1} = (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r and (m/p)_L = 1. Suppose
now that x^2 ≡ y^2 (mod p^r) and p ∤ x. Then p^r | (x + y)(x − y), so that p | (x + y) or p | (x − y);
but not both, for otherwise p must divide their sum 2x, a contradiction. It follows that p^r | (x + y) or
p^r | (x − y), and so x ≡ ±y (mod p^r). Hence for each of the (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r
and (m/p)_L = 1, there are at most two numbers x ∈ R such that x^2 ≡ m (mod p^r). Since R contains
precisely φ(p^r) elements, it follows that for each of the (1/2)φ(p^r) numbers m ∈ N satisfying 0 < m < p^r
and (m/p)_L = 1, there are precisely two numbers x ∈ R such that x^2 ≡ m (mod p^r). Note now that
−1 is a quadratic residue modulo p. It follows that there are precisely two numbers x ∈ R such that
x^2 ≡ −1 (mod p^r), and so T(p^r) = 2.
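
Theorem 5E can be checked numerically for small n. The following is a minimal sketch, not part of the original notes: the function names are hypothetical, and the factorisation is done by plain trial division, which is only sensible for small inputs.

```python
def T_bruteforce(n):
    """Count residues x in {0, ..., n-1} with x^2 + 1 ≡ 0 (mod n)."""
    return sum(1 for x in range(n) if (x * x + 1) % n == 0)


def T_formula(n):
    """Evaluate T(n) as described in Theorem 5E (trial division, small n only)."""
    if n % 4 == 0:
        return 0
    m = n
    while m % 2 == 0:          # strip the (at most one) factor 2
        m //= 2
    k = 0                      # number of distinct odd prime factors found
    p = 3
    while p * p <= m:
        if m % p == 0:
            if p % 4 == 3:     # a prime q ≡ 3 (mod 4) forces T(n) = 0
                return 0
            k += 1
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:                  # leftover prime factor
        if m % 4 == 3:
            return 0
        k += 1
    return 2 ** k


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 25, 65, 12, 21):
        print(n, T_bruteforce(n), T_formula(n))
```

The two functions should agree for every n; the brute-force count is the definition of T(n), while the second function only encodes the statement of the theorem.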

THEOREM 5F. Suppose that n ∈ N, and S(n) is equal to the number of solutions of the equation
n = a^2 + b^2 in numbers a, b ∈ Z. Then

\[ S(n) = 4 \sum_{d^2 \mid n} T\!\left(\frac{n}{d^2}\right), \]

where for every n ∈ N, the number T(n) is equal to the number of solutions of the congruence x^2 + 1 ≡ 0
(mod n).

Proof. Suppose that a, b ∈ Z satisfy n = a^2 + b^2. Write d = (a, b). Then d^2 | n. If we write a_1 = a/d
and b_1 = b/d, then the pair a_1, b_1 satisfies

\[ \frac{n}{d^2} = a_1^2 + b_1^2 \quad\text{and}\quad (a_1, b_1) = 1. \tag{10} \]

On the other hand, suppose that d^2 | n, and that a_1, b_1 ∈ Z satisfy (10). If we write a = da_1 and b = db_1,
then (a, b) = d and n = a^2 + b^2. We can therefore identify any pair a, b ∈ Z satisfying n = a^2 + b^2 with
the pair a_1, b_1 ∈ Z satisfying (10), where d = (a, b). The result now follows on noting Theorem 5D.

THEOREM 5G. Suppose that n ∈ N, and S(n) is equal to the number of solutions of the equation
n = a^2 + b^2 in numbers a, b ∈ Z. Then

\[ S(n) = 4 \sum_{m \mid n} \chi(m), \]

where χ : N → R is the non-principal character modulo 4, defined for every m ∈ N by

\[ \chi(m) = \begin{cases} 0 & \text{if } m \equiv 0 \pmod 2, \\ 1 & \text{if } m \equiv 1 \pmod 4, \\ -1 & \text{if } m \equiv 3 \pmod 4. \end{cases} \]

Proof. For every n ∈ N, write

\[ \sum_{m \mid n} \chi(m) = W(n). \]

It is easy to show that the function χ(n) is multiplicative, so it follows from Theorem 2A that the
function W(n) is also multiplicative. On the other hand, recall that the function T(n) is multiplicative.
It follows from Theorem 5F, in a way similar to the proof of Theorem 2A, that if (n_1, n_2) = 1, then

\[ \frac{S(n_1 n_2)}{4} = \frac{S(n_1)}{4} \cdot \frac{S(n_2)}{4}. \]

To complete the proof, it therefore suffices to show that for any prime p and any r ∈ N, we have

\[ \frac{S(p^r)}{4} = W(p^r), \]

since the result is obvious for n = 1.

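Consider first of all

\[ \frac{S(p^r)}{4} = \sum_{d^2 \mid p^r} T\!\left(\frac{p^r}{d^2}\right). \]

If r is even, then

\[ \frac{S(p^r)}{4} = T(p^r) + T(p^{r-2}) + \dots + T(p^2) + T(1) =
\begin{cases} 1 & \text{if } p = 2, \\ 1 & \text{if } p \equiv 3 \pmod 4, \\ r + 1 & \text{if } p \equiv 1 \pmod 4. \end{cases} \]

If r is odd, then

\[ \frac{S(p^r)}{4} = T(p^r) + T(p^{r-2}) + \dots + T(p) =
\begin{cases} 1 & \text{if } p = 2, \\ 0 & \text{if } p \equiv 3 \pmod 4, \\ r + 1 & \text{if } p \equiv 1 \pmod 4. \end{cases} \]

Hence

\[ \frac{S(p^r)}{4} =
\begin{cases} 1 & \text{if } p = 2, \\ 1 & \text{if } p \equiv 3 \pmod 4 \text{ and } r \text{ is even}, \\ 0 & \text{if } p \equiv 3 \pmod 4 \text{ and } r \text{ is odd}, \\ r + 1 & \text{if } p \equiv 1 \pmod 4. \end{cases} \]

Consider next

\[ W(p^r) = \chi(p^r) + \dots + \chi(p) + 1. \]

It is easy to show that χ(p^u) = (χ(p))^u for every u ∈ N. Hence

\[ W(p^r) = (\chi(p))^r + \dots + \chi(p) + 1 =
\begin{cases} 1 & \text{if } p = 2, \\ 1 & \text{if } p \equiv 3 \pmod 4 \text{ and } r \text{ is even}, \\ 0 & \text{if } p \equiv 3 \pmod 4 \text{ and } r \text{ is odd}, \\ r + 1 & \text{if } p \equiv 1 \pmod 4. \end{cases} \]

This completes the proof.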
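
The formula of Theorem 5G is easy to test against a direct count of the integer solutions of n = a^2 + b^2. The sketch below is only an illustration; the function names are hypothetical and the divisor sum is computed naively.

```python
from math import isqrt


def S_bruteforce(n):
    """Count pairs (a, b) of integers with a^2 + b^2 = n."""
    count = 0
    bound = isqrt(n)
    for a in range(-bound, bound + 1):
        b_squared = n - a * a
        b = isqrt(b_squared)
        if b * b == b_squared:
            # b = 0 gives one solution, otherwise both b and -b work
            count += 1 if b == 0 else 2
    return count


def chi(m):
    """The non-principal character modulo 4."""
    if m % 2 == 0:
        return 0
    return 1 if m % 4 == 1 else -1


def S_formula(n):
    """Evaluate 4 * sum of chi(m) over the divisors m of n."""
    return 4 * sum(chi(m) for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 25, 65):
        print(n, S_bruteforce(n), S_formula(n))
```

For example, n = 25 has divisors 1, 5 and 25, each congruent to 1 modulo 4, so the formula gives 12, matching the solutions (±5, 0), (0, ±5), (±3, ±4) and (±4, ±3).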

5.3. Sums of Four Squares

We now study the problem of representing natural numbers as sums of four integer squares, and show
that this is always possible.

THEOREM 5H. (LAGRANGE) Every n ∈ N is representable as the sum of four integer squares;
in other words, for every n ∈ N, there exist x_1, x_2, x_3, x_4 ∈ Z such that n = x_1^2 + x_2^2 + x_3^2 + x_4^2.

Proof. In view of the identity

\[
\begin{aligned}
(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2)
&= (x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4)^2 + (x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3)^2 \\
&\quad + (x_1 y_3 - x_3 y_1 - x_2 y_4 + x_4 y_2)^2 + (x_1 y_4 - x_4 y_1 - x_3 y_2 + x_2 y_3)^2,
\end{aligned}
\tag{11}
\]

it suffices to show that every prime can be expressed as a sum of four integer squares. Clearly we have
2 = 1^2 + 1^2 + 0^2 + 0^2. Also, it follows from Theorem 5A that every prime p ≡ 1 (mod 4) is a sum of four
integer squares. It therefore remains to prove that every prime q ≡ 3 (mod 4) is a sum of four integer
squares. Naturally the number 1 is a quadratic residue modulo q. Let a ∈ N be the smallest number in
the range 1 ≤ a ≤ q − 2 such that a + 1 is a quadratic non-residue modulo q, so that

\[ \left(\frac{a+1}{q}\right)_L = -1 \quad\text{and}\quad \left(\frac{a}{q}\right)_L = 1. \]

Since q ≡ 3 (mod 4), it follows from Theorem 4C that (−1/q)_L = −1, and so

\[ \left(\frac{-a-1}{q}\right)_L = \left(\frac{-1}{q}\right)_L \left(\frac{a+1}{q}\right)_L = 1. \]

In other words, there exist integers x_1 and x_2 in the complete set

\[ -\frac{q-1}{2}, \dots, -1, 0, 1, \dots, \frac{q-1}{2} \]

of residues modulo q such that x_1^2 ≡ a (mod q) and x_2^2 ≡ −a − 1 (mod q). Hence

\[ x_1^2 + x_2^2 + 1^2 + 0^2 \equiv 0 \pmod q \]

and

\[ q \le x_1^2 + x_2^2 + 1 < 2\left(\frac{q}{2}\right)^2 + 1 < q^2. \]

In particular, there exists m ∈ N satisfying 1 ≤ m < q such that mq can be expressed as a sum of four
integer squares. It now suffices to show that the least positive multiple of q which can be expressed as
a sum of four integer squares must be q itself.

We shall prove this by showing that if mq, where 1 < m < q, is a sum of four integer squares, then
there exists m_0 ∈ N satisfying 1 ≤ m_0 < m such that m_0 q is also a sum of four integer squares. Suppose
now that 1 < m < q and

\[ mq = x_1^2 + x_2^2 + x_3^2 + x_4^2, \tag{12} \]

where x_1, x_2, x_3, x_4 ∈ Z. If m is even, then the right hand side of (12) must be even. It follows that an
even number of the four terms x_1, x_2, x_3, x_4 must be even, so we may assume without loss of generality
that x_1 ≡ x_2 (mod 2) and x_3 ≡ x_4 (mod 2). In particular, we have

\[ \frac{m}{2}\, q = \left(\frac{x_1 + x_2}{2}\right)^2 + \left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_3 + x_4}{2}\right)^2 + \left(\frac{x_3 - x_4}{2}\right)^2. \]

We can therefore assume that m is odd. For i = 1, 2, 3, 4, let y_i ∈ Z satisfy

\[ -\frac{m}{2} < y_i < \frac{m}{2} \quad\text{and}\quad y_i \equiv x_i \pmod m. \tag{13} \]

In view of (12) and (13), we have

\[ y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod m, \]

so there exists m_0 ∈ Z such that

\[ y_1^2 + y_2^2 + y_3^2 + y_4^2 = m m_0. \tag{14} \]

Combining (13) and (14), we have

\[ m m_0 < 4\left(\frac{m}{2}\right)^2 = m^2, \]

so that m_0 < m. On the other hand, we must have m_0 ≠ 0, for otherwise y_i = 0 for every i = 1, ..., 4,
so that x_i ≡ 0 (mod m), and so m^2 | (x_1^2 + x_2^2 + x_3^2 + x_4^2), whence m | q, contradicting that 1 < m < q.
We therefore must have 1 ≤ m_0 < m. Combining (11), (12) and (14), we now have

\[
\begin{aligned}
m_0 q m^2 &= (x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4)^2 + (x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3)^2 \\
&\quad + (x_1 y_3 - x_3 y_1 - x_2 y_4 + x_4 y_2)^2 + (x_1 y_4 - x_4 y_1 - x_3 y_2 + x_2 y_3)^2.
\end{aligned}
\tag{15}
\]

By (12) and (13), we have

\[ x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4 \equiv x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 0 \pmod m. \]

Also, replacing each y_i by x_i modulo m shows that each of the terms (x_1 y_2 − x_2 y_1 + x_3 y_4 − x_4 y_3),
(x_1 y_3 − x_3 y_1 − x_2 y_4 + x_4 y_2) and (x_1 y_4 − x_4 y_1 − x_3 y_2 + x_2 y_3) is congruent to 0 modulo m. It follows
that each term on the right hand side of (15) is divisible by m^2, so that m_0 q is a sum of four integer squares.
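
The descent in this proof is constructive, and can be turned into a small program. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the input q is assumed to be an odd prime, and the starting solution of x_1^2 + x_2^2 + 1 ≡ 0 (mod q) is found by brute-force search rather than by the least non-residue argument used above.

```python
def euler_product(x, y):
    """Return the four bracketed terms on the right hand side of identity (11)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
            x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
            x1 * y3 - x3 * y1 - x2 * y4 + x4 * y2,
            x1 * y4 - x4 * y1 - x3 * y2 + x2 * y3)


def initial_solution(q):
    """Brute-force x1, x2 in [0, (q-1)/2] with x1^2 + x2^2 + 1 ≡ 0 (mod q)."""
    half = (q - 1) // 2
    root_of = {(x * x) % q: x for x in range(half + 1)}
    for x1 in range(half + 1):
        target = (-1 - x1 * x1) % q
        if target in root_of:
            return [x1, root_of[target], 1, 0]
    raise ValueError("no solution found; q is assumed to be an odd prime")


def four_squares_prime(q):
    """Write an odd prime q as a sum of four squares by the descent in the proof."""
    x = initial_solution(q)
    m = sum(v * v for v in x) // q
    while m > 1:
        if m % 2 == 0:
            # Pair up entries of equal parity, then halve as in the proof.
            x.sort(key=lambda v: v % 2)
            x = [(x[0] + x[1]) // 2, (x[0] - x[1]) // 2,
                 (x[2] + x[3]) // 2, (x[2] - x[3]) // 2]
        else:
            # y_i ≡ x_i (mod m) with -m/2 < y_i < m/2, as in (13).
            y = [((v + m // 2) % m) - m // 2 for v in x]
            x = [v // m for v in euler_product(x, y)]
        m = sum(v * v for v in x) // q
    return tuple(x)


if __name__ == "__main__":
    for q in (3, 7, 23, 103):
        print(q, four_squares_prime(q))
```

Each pass through the loop either halves m or replaces it by the smaller m_0 of (14), so the loop ends with m = 1.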

5.4. Sums of Three Squares

The situation is very different in the case of three integer squares. The main reason is that there is no
analogue of (7) and (11) in this case. Here, we shall only concern ourselves with the following simple
theorem.

THEOREM 5J. No integer of the form 4^k(8m + 7), where k, m ∈ N ∪ {0}, can be represented as a
sum of three squares.

Proof. Note first of all that every integer square is congruent to 0, 1 or 4 modulo 8, so that 8m + 7 is
never a sum of three squares for any m ∈ Z. Hence the conclusion of Theorem 5J is true for k = 0.

We now proceed by induction on k. Suppose that 4^s(8m + 7) is never a sum of three squares for
any m ∈ Z. We shall show that 4^{s+1}(8m + 7) is never a sum of three squares for any m ∈ Z. Suppose
on the contrary that

\[ 4^{s+1}(8m + 7) = x_1^2 + x_2^2 + x_3^2, \]

where x_1, x_2, x_3 ∈ Z. Since x^2 ≡ 0 (mod 4) if x is even and x^2 ≡ 1 (mod 4) if x is odd, it follows that
each of x_1, x_2, x_3 must be even, so that

\[ 4^s(8m + 7) = \left(\frac{x_1}{2}\right)^2 + \left(\frac{x_2}{2}\right)^2 + \left(\frac{x_3}{2}\right)^2, \]

a contradiction.

It was proved by Legendre in 1798 that all other natural numbers are representable as sums of three
integer squares.
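
Both Theorem 5J and the converse attributed to Legendre can be observed on small numbers. The following is a brute-force sketch with hypothetical function names; it tests each n directly and compares the outcome with the form 4^k(8m + 7).

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute-force test whether n = a^2 + b^2 + c^2 with integers a, b, c."""
    for a in range(isqrt(n) + 1):
        # Without loss of generality a <= b; c is recovered from the remainder.
        for b in range(a, isqrt(n - a * a) + 1):
            c_squared = n - a * a - b * b
            c = isqrt(c_squared)
            if c * c == c_squared:
                return True
    return False


def has_excluded_form(n):
    """Test whether n = 4^k (8m + 7) for some integers k, m >= 0."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


if __name__ == "__main__":
    print([n for n in range(1, 100) if not is_sum_of_three_squares(n)])
```

The printed list begins 7, 15, 23, 28, 31, and every entry has the excluded form.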

Problems for Chapter 5

1. Suppose that the natural number n has canonical decomposition n = 2^r p_1^{r_1} ... p_k^{r_k} q_1^{s_1} ... q_ℓ^{s_ℓ}, where
the integer r ≥ 0 and r_1, ..., r_k, s_1, ..., s_ℓ ∈ N, and p_1, ..., p_k, q_1, ..., q_ℓ are primes satisfying
p_1 ≡ ... ≡ p_k ≡ 1 (mod 4) and q_1 ≡ ... ≡ q_ℓ ≡ 3 (mod 4). Suppose further that m = p_1^{r_1} ... p_k^{r_k}.
(i) Show that the function S(n) defined in Theorem 5F satisfies

\[ S(n) = \begin{cases} 0 & \text{if at least one of } s_1, \dots, s_\ell \text{ is odd}, \\ 4d(m) & \text{otherwise}. \end{cases} \]

(ii) Suppose that s_1, ..., s_ℓ are all even. Show that the number of solutions of the equation
n = x^2 + y^2 in x, y ∈ Z satisfying x ≥ y ≥ 0,
is equal to the number of solutions of the equation
m = xy in x, y ∈ Z satisfying x ≥ y > 0.
[Hint: Show that both are equal to [(1/2)d(m) + 1/2].]

2. Suppose that n ∈ N. Show that if the equation n = x^2 + y^2 has more than one solution in x, y ∈ N
with x even, then n is composite.

3. Show that for every real number X > 0, the function S(n) defined in Theorem 5F satisfies the
inequalities

\[ \pi X - 4X^{1/2} - 4 < \sum_{n \le X} S(n) < \pi X + 4X^{1/2}. \]

4. We know that if each of two non-negative integers is a sum of two squares of non-negative integers,
then so is their product. Show that the analogous assertion for three squares is false.

Chapter 6
ELEMENTARY PRIME NUMBER THEORY

6.1. Euclid’s Theorem Revisited

We have already seen the elegant and simple proof of Euclid’s theorem, that there are infinitely many
primes. Here we shall begin by proving a slightly stronger result.

THEOREM 6A. The series

\[ \sum_{p} \frac{1}{p} \]

is divergent.

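Proof. For every real number X ≥ 2, write

\[ P_X = \prod_{p \le X} \left(1 - \frac{1}{p}\right)^{-1}. \]

Then

\[ \log P_X = -\sum_{p \le X} \log\left(1 - \frac{1}{p}\right) = S_1 + S_2, \]

where

\[ S_1 = \sum_{p \le X} \frac{1}{p} \quad\text{and}\quad S_2 = \sum_{p \le X} \sum_{h=2}^{\infty} \frac{1}{h p^h}. \]

Since

\[ 0 \le \sum_{h=2}^{\infty} \frac{1}{h p^h} \le \sum_{h=2}^{\infty} \frac{1}{p^h} = \frac{1}{p(p-1)}, \]

we have

\[ 0 \le S_2 \le \sum_{p} \frac{1}{p(p-1)} \le \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = 1, \]

so that 0 ≤ S_2 ≤ 1. On the other hand, we have

\[ P_X = \prod_{p \le X} \left(\sum_{h=0}^{\infty} \frac{1}{p^h}\right) \ge \sum_{n \le X} \frac{1}{n} \to \infty \quad\text{as } X \to \infty. \]

The result follows.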

For every real number X ≥ 2, we write

\[ \pi(X) = \sum_{p \le X} 1, \]

so that π(X) denotes the number of primes in the interval [2, X]. This function has been studied
extensively by number theorists, and attempts to study it in depth have led to major developments in
other important branches of mathematics.

As can be expected, many conjectures concerning the distribution of primes were made based purely
on numerical evidence, including the celebrated Prime number theorem, proved in 1896 by Hadamard
and de la Vallée Poussin, that

\[ \lim_{X \to \infty} \frac{\pi(X) \log X}{X} = 1. \]

We shall not prove this in these lectures. Instead we shall be concerned with the weaker result of
Tchebycheff, that there exist positive absolute constants c_1 and c_2 such that for every real number
X ≥ 2, we have

\[ c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}. \]

6.2. The Von Mangoldt Function

The study of the function π(X) usually involves, instead of the characteristic function of the primes, a
function which counts not only primes, but prime powers as well, and with weights. Accordingly, we
introduce the von Mangoldt function Λ : N → C, defined for every n ∈ N by writing

\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^r, \text{ with } p \text{ prime and } r \in \mathbb{N}, \\ 0 & \text{otherwise}. \end{cases} \]

THEOREM 6B. For every n ∈ N, we have

\[ \sum_{m \mid n} \Lambda(m) = \log n. \]

Proof. The result is clearly true for n = 1, so it remains to consider the case n ≥ 2. Suppose that
n = p_1^{u_1} ... p_r^{u_r} is the canonical decomposition of n. Then the only non-zero contribution to the sum
on the left hand side comes from those natural numbers m of the form m = p_j^{v_j} with j = 1, ..., r and
1 ≤ v_j ≤ u_j. It follows that

\[ \sum_{m \mid n} \Lambda(m) = \sum_{j=1}^{r} \sum_{v_j=1}^{u_j} \log p_j = \sum_{j=1}^{r} \log p_j^{u_j} = \log n. \]

This completes the proof.
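
The identity of Theorem 6B can be confirmed numerically. The sketch below uses a hypothetical helper `von_mangoldt`, computed by trial division, and compares the divisor sum with log n up to floating-point rounding.

```python
from math import isclose, log


def von_mangoldt(n):
    """Return log p if n is a power p^r of a prime p with r >= 1, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            # p is the smallest prime factor; n is a prime power iff nothing else remains.
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no factor up to sqrt(n), so n itself is prime


def divisor_sum_of_lambda(n):
    """Sum of Lambda(m) over all divisors m of n."""
    return sum(von_mangoldt(m) for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    for n in (1, 12, 30, 64, 97):
        print(n, divisor_sum_of_lambda(n), log(n))
```

For n = 12 the only contributing divisors are 2, 4 and 3, giving log 2 + log 2 + log 3 = log 12.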

THEOREM 6C. As X → ∞, we have

\[ \sum_{m \le X} \Lambda(m) \left[\frac{X}{m}\right] = X \log X - X + O(\log X). \]

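Proof. It follows from Theorem 6B that

\[ \sum_{n \le X} \log n = \sum_{n \le X} \sum_{m \mid n} \Lambda(m) = \sum_{m \le X} \Lambda(m) \sum_{\substack{n \le X \\ m \mid n}} 1 = \sum_{m \le X} \Lambda(m) \left[\frac{X}{m}\right]. \]

It therefore suffices to prove that

\[ \sum_{n \le X} \log n = X \log X - X + O(\log X) \quad\text{as } X \to \infty. \tag{1} \]

To prove (1), note that log X is an increasing function of X. In particular, for every n ∈ N, we have

\[ \log n \le \int_n^{n+1} \log u \, du, \]

so that

\[ \sum_{n \le X} \log n - \log(X + 1) \le \int_1^X \log u \, du. \]

On the other hand, for every n ∈ N satisfying n ≥ 2, we have

\[ \log n \ge \int_{n-1}^{n} \log u \, du, \]

so that

\[ \sum_{n \le X} \log n = \sum_{2 \le n \le X} \log n \ge \int_1^{[X]} \log u \, du = \int_1^X \log u \, du - \int_{[X]}^X \log u \, du \ge \int_1^X \log u \, du - \log X. \]

The estimate (1) now follows on noting that

\[ \int_1^X \log u \, du = X \log X - X + 1. \]

This completes the proof.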



6.3. Tchebycheff ’s Theorem

The crucial step in the proof of Tchebycheff’s theorem concerns obtaining bounds on sums involving the
von Mangoldt function. More precisely, we prove the following result.

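THEOREM 6D. There exist positive absolute constants c_3 and c_4 such that

\[ \sum_{m \le X} \Lambda(m) \ge \frac{1}{2} X \log 2 \quad\text{if } X \ge c_3, \tag{2} \]

and

\[ \sum_{X/2 < m \le X} \Lambda(m) \le c_4 X \quad\text{if } X \ge 0. \tag{3} \]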

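Proof. If m ∈ N satisfies X/2 < m ≤ X, then clearly [X/2m] = 0. It follows from this and Theorem
6C that as X → ∞, we have

\[
\begin{aligned}
\sum_{m \le X} \Lambda(m) \left( \left[\frac{X}{m}\right] - 2\left[\frac{X}{2m}\right] \right)
&= \sum_{m \le X} \Lambda(m) \left[\frac{X}{m}\right] - 2 \sum_{m \le X/2} \Lambda(m) \left[\frac{X}{2m}\right] \\
&= (X \log X - X + O(\log X)) - 2\left( \frac{X}{2} \log \frac{X}{2} - \frac{X}{2} + O(\log X) \right) \\
&= X \log 2 + O(\log X).
\end{aligned}
\]

Hence there exists a positive absolute constant c_5 such that for all sufficiently large X, we have

\[ \frac{1}{2} X \log 2 < \sum_{m \le X} \Lambda(m) \left( \left[\frac{X}{m}\right] - 2\left[\frac{X}{2m}\right] \right) < c_5 X. \]

We now consider the function [α] − 2[α/2]. Clearly [α] − 2[α/2] < α − 2(α/2 − 1) = 2. Note that the
left hand side is an integer, so we must have [α] − 2[α/2] ≤ 1. Note also that [α] − 2[α/2] ≥ 0, since
2[α/2] is an integer not exceeding α. It follows that for all sufficiently large X, we have

\[ \frac{1}{2} X \log 2 < \sum_{m \le X} \Lambda(m). \]

The inequality (2) follows. On the other hand, if X/2 < m ≤ X, then [X/m] = 1 and [X/2m] = 0, so
that, since every term of the sum is non-negative, for all sufficiently large X, we have

\[ \sum_{X/2 < m \le X} \Lambda(m) \le c_5 X. \]

The inequality (3) follows easily.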

We now state and prove Tchebycheff’s theorem.

THEOREM 6E. (TCHEBYCHEFF) There exist positive absolute constants c_1 and c_2 such that for
every real number X ≥ 2, we have

\[ c_1 \frac{X}{\log X} < \pi(X) < c_2 \frac{X}{\log X}. \]

Proof. To prove the lower bound, note that

\[ \sum_{m \le X} \Lambda(m) = \sum_{\substack{p, n \\ p^n \le X}} \log p = \sum_{p \le X} (\log p) \sum_{1 \le n \le [\log X / \log p]} 1 = \sum_{p \le X} (\log p) \left[\frac{\log X}{\log p}\right] \le \pi(X) \log X. \]

It follows from (2) that

\[ \pi(X) \ge \frac{X \log 2}{2 \log X} \quad\text{if } X \ge c_3. \]

Since π(2) = 1, we get the lower bound for a suitable choice of c_1.

To prove the upper bound, note that in view of (3) and the definition of the von Mangoldt function,
the inequality

\[ \sum_{X/2^{j+1} < p \le X/2^j} \log p \le c_4 \frac{X}{2^j} \]

holds for every integer j ≥ 0 and every real number X ≥ 0. Suppose that X ≥ 2. Let the integer k ≥ 0
be defined such that 2^k < X^{1/2} ≤ 2^{k+1}. Then

\[ \sum_{X^{1/2} < p \le X} \log p \le \sum_{j=0}^{k} \sum_{X/2^{j+1} < p \le X/2^j} \log p \le c_4 X \sum_{j=0}^{k} 2^{-j} < 2 c_4 X, \]

so that

\[ \sum_{X^{1/2} < p \le X} 1 \le \sum_{X^{1/2} < p \le X} \frac{\log p}{\log X^{1/2}} < \frac{4 c_4 X}{\log X}, \]

whence

\[ \pi(X) \le X^{1/2} + \frac{4 c_4 X}{\log X} < \frac{c_2 X}{\log X} \]

for a suitable c_2.
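
Tchebycheff's theorem says that the ratio π(X) log X / X stays between two positive constants. The sketch below tabulates this ratio at a few checkpoints using a sieve of Eratosthenes; the function names and the choice of checkpoints are illustrative assumptions only.

```python
from math import log


def prime_sieve(limit):
    """Return a list is_prime[0..limit] built by the sieve of Eratosthenes (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime


def chebyshev_ratios(checkpoints):
    """Return {X: pi(X) * log(X) / X} for each X in checkpoints."""
    limit = max(checkpoints)
    is_prime = prime_sieve(limit)
    wanted = set(checkpoints)
    ratios = {}
    count = 0  # running value of pi(n)
    for n in range(2, limit + 1):
        count += is_prime[n]
        if n in wanted:
            ratios[n] = count * log(n) / n
    return ratios


if __name__ == "__main__":
    for X, ratio in chebyshev_ratios([100, 1000, 10000, 100000]).items():
        print(X, round(ratio, 4))
```

At these checkpoints the ratio lies between 1 and 1.2, consistent with both Theorem 6E and the Prime number theorem.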

6.4. Some Results of Mertens

We conclude this chapter by obtaining an improvement of Theorem 6A.

THEOREM 6F. (MERTENS) As X → ∞, we have

\[ \sum_{m \le X} \frac{\Lambda(m)}{m} = \log X + O(1), \tag{4} \]

\[ \sum_{p \le X} \frac{\log p}{p} = \log X + O(1), \tag{5} \]

and

\[ \sum_{p \le X} \frac{1}{p} = \log\log X + O(1). \tag{6} \]

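Proof. Recall Theorem 6C. As X → ∞, we have

\[ \sum_{m \le X} \Lambda(m) \left[\frac{X}{m}\right] = X \log X - X + O(\log X). \]

Clearly [X/m] = X/m + O(1), so that as X → ∞, we have

\[ \sum_{m \le X} \Lambda(m) \left[\frac{X}{m}\right] = X \sum_{m \le X} \frac{\Lambda(m)}{m} + O\left( \sum_{m \le X} \Lambda(m) \right). \]

It follows from (3) that

\[ \sum_{m \le X} \Lambda(m) \le \sum_{j=0}^{\infty} \sum_{X/2^{j+1} < m \le X/2^j} \Lambda(m) \le 2 c_4 X, \]

so that as X → ∞, we have

\[ X \sum_{m \le X} \frac{\Lambda(m)}{m} = X \log X + O(X). \]

The inequality (4) follows. Next, note that

\[ \sum_{m \le X} \frac{\Lambda(m)}{m} = \sum_{\substack{p, k \\ p^k \le X}} \frac{\log p}{p^k} = \sum_{p \le X} \frac{\log p}{p} + \sum_{p \le X} (\log p) \sum_{2 \le k \le \log X / \log p} \frac{1}{p^k}. \]

As X → ∞, we have

\[ \sum_{p \le X} (\log p) \sum_{2 \le k \le \log X / \log p} \frac{1}{p^k} \le \sum_{p \le X} (\log p) \sum_{k=2}^{\infty} \frac{1}{p^k} = \sum_{p \le X} \frac{\log p}{p(p-1)} \le \sum_{n=2}^{\infty} \frac{\log n}{n(n-1)} = O(1). \]

The inequality (5) follows. Finally, for every real number X ≥ 2, let

\[ T(X) = \sum_{p \le X} \frac{\log p}{p}. \]

Then it follows from (5) that there exists a positive absolute constant c_6 such that |T(X) − log X| < c_6
whenever X ≥ 2. On the other hand,

\[
\begin{aligned}
\sum_{p \le X} \frac{1}{p}
&= \sum_{p \le X} \frac{\log p}{p} \left( \frac{1}{\log X} + \int_p^X \frac{dy}{y \log^2 y} \right)
= \frac{T(X)}{\log X} + \int_2^X \frac{T(y) \, dy}{y \log^2 y} \\
&= \frac{T(X) - \log X}{\log X} + \int_2^X \frac{(T(y) - \log y) \, dy}{y \log^2 y} + 1 + \int_2^X \frac{dy}{y \log y}.
\end{aligned}
\]

It follows that as X → ∞, we have

\[ \left| \sum_{p \le X} \frac{1}{p} - \log\log X \right| < \frac{c_6}{\log X} + c_6 \int_2^X \frac{dy}{y \log^2 y} + 1 - \log\log 2 = O(1). \]

The inequality (6) follows.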
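
The boundedness asserted in (5) and (6) can be observed numerically. The sketch below, with hypothetical function names, computes the differences between the two prime sums and their main terms log X and log log X for a given X.

```python
from math import log


def primes_up_to(limit):
    """Return the list of primes p <= limit via the sieve of Eratosthenes (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n in range(2, limit + 1) if is_prime[n]]


def mertens_gaps(X):
    """Return (sum log p / p - log X, sum 1/p - log log X) over primes p <= X."""
    primes = primes_up_to(X)
    gap_5 = sum(log(p) / p for p in primes) - log(X)
    gap_6 = sum(1.0 / p for p in primes) - log(log(X))
    return gap_5, gap_6


if __name__ == "__main__":
    for X in (10, 100, 1000, 10000, 100000):
        print(X, mertens_gaps(X))
```

Theorem 6F only claims that both differences remain bounded as X grows; the printed values make this visible on a modest range.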

Problems for Chapter 6



1. Prove that

\[ \Lambda(n) + \sum_{m \mid n} \mu(m) \log m = 0 \]

for every n ∈ N.

2. For any arithmetic function f, we define f′ to be the arithmetic function given by f′(n) = f(n) log n
for every n ∈ N. Then for the arithmetic function U defined by U(n) = 1 for every n ∈ N, we have
U′(n) = log n and U″(n) = log^2 n for every n ∈ N.
(i) Suppose that f and g are arithmetic functions.
(I) Prove that (f + g)′ = f′ + g′ and (f ∗ g)′ = (f′ ∗ g) + (f ∗ g′).
(II) Suppose that f(1) ≠ 0. By noting that (f ∗ f^{−1})′(n) = 0 for every n ∈ N, prove that
(f^{−1})′ = −f′ ∗ (f ∗ f)^{−1}.
(ii) Explain why Λ ∗ U = U′. Then establish Selberg's identity Λ′ + (Λ ∗ Λ) = U″ ∗ µ.

3. Prove that for every real number X ≥ 2, we have

\[ \prod_{p \le X} \left(1 - \frac{1}{p}\right)^{-1} > \log X. \]

4. Use the well-known inequality

\[ \frac{t}{1+t} < \log(1 + t) < t, \quad\text{where } t > -1 \text{ and } t \ne 0, \]

to show that

\[ \sum_{p \le X} \frac{1}{p-1} > \log\log X \quad\text{and}\quad \sum_{p \le X} \frac{1}{p} > \log\log X - 1. \]

5. Suppose that
• λ_n is an increasing sequence of real numbers with limit infinity;
• c_n is an arbitrary sequence of real or complex numbers; and
• f has continuous derivative for X ≥ λ_1.
For every X ≥ λ_1, let

\[ C(X) = \sum_{\lambda_n \le X} c_n. \]

Establish the partial summation formula, that for every X ≥ λ_1, we have

\[ \sum_{\lambda_n \le X} c_n f(\lambda_n) = C(X) f(X) - \int_{\lambda_1}^{X} C(y) f'(y) \, dy. \]

6. Use Theorem 6F and partial summation to show that as X → ∞, we have

\[ \int_2^X \frac{\pi(y)}{y^2} \, dy = \sum_{p \le X} \frac{1}{p} + o(1) \sim \log\log X. \]

7. Derive the Prime number theorem, that

\[ \pi(X) \sim \frac{X}{\log X} \quad\text{as } X \to \infty, \]

from the hypothetical relation

\[ \sum_{p \le X} \log p \sim X \quad\text{as } X \to \infty, \]

and the information

\[ \int_2^X \frac{dy}{\log y} = \frac{X}{\log X} + o\left(\frac{X}{\log X}\right) \quad\text{as } X \to \infty. \]

8. Show that the sum

\[ \sum_{p \le X} \frac{1}{p \log p} \]

converges as X → ∞.

Chapter 7
GAUSS SUMS AND QUADRATIC RECIPROCITY

7.1. Gauss Sums

Recall the Law of quadratic reciprocity, that if p and q are distinct odd primes, then

\[ \left(\frac{q}{p}\right)_L \left(\frac{p}{q}\right)_L = (-1)^{\left(\frac{p-1}{2}\right)\left(\frac{q-1}{2}\right)}. \]

There are many proofs of this; Gauss alone discovered six. Our aim here, however, is to give a second
proof of this result, a proof discovered by Dirichlet and based on ideas from Fourier series.

Throughout this chapter, we use the notation that e(y) = e^{2πiy} for every y ∈ R.

Suppose that q ∈ N and a ∈ Z satisfy (a, q) = 1. The Gauss sum

\[ S(q, a) = \sum_{x=1}^{q} e\left(\frac{a x^2}{q}\right) \]

has many interesting properties, the first of which is a multiplicative property which simplifies its
evaluation to cases when q is a prime power.

THEOREM 7A. Suppose that q_1, q_2 ∈ N satisfy (q_1, q_2) = 1. Suppose further that a ∈ Z satisfies
(a, q_1 q_2) = 1. Then

\[ S(q_1 q_2, a) = S(q_1, q_2 a) \, S(q_2, q_1 a). \]

Proof. Since (q_1, q_2) = 1, it follows from Theorem 3E that as x_1 and x_2 run through complete sets
of residues modulo q_1 and q_2 respectively, q_2 x_1 + q_1 x_2 runs through a complete set of residues modulo
q_1 q_2. Hence

\[ S(q_1 q_2, a) = \sum_{z=1}^{q_1 q_2} e\left(\frac{a z^2}{q_1 q_2}\right) = \sum_{x_1=1}^{q_1} \sum_{x_2=1}^{q_2} e\left(\frac{a (q_2 x_1 + q_1 x_2)^2}{q_1 q_2}\right). \]

Note now that (q_2 x_1 + q_1 x_2)^2 ≡ q_2^2 x_1^2 + q_1^2 x_2^2 (mod q_1 q_2). It follows that

\[ S(q_1 q_2, a) = \sum_{x_1=1}^{q_1} \sum_{x_2=1}^{q_2} e\left(\frac{a q_2 x_1^2}{q_1} + \frac{a q_1 x_2^2}{q_2}\right) = \left( \sum_{x_1=1}^{q_1} e\left(\frac{a q_2 x_1^2}{q_1}\right) \right) \left( \sum_{x_2=1}^{q_2} e\left(\frac{a q_1 x_2^2}{q_2}\right) \right). \]

The result follows.
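
The multiplicative property of Theorem 7A can be checked in floating-point arithmetic. In the sketch below the helper name `gauss_sum` is hypothetical, and the exponent a x^2 is reduced modulo q before division to limit rounding error.

```python
import cmath


def gauss_sum(q, a):
    """S(q, a) = sum over x = 1..q of e(a x^2 / q), where e(y) = exp(2 pi i y)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * x * x) % q) / q)
               for x in range(1, q + 1))


def multiplicativity_defect(q1, q2, a):
    """Return |S(q1 q2, a) - S(q1, q2 a) S(q2, q1 a)|, which should be near zero."""
    left = gauss_sum(q1 * q2, a)
    right = gauss_sum(q1, q2 * a) * gauss_sum(q2, q1 * a)
    return abs(left - right)


if __name__ == "__main__":
    for q1, q2, a in ((3, 5, 1), (4, 9, 5), (7, 8, 3)):
        print(q1, q2, a, multiplicativity_defect(q1, q2, a))
```

The triples used here satisfy (q_1, q_2) = 1 and (a, q_1 q_2) = 1, as the theorem requires.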

The Law of quadratic reciprocity can be deduced from Theorem 7A and the two results below.

THEOREM 7B. Suppose that p is an odd prime, and a ∈ Z satisfies (a, p) = 1. Then

\[ S(p, a) = \left(\frac{a}{p}\right)_L S(p, 1). \tag{1} \]

THEOREM 7C. Suppose that q ∈ N is odd. Then S(q, 1) = ε_q q^{1/2}, where

\[ \varepsilon_q = \begin{cases} 1 & \text{if } q \equiv 1 \pmod 4, \\ i & \text{if } q \equiv -1 \pmod 4. \end{cases} \]

To deduce the Law of quadratic reciprocity, note that by Theorems 7A and 7B, we have, for distinct
odd primes p, q ∈ N, that

\[ S(pq, 1) = S(p, q) \, S(q, p) = \left(\frac{q}{p}\right)_L \left(\frac{p}{q}\right)_L S(p, 1) \, S(q, 1). \]

It follows from Theorem 7C that

\[ \left(\frac{q}{p}\right)_L \left(\frac{p}{q}\right)_L = \frac{S(pq, 1)}{S(p, 1) \, S(q, 1)} = \frac{\varepsilon_{pq}}{\varepsilon_p \varepsilon_q}. \]

Note now that the right hand side has value −1 if p ≡ q ≡ −1 (mod 4) and value 1 otherwise. The Law
of quadratic reciprocity follows.
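
Theorems 7B and 7C, and the reciprocity law deduced from them, can be tested numerically for small moduli. The sketch below is an illustration only: the function names are hypothetical, and the Legendre symbol is evaluated by Euler's criterion, which is an assumption brought in for the example and is not the method of this chapter.

```python
import cmath
from math import sqrt


def gauss_sum(q, a):
    """S(q, a) = sum over x = 1..q of e(a x^2 / q), where e(y) = exp(2 pi i y)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * x * x) % q) / q)
               for x in range(1, q + 1))


def legendre(a, p):
    """Legendre symbol (a/p)_L for an odd prime p not dividing a (Euler's criterion)."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def epsilon(q):
    """The factor epsilon_q of Theorem 7C for odd q."""
    return 1 if q % 4 == 1 else 1j


if __name__ == "__main__":
    for q in (3, 5, 7, 9, 15):
        print(q, gauss_sum(q, 1), epsilon(q) * sqrt(q))
    p, q = 7, 11
    print(legendre(q, p) * legendre(p, q), (-1) ** (((p - 1) // 2) * ((q - 1) // 2)))
```

For p = 7 and q = 11, both congruent to −1 modulo 4, the last line prints −1 twice.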

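Proof of Theorem 7B. Consider the congruence x^2 ≡ n (mod p). Clearly the number of solutions
of this congruence is given by 1 + (n/p)_L, so that

\[ S(p, a) = \sum_{x=1}^{p} e\left(\frac{a x^2}{p}\right) = \sum_{n=1}^{p} \left( 1 + \left(\frac{n}{p}\right)_L \right) e\left(\frac{an}{p}\right) = \sum_{n=1}^{p} \left(\frac{n}{p}\right)_L e\left(\frac{an}{p}\right), \]

since

\[ \sum_{n=1}^{p} e\left(\frac{an}{p}\right) = 0. \]

We now make the substitution an ≡ m (mod p), and note that as n runs through a complete set of
residues modulo p, so does m. Hence, denoting by a^{−1} the natural number satisfying 1 ≤ a^{−1} < p and
a a^{−1} ≡ 1 (mod p), we have

\[ S(p, a) = \sum_{m=1}^{p} \left(\frac{a^{-1} m}{p}\right)_L e\left(\frac{m}{p}\right) = \left(\frac{a^{-1}}{p}\right)_L \sum_{m=1}^{p} \left(\frac{m}{p}\right)_L e\left(\frac{m}{p}\right) = \left(\frac{a}{p}\right)_L \sum_{m=1}^{p} \left(\frac{m}{p}\right)_L e\left(\frac{m}{p}\right). \tag{2} \]

In particular, putting a = 1 in (2), we obtain

\[ S(p, 1) = \sum_{m=1}^{p} \left(\frac{m}{p}\right)_L e\left(\frac{m}{p}\right). \tag{3} \]

The identity (1) now follows on combining (2) and (3).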

To complete the proof of the Law of quadratic reciprocity, it remains to establish Theorem 7C,
which we shall do in Section 7.3. As the proof involves ideas concerning the convergence of Fourier
series, we shall first make a very brief study of this in the next section.

7.2. Convergence of Fourier Series

Suppose that a function f : R → C is Riemann integrable over the interval [0, 1] and is periodic with
period 1. We define the Fourier coefficient c_h, for every h ∈ Z, by

\[ c_h = c_h(f) = \int_0^1 f(y) \, e(-hy) \, dy. \]

The formal series

\[ \sum_{h=-\infty}^{\infty} c_h(f) \, e(hy) \]

is called the Fourier series of the function f.

Our task here is to obtain sufficient conditions for the Fourier series of a given function f to converge
to f , or at least some function closely related to f . The basic theorem in this study is the Riemann-
Lebesgue lemma.

THEOREM 7D. (RIEMANN-LEBESGUE LEMMA) Suppose that a, b ∈ R and a < b. Suppose
further that the function f : [a, b] → R is Riemann integrable over the interval [a, b]. For any number
λ ∈ R, let

\[ I(\lambda, f) = \int_a^b f(y) \, e^{i\lambda y} \, dy. \]

Then I(λ, f) → 0 as λ → ∞.

Proof. Our first task is to approximate f in [a, b] by a step function. Let ε > 0 be given. For any
sufficiently large k ∈ N, there exists a dissection

Δ_k : a = y_0 < y_1 < ... < y_k = b

of [a, b] such that the upper sum S(f, Δ_k) and the lower sum s(f, Δ_k) satisfy

0 ≤ S(f, Δ_k) − s(f, Δ_k) < ε.

For every y ∈ [a, b], define

\[ f_k(y) = \begin{cases} \sup\{ f(u) : u \in [y_{j-1}, y_j] \} & \text{if } y \in (y_{j-1}, y_j], \\ f_k(y_1) & \text{if } y = y_0. \end{cases} \]

Clearly f_k is a step function in, and hence Riemann integrable over, the interval [a, b]. Furthermore,
f(y) ≤ f_k(y) for all y ∈ [a, b]. It follows immediately that f_k − f is Riemann integrable over [a, b], and
that |f_k(y) − f(y)| = f_k(y) − f(y) for all y ∈ [a, b]. Hence

\[ |I(\lambda, f_k) - I(\lambda, f)| \le \int_a^b |f_k(y) - f(y)| \, dy = \int_a^b (f_k(y) - f(y)) \, dy = S(f, \Delta_k) - \int_a^b f(y) \, dy < \varepsilon. \]

On the other hand,

\[ I(\lambda, f_k) = \sum_{j=1}^{k} \int_{y_{j-1}}^{y_j} f_k(y_j) \, e^{i\lambda y} \, dy = \sum_{j=1}^{k} f_k(y_j) \, \frac{e^{i\lambda y_j} - e^{i\lambda y_{j-1}}}{i\lambda} \to 0 \]

as λ → ∞, so that |I(λ, f_k)| < ε for all sufficiently large λ. It follows that |I(λ, f)| < 2ε for all sufficiently
large λ.

We now establish a result concerning the convergence of a Fourier series.

THEOREM 7E. Suppose that a function f : R → C is Riemann integrable over the interval [0, 1]
and is periodic with period 1. Let y ∈ R. Suppose that the limits

\[ f(y\pm) = \lim_{\delta \to 0\pm} f(y + \delta) \]

and

\[ f'_{\pm}(y) = \lim_{\delta \to 0\pm} \frac{f(y + \delta) - f(y\pm)}{\delta} \]

all exist, and the functions

\[ g_{\pm}(u) = \begin{cases} \dfrac{f(u) - f(y\pm)}{u - y} - f'_{\pm}(y) & \text{if } u \ne y, \\ 0 & \text{if } u = y, \end{cases} \tag{4} \]

are Riemann integrable over [y, y + 1/2] and [y − 1/2, y] respectively. Then

\[ \lim_{H \to \infty} \sum_{h=-H}^{H} c_h(f) \, e(hy) = \frac{f(y+) + f(y-)}{2}. \]

Proof. In view of periodicity, we can write

\[ c_h(f) = \int_{y-1/2}^{y+1/2} f(u) \, e(-hu) \, du. \]

For every H ∈ N, let

\[ S_H = \sum_{h=-H}^{H} c_h(f) \, e(hy). \]

Then

\[ S_H = \int_{y-1/2}^{y+1/2} f(u) \sum_{h=-H}^{H} e(h(y - u)) \, du. \]

Simple calculations give

\[ \sum_{h=-H}^{H} e(h(y - u)) = \begin{cases} \dfrac{\sin \pi(2H+1)(y - u)}{\sin \pi(y - u)} & \text{if } u \ne y, \\ 2H + 1 & \text{if } u = y. \end{cases} \tag{5} \]

Note that the right hand side of (5) is continuous and Riemann integrable. It follows that S_H = I_1 + I_2,
where

\[ I_1 = \int_{-1/2}^{0} f(y + v) \, \frac{\sin \pi(2H+1)v}{\sin \pi v} \, dv \]

and

\[ I_2 = \int_{0}^{1/2} f(y + v) \, \frac{\sin \pi(2H+1)v}{\sin \pi v} \, dv. \]

Consider now the integral I_1. Clearly it follows from (4) that

\[
\begin{aligned}
I_1 &= \int_{-1/2}^{0} g_-(y + v) \, \frac{v}{\sin \pi v} \, \sin \pi(2H+1)v \, dv \\
&\quad + f'_-(y) \int_{-1/2}^{0} \frac{v}{\sin \pi v} \, \sin \pi(2H+1)v \, dv + f(y-) \int_{-1/2}^{0} \frac{\sin \pi(2H+1)v}{\sin \pi v} \, dv.
\end{aligned}
\]

By Theorem 7D, the first two integrals on the right hand side both converge to 0 as H → ∞. The last
term is equal to

\[ f(y-) \int_{-1/2}^{0} \sum_{h=-H}^{H} e(hv) \, dv = f(y-) \left( \frac{1}{2} + \sum_{\substack{h=-H \\ h \ne 0}}^{H} \frac{1 - (-1)^h}{2\pi i h} \right) = \frac{1}{2} f(y-). \]

Similarly I_2 → (1/2) f(y+) as H → ∞.

7.3. Proof of Theorem 7C

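Let q ∈ N be odd. For every real number θ ∈ [0, 1], let

\[ f(\theta) = \sum_{n=0}^{q-1} e\left(\frac{(n + \theta)^2}{q}\right), \]

and note that S(q, 1) = f(0) = f(1). The function f has the Fourier series

\[ \sum_{h=-\infty}^{\infty} c_h \, e(hy), \]

where, for every h ∈ Z, the coefficient

\[
\begin{aligned}
c_h &= \int_0^1 \sum_{n=0}^{q-1} e\left(\frac{(n + \theta)^2}{q}\right) e(-h\theta) \, d\theta
= \sum_{n=0}^{q-1} \int_n^{n+1} e\left(\frac{\varphi^2}{q} - h\varphi\right) e(hn) \, d\varphi \\
&= \int_0^q e\left(\frac{\varphi^2}{q} - h\varphi\right) d\varphi
= q \int_0^1 e(q\theta^2 - qh\theta) \, d\theta \\
&= q \, e\left(-\frac{qh^2}{4}\right) \int_0^1 e\left( q\left(\theta - \frac{h}{2}\right)^2 \right) d\theta
= q \, e\left(-\frac{qh^2}{4}\right) \int_{-h/2}^{1-h/2} e(q\theta^2) \, d\theta.
\end{aligned}
\]

By Theorem 7E, the Fourier series converges to f in [0, 1], so that

\[ S(q, 1) = \lim_{N \to \infty} \sum_{h=-N}^{N} c_h \, e(h0) = \lim_{N \to \infty} \sum_{h=-N}^{N} c_h. \tag{6} \]

If h is even, then −qh^2/4 ∈ Z. It follows that as N → ∞, we have

\[
\begin{aligned}
\sum_{\substack{h=-2N \\ h \text{ even}}}^{2N} c_h
&= \sum_{\substack{h=-2N \\ h \text{ even}}}^{2N} q \, e\left(-\frac{qh^2}{4}\right) \int_{-h/2}^{1-h/2} e(q\theta^2) \, d\theta
= q \sum_{\substack{h=-2N \\ h \text{ even}}}^{2N} \int_{-h/2}^{1-h/2} e(q\theta^2) \, d\theta \\
&= q \int_{-N}^{N+1} e(q\theta^2) \, d\theta \to q \int_{-\infty}^{\infty} e(q\theta^2) \, d\theta = q^{1/2} I,
\end{aligned}
\tag{7}
\]

where

\[ I = \int_{-\infty}^{\infty} e(\theta^2) \, d\theta. \]

If h is odd, then h^2 ≡ 1 (mod 4), and so qh^2 ≡ q (mod 4). It follows that

\[
\begin{aligned}
\sum_{\substack{h=-2N \\ h \text{ odd}}}^{2N} c_h
&= \sum_{\substack{h=-2N \\ h \text{ odd}}}^{2N} q \, e\left(-\frac{qh^2}{4}\right) \int_{-h/2}^{1-h/2} e(q\theta^2) \, d\theta
= q \, e\left(-\frac{q}{4}\right) \sum_{\substack{h=-2N \\ h \text{ odd}}}^{2N} \int_{-h/2}^{1-h/2} e(q\theta^2) \, d\theta \\
&= q \, e\left(-\frac{q}{4}\right) \int_{-N+1/2}^{N+1/2} e(q\theta^2) \, d\theta \to q^{1/2} \, e\left(-\frac{q}{4}\right) I \quad\text{as } N \to \infty.
\end{aligned}
\tag{8}
\]

Combining (6)–(8), we have

\[ S(q, 1) = q^{1/2} \left( 1 + e\left(-\frac{q}{4}\right) \right) I. \tag{9} \]

Putting q = 1 in (9), we have 1 = (1 − i)I. Hence

\[ S(q, 1) = \frac{q^{1/2} (1 + e(-q/4))}{1 - i}. \]

Theorem 7C follows easily.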


Errata: A Computational Introduction to Number Theory and Algebra
(Version 1)

Last updated: 10/27/2005.

p. 29: Exercise 2.18. Both instances of “[±1]n ” in the hint should be replaced by “[±1]p .”
Actually, it would be better if the hint consisted of just the reference to Exercise 2.5,
and the exercise itself were moved to §2.2.
[VS, 10/12/2005]
p. 73: Last paragraph. Replace the second sentence by:
Using the theory of continued fractions, Theorem 4.6 can be improved as
follows: if n > 2r*t* (rather than n ≥ 4r*t*), then statement (ii) of the
theorem holds when r′ is chosen as the first remainder r_i ≤ r*, and s′ :=
s_i, t′ := t_i. This fact was observed by Wang, Guy, and Davenport [97].
p. 89: Line 15. “(p_1^{e_1} · · · p_r^{e_r})^s” should be “(p_1^{e_1} · · · p_r^{e_r})^{−s}.”
[VS, 5/17/2005]
p. 98: Exercise 6.1(a). Clarification: n is chosen at random from the set {2^{k−1}, ..., 2^k − 1}.
[VS, 9/15/2005]
p. 99: Exercises 6.3 and 6.5. It may be better to add the hint: use induction on n.
[VS, 10/11/2005]
p. 107: Theorem 6.4. Add “where the distribution of each πi is Di .”
Also, the notation “D1 × · · · × Dn ” was never actually defined.
[VS, 9/17/2005]
p. 130: Line −5. “measure” should be “measure of.”
[VS, 5/18/2005]
p. 122: Line −9. “modulo n” should be “over Zn .”
[VS, 8/24/2005]
p. 143: Second line in proof of Theorem 6.23. “\sum_{i=1}^{k'} a_i” should be “\sum_{i=1}^{k'} \Pr[a_i].”
[VS, 10/5/2005]
p. 146: Last paragraph of §6.10.4. Rewrite as:
The definition of conditional expectation carries over verbatim. Equations (6.15) and
(6.16) hold (assuming the relevant expectations exist). Also, the analog of (6.16) holds
for infinite partitions B1 , B2 , . . . , provided E[X] exists.
[VS, 7/28/2005]

p. 171: Line −13. Insert “using” before “algorithm.”
[VS, 6/13/2005]
p. 194: Line 6 of Example 8.36. Replace both instances of “a” by “z” (not a typo, but it
would read better).
[VS, 5/20/2005]
p. 210: Lemma 8.46. “such that mi | mi+1 ” should read “such that mi | mi+1 and
ni | ni+1 .”
[VS, 3/30/2005]
p. 229: Equation (9.3). “Yj ” should be “Yj .”
[VS, 4/23/2005]
p. 230: Line 4. Add parenthetical remark just before semi-colon: “(with distinct exponent
sequences among the monomials).”
[VS, 4/23/2005]
p. 230: Line 7. “degree” should be “total degree.”
[VS, 4/23/2005]
p. 231: Line 3 of Example 9.32. “polynomial” should be “polynomials.”
[VS, 5/27/2005]
p. 271: Line −6. Before the semi-colon, insert: “with which we can perform both table
insertions and lookups in time O(len(p)).”
[VS, 9/22/2005]
p. 272: Line 16. Insert “of” after “table.”
[VS, 9/22/2005]
p. 279: Exercise 11.14. It might be better to ask the reader to give a rigorous proof,
assuming Conjecture 5.24, and assuming p is a random prime chosen between 3 and
Z with Z ≥ Y^3.
[VS, 8/27/2005]
p. 331: Line −2. Replace “0_V” by “0_W.”
[VS, 9/11/2005]
p. 332: Lines 4 and 7. Replace “0_V” by “0_W.”
[VS, 9/11/2005]
p. 332: Line 7. Replace “equivalent to saying” by “implied by the condition.”
For the other direction, all one can say is that if W contains a non-zero self-orthogonal
vector, then there exists a subspace U of W such that U ∩ Ū ≠ {0_W}.
[Ronald Cramer, 9/11/2005]

p. 441: 2nd to last para. Delete “It is easy to see that.”
[VS, 4/19/2005]

p. 477: Last paragraph. The claim that the running-time bound in Theorem 21.8 is tight
is incorrect (and in particular, Exercise 21.10 should be deleted).
In fact, assuming the gcd operation is implemented using Euclid’s algorithm, Algorithm
SFD uses O(ℓ^2 + ℓ(w − 1) len(p)/p) operations in F. This follows from the
fact that on inputs a, b ∈ F[X] with deg(a) ≥ deg(b) ≥ 0, Euclid’s algorithm uses only
O(len(b) len(a/d)) operations in F, where d := gcd(a, b) (this could be made an
exercise for both the integer and polynomial cases). Combining this fact with
Exercise 21.24 will yield (with a careful counting argument) the better ℓ^2 bound for SFD,
instead of the more naive ℓ^3 bound. The algorithms in §21.6 are still useful, as the
output of these algorithms is in a nicer form.
[VS, 9/25/2005]

p. 482: Exercise 21.11. Add the following: “Assume that computing M_1(β) for β ∈
F[X]/(h) takes Ω(deg(h)^2 len(q)) operations in F.”
[VS, 9/26/2005]
