Lecture 8

Elementary number theory

Now that we have the language to do so, the rest of this course will be concerned with beginning a study of the sets of numbers we discussed earlier: $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$ and $\mathbb{R}$.

We’re going to spend two-thirds of that time (or thereabouts) laying the foundations for elementary number theory: the study of $\mathbb{N}$ and $\mathbb{Z}$.

This used to be a beautiful, isolated and useless subject, until the 20th century came along. Now it is beautiful, well-connected and vitally important.

In the sense that mathematicians use the word, “elementary” doesn’t mean “easy”: it means “using no deep theory” (we’re only four weeks into your first semester, so haven’t had time to develop any deep theory). It can still be difficult, and in fact it can still be deep.

Divisibility and primes

The most obvious way to start investigating properties of $\mathbb{N}$ and $\mathbb{Z}$ is to ask about division. We remarked a while ago that it’s not always possible to do division inside $\mathbb{Z}$ or $\mathbb{N}$: that suggests there’s something interesting going on!

Here’s the basic definition:
Definition: Let $a$ and $b$ be integers. We say that $a$ divides $b$ if there exists an integer $n$ such that $an=b$.

We might also say that $b$ is a multiple of $a$, or that $a$ is a divisor of $b$, or that $a$ is a factor of $b$, or that $a$ goes into $b$.

In symbols, we write $a\mid b$ to say that $a$ divides $b$, and write $a\nmid b$ to say that $a$ does not divide $b$.

For example, $91 = 7\times13$, so we have $7\mid 91$. Also, $91=(-7)\times(-13)$, so we have $-7\mid91$. Also, $-91 = 7\times(-13)$, so we have $7\mid -91$.

However, $7$ cannot be written as an integer multiple of $91$, so we have $91\nmid 7$.

What does it mean to say that $a$ does not divide $b$? Well, it means:

there does not exist any integer $n$ such that $an=b$,

or (equivalently)

for all $n\in\mathbb{Z}$, we have $an\neq b$.
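If you’d like to experiment with the definition, here is a minimal sketch in Python. The function name `divides` is just an illustrative choice, and the treatment of $a=0$ is read straight off the definition ($0\times n$ is always $0$) rather than taken from the text above.

```python
def divides(a: int, b: int) -> bool:
    """Return True if a divides b, i.e. some integer n satisfies a * n == b."""
    if a == 0:
        # 0 * n is always 0, so 0 divides b only when b is 0.
        return b == 0
    # For nonzero a, such an n exists exactly when the remainder is zero.
    return b % a == 0


print(divides(7, 91))    # True, since 91 = 7 * 13
print(divides(-7, 91))   # True, since 91 = (-7) * (-13)
print(divides(7, -91))   # True, since -91 = 7 * (-13)
print(divides(91, 7))    # False
```

The four calls reproduce the examples with $7$ and $91$ given earlier.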

It’s worth sorting out the trivial cases:

For the next few lectures, we’ll be studying the integers from the point of view of divisibility.

The following definition is a natural one:
Definition: An integer $p>1$ is said to be prime if it has no positive factors except for $1$ and $p$ itself.

Primes are clearly a good thing to study: they’re the numbers with no complicated factors.

It’s good to have a word meaning roughly the same thing as “not prime”:
Definition: An integer $n>1$ is said to be composite if it is not prime: that is, if it does have positive factors other than $1$ and $n$.

Notice that we have chosen our definitions so that $1$ will be neither prime nor composite. This was a choice, and it seems a bit mysterious at first.

Indeed, until the late 19th century, mathematicians treated $1$ as prime. But it was found to be so much simpler to do it this way that nobody considers $1$ to be prime any more.
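The two definitions translate almost word for word into code. The following is only a sketch: the names `positive_factors`, `is_prime` and `is_composite` are made up for illustration, and listing every factor is far too slow for large numbers.

```python
def positive_factors(n: int) -> list[int]:
    """List every positive integer d that divides n (for n >= 1)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def is_prime(p: int) -> bool:
    """An integer p > 1 whose only positive factors are 1 and p."""
    return p > 1 and positive_factors(p) == [1, p]


def is_composite(n: int) -> bool:
    """An integer n > 1 that is not prime."""
    return n > 1 and not is_prime(n)


print(is_prime(1), is_composite(1))    # False False: 1 is neither
print(is_prime(13), is_composite(13))  # True False
print(is_prime(91), is_composite(91))  # False True, since 91 = 7 * 13
```

Both functions insist on the condition "greater than $1$", which is exactly what makes $1$ neither prime nor composite.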

The main thing about primes is that all other positive integers are built from them by multiplication.

Before we get to that, it’s worth explaining something about multiplication.

You’re all used to the fact that the sum of an empty list of numbers is zero. I want to persuade you that the product of an empty list of numbers is one.

Suppose I’m trying to find $2\times 3\times 4\times 5$.

One way of doing this would be to ask one person to find $2\times 3$ and another to find $4\times 5$, and then multiply the results:
$$(2\times 3)\times(4\times 5) = 6\times 20 = 120.$$
I could split the work up in other ways:
$$\begin{aligned} (2\times 4)\times(3\times 5) &= 8\times 15 = 120\\ (2\times 3\times 4)\times(5) &= 24\times 5 = 120\end{aligned}$$
What if I give all the numbers to the first person?
$$(2\times 3\times 4\times 5) \times (?) = 120\times ? = 120$$
For the right answer, the product of no numbers must be $1$.
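Python happens to follow the same convention, which gives a quick way to check the argument. This is a small sketch using the standard library function `math.prod`; the variable names are arbitrary.

```python
import math

print(sum([]))        # 0: the empty sum
print(math.prod([]))  # 1: the empty product

numbers = [2, 3, 4, 5]
# However the list is cut in two, the two partial products multiply to 120,
# including the cuts that leave one of the two pieces empty.
for i in range(len(numbers) + 1):
    left, right = numbers[:i], numbers[i:]
    assert math.prod(left) * math.prod(right) == 120
```

The loop only works for every cut because the product of the empty piece is $1$.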


Theorem: Every positive integer $n$ can be written as a product of primes (in at least one way).


We’ll prove this by strong induction on $n$.

For our base case, we observe that when $n=1$, we can write $n$ as the product of no primes.

So now we have to do our induction step: let $k>1$ be an integer. We assume that every positive integer $i$ with $1\leq i<k$ can be written as a product of primes, and we try to prove that $k$ can.

Now, either $k$ is prime, or it is composite. If it is prime, then $k$ is the product of just one prime (namely, $k$ itself).

If, however, $k$ is composite, then it has a positive integer factor $a$ which is neither $1$ nor $k$ itself: in other words, we have $k = ab$, where both $a$ and $b$ are strictly between $1$ and $k$. By the induction hypothesis, $a$ and $b$ can both be written as products of primes, say:
$$a = p_1p_2\cdots p_m,\quad\text{and}\quad b = q_1q_2\cdots q_n.$$
But then $k = ab = p_1p_2\cdots p_mq_1q_2\cdots q_n$, which proves it for $k$. That completes the induction step (and the proof).
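The proof is really a recipe, and strong induction turns into recursion. Here is a sketch that follows it step by step. The name `prime_factors` and the choice to search for the factor $a$ by trying $2, 3, 4, \ldots$ in turn are assumptions of this example; the proof itself doesn’t say how to find $a$.

```python
def prime_factors(k: int) -> list[int]:
    """Write k >= 1 as a product of primes, returned as a list of factors."""
    if k == 1:
        return []  # base case: the product of no primes
    # Look for a factor a strictly between 1 and k.
    for a in range(2, k):
        if k % a == 0:
            b = k // a
            # k = a * b is composite: factorise the two smaller numbers
            # (the induction hypothesis) and join the two lists.
            return prime_factors(a) + prime_factors(b)
    return [k]  # no such factor exists, so k is prime


print(prime_factors(91))   # [7, 13]
print(prime_factors(120))  # [2, 2, 2, 3, 5]
print(prime_factors(1))    # []
```

The recursive calls are always on numbers smaller than $k$, which is why the recursion stops, just as the induction hypothesis only ever concerns smaller numbers.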

Later on, we’ll prove a stronger result, that every positive integer can be written as a product of primes in exactly one way (rearranging the factors doesn’t count). That’s much, much harder.

Because of this we can be sure that primes are reasonably important. The first few are:

$$2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47$$

What are sensible questions to ask? Here are some obvious examples:

  1. How many primes are there?

  2. There are quite a lot of primes between $1$ and $50$. Do they tend to get rarer as we go on?

  3. Other than $2$ and $5$, all primes must end in $1$, $3$, $7$ or $9$. Is there a bias: do more end in $3$ than in $9$, for example?

  4. There seem to be several pairs of small primes which differ by $2$ (e.g. $3$ and $5$, $5$ and $7$, and $11$ and $13$). How many such pairs are there?

  5. Are there quick ways of testing if a number is prime?

  6. Are there quick ways of finding large primes?

Some of these have had well-known answers for more than a century, some are still unsolved, and some are currently the focus of tremendous interest.

We’ll start off by giving the answer to that first question, which was known to the ancient Greeks:


[Euclid’s theorem] There are infinitely many prime numbers.

Here’s the proof, the way I prefer to think of it:


We’ll construct a sequence $p_1,p_2,p_3,\ldots$ of different primes by induction (so, the statement we’re doing induction on is, “there are at least $n$ different primes”).

For our base case we take $n=1$, and then take $p_1=2$, which is a prime.

For our induction step we suppose we have primes $p_1, \ldots, p_n$, and our job is to show that there’s another prime.

Consider the natural number $p_1p_2\cdots p_n + 1$ obtained by multiplying all our primes so far and adding $1$.

This number is not a multiple of $p_1$, because $p_1\cdots p_n$ is: so $p_1\cdots p_n+1$ leaves a remainder of $1$ when you divide by $p_1$.

Similarly, it’s not a multiple of $p_i$ for any $i=1,\ldots,n$, because $p_1\cdots p_n$ is, and so $p_1\cdots p_n+1$ leaves a remainder of $1$ upon division by $p_i$.

But this number is greater than $1$, so it has at least one prime factor, and that factor cannot be any of $p_1,\ldots,p_n$: we can take our next prime $p_{n+1}$ to be one such prime factor, and that completes the induction step.

Here’s pretty much exactly the same proof, phrased in a slightly different way.


[Proof (of the same theorem again)] We prove this by contradiction: we show that it’s true by showing that the negation is absurd.

Indeed, suppose there were only finitely many primes, $p_1,\ldots,p_n$. Then consider (as before) the natural number
$$p_1\cdots p_n+1.$$
This isn’t divisible by any of the primes $p_1,\ldots,p_n$ (since it leaves a remainder of $1$ upon division by any of them). But that’s absurd, since we were assuming those were all the primes, and we know that every positive integer can be written as a product of primes.

Some people find proof by contradiction slightly startling when they first see it.

In fact, it’s perfectly familiar in daily life. When you find someone who disagrees with you, you show that you are right by pointing out that if you were wrong, then that would contradict something well-known to be correct.

From a logical perspective, it’s all to do with the contrapositive. Suppose $P$ is some result we desperately want to prove, for example
$$P = \text{``there are infinitely many primes''},$$
and $T$ something we know is true, for example
$$T = \text{``every integer greater than 1 has a prime factor''}.$$
(That follows from the theorem above that every positive integer can be written as a product of primes.)

Now, we proved that if there are only finitely many primes, then some integer greater than $1$ doesn’t have a prime factor. That’s exactly $\neg P\Rightarrow\neg T$. But that means that its contrapositive $T\Rightarrow P$ is true. And once we know that, then, since we know $T$ is true, we also know $P$ is true.
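If the step from $\neg P\Rightarrow\neg T$ to $T\Rightarrow P$ feels slippery, it can be checked by brute force, since there are only four combinations of truth values. This is a sketch; the helper `implies` is a hypothetical name encoding the usual truth table for “implies”.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Truth value of "x implies y"."""
    return (not x) or y


# Check all four combinations of truth values for P and T.
for P, T in product([False, True], repeat=2):
    # A statement and its contrapositive always agree.
    assert implies(not P, not T) == implies(T, P)
    # If T holds and "T implies P" holds, then P holds.
    if T and implies(T, P):
        assert P

print("not P => not T agrees with T => P in every case")
```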

The second form above, the proof by contradiction, is a more standard form. It appears in the majority of textbooks (and maybe the majority of mathematicians’ minds).

This makes me sad, because it’s not as good. The proof by contradiction spends all its time making fun of the idea that there might not be infinitely many primes; the first one just goes and builds them.

There are (quite a lot of) other proofs of Euclid’s theorem, but Euclid himself probably only knew the way we’ve discussed.

That means that you can actually use the first proof to construct primes:
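As an illustration, here is a sketch of that construction in Python. The proof lets us take any prime factor of $p_1\cdots p_n+1$ as the next prime; always choosing the smallest one is an assumption of this example, as are the function names.

```python
def smallest_prime_factor(n: int) -> int:
    """Smallest prime factor of n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def euclid_primes(count: int) -> list[int]:
    """First `count` primes produced by the construction, starting from 2."""
    primes = [2]
    while len(primes) < count:
        product = 1
        for p in primes:
            product *= p
        # product + 1 leaves remainder 1 on division by each prime so far,
        # so its smallest prime factor is a prime we have not seen yet.
        primes.append(smallest_prime_factor(product + 1))
    return primes


print(euclid_primes(7))  # [2, 3, 7, 43, 13, 53, 5]
```

Notice that $2\times3\times7\times43+1 = 1807 = 13\times139$ is not itself prime: the proof only promises a new prime factor, and the primes do not come out in increasing order.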

This is genuinely a way of producing primes. Admittedly, it’s not a very intelligent one.

If you have to find primes, it’s probably better to use this method, which works well in practice:


[The Sieve of Eratosthenes] The Sieve of Eratosthenes proceeds by writing down the natural numbers from $2$ up to $N$ (for some $N$) in a convenient form. We repeat the following steps until no untouched numbers remain:

  1. Find the first untouched number and mark it as a prime.

  2. Mark all its other multiples as being composite.
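The two steps can be sketched in Python as follows. The function name and the list of booleans used to record which numbers are still untouched are choices made for this example, not part of the description above.

```python
def sieve_of_eratosthenes(N: int) -> list[int]:
    """Return all primes p with 2 <= p <= N."""
    # untouched[m] stays True until m is crossed off as a multiple
    # of a smaller prime.
    untouched = [True] * (N + 1)
    primes = []
    for m in range(2, N + 1):
        if untouched[m]:
            # Step 1: the first untouched number is prime.
            primes.append(m)
            # Step 2: mark its other multiples as composite.
            for multiple in range(2 * m, N + 1, m):
                untouched[multiple] = False
    return primes


print(sieve_of_eratosthenes(50))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

With $N=50$ this reproduces the list of small primes given earlier.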

The Sieve of Eratosthenes doesn’t prove that there are infinitely many primes: it just finds them. Unless we’d found a proof of Euclid’s theorem, we could have nightmares that one day we’ll find ourselves crossing off all the remaining natural numbers and not finding any more primes.