Thursday 25 May 2017

Ghosts of departed quantities

The problem with infinitesimals

In the early development of the calculus, both Newton and Leibniz made fruitful use of infinitesimal quantities, without really being able to give a satisfactory account of just what they were. Bishop Berkeley famously pointed out the incoherence of the notion, referring to them with the excellent barb "the ghosts of departed quantities". Over the next century or so, calculus developed apace, but it wasn't until much later that the likes of Weierstrass managed to sort out the idea of a limit properly, and laid the foundations for the well-known "\(\epsilon\text{-}\delta\)" approach to analysis that generations of mathematics undergraduates have since had to come to grips with. Once a detailed axiomatic presentation of the reals was developed, it was established to everybody's satisfaction that there really was no space for infinitesimal quantities among the real numbers, but that the \(\epsilon\text{-}\delta\) approach gave the appropriate language for understanding everything. That is, until Abraham Robinson found a way to resurrect the approach in the early 1960s, inventing nonstandard analysis, and providing a framework in which infinitesimal and infinite quantities could be consistently worked with.

What I want to do here is to give at least an idea of how it works, and how it finesses the fact that the real numbers are, indeed, uniquely determined by being a complete, ordered field. I'll try not to say anything outright untrue, though I'll gloss over a huge amount of detail.

So first, let's remember how the real numbers arise when we fill in the gaps between the rational numbers.

We say that a sequence of rational numbers \(q_n\) is a Cauchy sequence if the elements get closer together in a well-controlled manner: in particular, given any tolerance \(\epsilon \gt 0\), there is some \(N\) such that all elements of the sequence \(q_n\) for \(n>N\) are closer together than \(\epsilon\). For some such sequences, there will be a rational limit, but for others there won't be. So we decide that if a sequence looks as if it ought to converge but has no rational limit, it does converge after all, but to something which lives in the gaps between the rational numbers: an irrational number. If we also decide that two rational sequences \(q_n\) and \(Q_n\) converge to the same limit iff \(q_n-Q_n\) converges to \(0\), then we can think of \(q_n\) and \(Q_n\) as different representations of the same irrational number. This lets us fill in the gaps in a way which fits in nicely with the arithmetic of the rational numbers.
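
As a concrete illustration (a minimal sketch, not part of the construction itself), the following Python generates rational approximations to \(\sqrt{2}\) using Newton's iteration \(q_{n+1} = (q_n + 2/q_n)/2\). The choice of iteration and the name `sqrt2_sequence` are assumptions made for this example. Every term is rational and successive terms get rapidly closer together, yet no rational number can be the limit, since the limit would have to square to \(2\).

```python
from fractions import Fraction


def sqrt2_sequence(count):
    """Return the first `count` terms of q_{n+1} = (q_n + 2/q_n)/2 with q_0 = 1."""
    terms = [Fraction(1)]
    while len(terms) < count:
        q = terms[-1]
        terms.append((q + 2 / q) / 2)
    return terms


terms = sqrt2_sequence(6)
# Gaps between successive terms: these shrink rapidly.
gaps = [abs(b - a) for a, b in zip(terms, terms[1:])]
# Distance of each term's square from 2: never exactly zero for a rational.
errors = [abs(q * q - 2) for q in terms]

for q, err in zip(terms, errors):
    print(q, float(err))
```

Exact rational arithmetic is used so that the terms really are rational numbers rather than floating-point approximations to them.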

The trick is to use sequences of the numbers we already have in order to define some new ones. A good trick can be used again, and this is no exception. We will make our extended real numbers, the nonstandard reals, from sequences of real numbers.

We can't use quite the same trick, though. Part of what the completion of the rational numbers to the real numbers does is to give us a number system where trying to apply exactly the same trick again doesn't give anything new: any Cauchy sequence of real numbers already converges to a real number. So we have to adapt the idea in a new way.

Constructing the nonstandard reals

The idea is to think of a sequence that tends to \(0\) as somehow representing an infinitesimal number, and a sequence that grows without bound as somehow representing an infinite number. The real trick is in the criterion that tells us when two sequences have the same limit, and so can be thought of as representing the same quantity.

The missing ingredient is a way of assigning a measure to any set of positive integers. We call it \(m\), and require it to have the following properties:
  1. \(m(\mathbb{N})=1\).
  2. If \(K \subset \mathbb{N}\) is finite, then \(m(K)=0\).
  3. If \(K \subseteq \mathbb{N}\) then \(m(K)=0\) or \(m(K)=1\).
  4. If \(K,L \subset \mathbb{N}\) such that \(K\cap L = \emptyset\) then \(m(K \cup L)=m(K)+m(L)\).
Then we say that any statement which holds for a set of integers of measure \(1\) holds almost everywhere. Using this, we can treat any sequence of real numbers \(a_n\) as having (more exactly, defining) a limit. If \(a_n\) has limit \(\alpha\) and \(b_n\) has limit \(\beta\), then
  • \(\alpha = \beta\) if \(a_n=b_n\) almost everywhere.
  • \(\alpha \lt \beta\) if \(a_n \lt b_n\) almost everywhere.
  • \(\alpha \gt \beta\) if \(a_n \gt b_n\) almost everywhere.
It follows from the above requirements on \(m\) that for any two sequences \(a_n\) and \(b_n\), exactly one of these is the case.
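
To see why, here is a sketch of the argument, using only the four properties of \(m\). The names \(A\), \(B\) and \(C\) are introduced just for this purpose. Put \[ A=\{n : a_n \lt b_n\}, \quad B=\{n : a_n = b_n\}, \quad C=\{n : a_n \gt b_n\}. \] These three sets are pairwise disjoint and their union is \(\mathbb{N}\), so applying property 4 twice and then property 1 gives \[ m(A)+m(B)+m(C) = m(\mathbb{N}) = 1. \] By property 3 each term is \(0\) or \(1\), so exactly one of the three sets has measure \(1\).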

Arithmetic is carried out in the obvious way: in terms of the above sequences, \(\alpha+\beta\) is the limit of \(a_n+b_n\), \(\alpha \beta\) is the limit of \(a_n b_n\), and so on. This all works, in the sense that it doesn't matter which representative sequences you choose, you get the same result. We call this set of numbers the nonstandard reals, denoted \({}^*\mathbb{R}\).
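
As a sketch of why the choice of representative doesn't matter for addition (the primed sequences, the sets \(A\) and \(B\), and the notation \(K^c\) for the complement of \(K\) in \(\mathbb{N}\) are introduced here): suppose \(a_n = a'_n\) on a set \(A\) and \(b_n = b'_n\) on a set \(B\), with \(m(A)=m(B)=1\). Then \[ a_n+b_n = a'_n+b'_n \quad \text{for all } n \in A\cap B. \] By properties 1 and 4, \(m(A^c)=m(B^c)=0\). Any subset of a set of measure \(0\) also has measure \(0\), since by property 4 its measure and that of the rest of the set are non-negative and add up to \(0\). So \[ m(A^c \cup B^c) = m(A^c) + m(B^c \setminus A^c) = 0 + 0 = 0, \] and therefore \(m(A\cap B)=1\): the two sums agree almost everywhere and define the same nonstandard real. The argument for products is the same.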

With this in place, we say that an element of \({}^*\mathbb{R}\) is infinitesimal if it lies between \(-a\) and \(a\) for all positive real \(a\), is finite if it lies between \(-a\) and \(a\) for some positive real \(a\), and is infinite if it lies between \(-a\) and \(a\) for no positive real \(a\). Clearly, the sequence \(1/n\) gives an infinitesimal number, and the sequence \(n\) gives an infinite one. An infinite number arising from a sequence of integers is called an infinite integer, so in this second case we actually have an infinite integer.
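
To spell out the first claim: fix a positive real \(a\). Then \[ -a \lt \frac{1}{n} \lt a \quad \text{for every } n \gt \frac{1}{a}, \] so the inequality fails only for the finitely many \(n \le 1/a\). By property 2 that exceptional set has measure \(0\), so the inequality holds almost everywhere, and since \(a\) was arbitrary the number represented by \(1/n\) is infinitesimal. In the same way, \(n \gt a\) for all but finitely many \(n\), so the number represented by the sequence \(n\) lies between \(-a\) and \(a\) for no positive real \(a\).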

Then we get the following consequences: all look reasonable, though some take a little more proving than others.
  1. \({}^*\mathbb{R}\) is an ordered field.
  2. The normal real numbers (which we can now call standard real numbers) \(\mathbb{R}\) live inside \({}^*\mathbb{R}\). If \(a \in \mathbb{R}\), then the constant sequence \(a,a,a,\ldots\) represents \(a\) in \({}^*\mathbb{R}\).
  3. Any finite nonstandard real \(x\) is the sum of a standard real number, denoted \(\mbox{st}(x)\) plus an infinitesimal. We call \(\mbox{st}(x)\) the standard part of the nonstandard real number.
  4. If \(\epsilon\) is a non-zero infinitesimal, then \(1/\epsilon\) is infinite, and vice-versa.
  5. The product of a (non-zero) finite number and an infinitesimal is infinitesimal.
  6. A function \(f:\mathbb{R}\to\mathbb{R}\) naturally determines \({}^*f:{}^*\mathbb{R} \to {}^*\mathbb{R}\): if \(\alpha \in {}^*\mathbb{R}\) corresponds to the sequence \(a_n\), then \({}^*f(\alpha)\) corresponds to \(f(a_n)\).
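
Items 2 and 6, together with the termwise arithmetic, can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the class `Seq` and its methods are hypothetical names, and it works with representative sequences only. It makes no attempt to decide when two sequences are equal or ordered almost everywhere, because that would need the measure \(m\) itself.

```python
from fractions import Fraction


class Seq:
    """A real sequence n -> a_n (n = 1, 2, 3, ...) standing for a nonstandard real."""

    def __init__(self, term):
        self.term = term  # function from a positive integer n to a number

    @staticmethod
    def constant(a):
        # Item 2: a standard real is represented by a constant sequence.
        return Seq(lambda n: a)

    def __add__(self, other):
        return Seq(lambda n: self.term(n) + other.term(n))

    def __mul__(self, other):
        return Seq(lambda n: self.term(n) * other.term(n))

    def apply(self, f):
        # Item 6: the natural extension *f acts term by term.
        return Seq(lambda n: f(self.term(n)))

    def sample(self, ns):
        return [self.term(n) for n in ns]


epsilon = Seq(lambda n: Fraction(1, n))  # represents an infinitesimal
x = Seq.constant(Fraction(3))            # the standard real 3
minus_one = Seq.constant(Fraction(-1))


def square(t):
    return t * t


# *f(x + epsilon) - f(x) for f(t) = t^2: term by term this is 6/n + 1/n^2.
diff = (x + epsilon).apply(square) + minus_one * x.apply(square)
print(diff.sample([1, 10, 100]))
```

The sampled terms shrink towards \(0\), which is what one would expect of a sequence representing an infinitesimal.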
But in addition to all this, we have a much less obvious, but very powerful result.

The transfer principle: Any statement which does not involve the notion of standard part is equally true in \(\mathbb{R}\) and in \({}^*\mathbb{R}\).

In other words, every theorem about the standard real numbers has an analogue in the nonstandard real numbers. This is subtler than it might seem, and I'll return to it later.

So, we have a way of extending the standard real number system to include infinitesimal (smaller than any positive real) and infinite (larger than any positive real) quantities. Why would we bother with this? In other words...

What do we get for our money?

We can (and should) think of working with infinitesimals as a way of working with sequences that tend to zero, and working with infinite numbers as a way of working with sequences that diverge to infinity. The point of all this is to give a way of dealing efficiently with these sequences, so that we don't have to work with them explicitly.

Here is a sampler of how nonstandard analysis can be used to give an alternative, and perhaps more intuitive, picture of some aspects of standard real analysis.

It's useful to have a notation for when two numbers differ by an infinitesimal quantity: we write \(a \approx b\) if \(a-b\) is infinitesimal.

Continuity

The usual way of saying that \(f\) is continuous at \(a\) is to say that we can make \(f(x)\) as close as we want to \(f(a)\) by making \(x\) sufficiently close to \(a\), or, equivalently, that if \(x_n\) is any sequence tending to \(a\), then \(f(x_n)\) tends to \(f(a)\).

But then we can see that this latter is coded in the language of nonstandard analysis as saying that if \(\epsilon\) is any infinitesimal, then \({}^*f(a+\epsilon) \approx f(a)\). This gives a precise sense to the notion that changing the input to a continuous function by an infinitesimal amount changes the value by an infinitesimal amount.

Example

Consider \(f(x)=x^2\). Then \({}^*f(x+\epsilon)=x^2+2\epsilon x + \epsilon^2\), which differs from \(x^2\) by an infinitesimal amount, so \(f\) is continuous.

Differentiation

\(f\) is differentiable at \(x\) with derivative \(L\) if, whenever \(\epsilon\) is a non-zero infinitesimal, \[\frac{{}^*f(x+\epsilon)-{}^*f(x)}{\epsilon} \approx L. \]

Important note: this is not saying that the quotient is the derivative, but that it differs from it by an infinitesimal.

Example

Again, we consider \(f(x)=x^2\). Then \(({}^*f(x+\epsilon)-{}^*f(x))/\epsilon = 2x+\epsilon\), so \(f'(x)=2x\).
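
The same computation works for higher powers. As a further worked example, take \(f(x)=x^3\) at a standard real \(x\), with \(\epsilon\) a non-zero infinitesimal: \[ \frac{(x+\epsilon)^3 - x^3}{\epsilon} = 3x^2+3x\epsilon+\epsilon^2 \approx 3x^2. \] Here \(3x\epsilon\) and \(\epsilon^2\) are each the product of a finite number and an infinitesimal, so each is infinitesimal, and so is their sum. Hence \(f'(x)=3x^2\).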

Integration

We want to calculate \(\int_a^b f(x) dx\). Then we choose an infinite integer \(N\), and split up the interval \([a,b]\) into \(N\) equal strips of width \(\epsilon = (b-a)/N\), and calculate the sum \[ S = \sum_{i=1}^N \epsilon f(a+i\epsilon) \] If \(S \approx I\) for some real \(I\), then \(I=\int_a^b f(x) dx\).

Example

To calculate \(\int_0^1 x dx\) we let \(N\) be infinite, so in this case we have the associated infinitesimal \(\epsilon = 1/N\). So \(a=0\), \(b=1\) and \(f(x)=x\), so \[ \sum_{i=1}^N \epsilon f(a+i\epsilon) = \sum_{i=1}^N \epsilon \times (i \epsilon) = \epsilon^2 \sum_{i=1}^N i = \epsilon^2 \frac{1}{2}N(N+1) = \frac{1}{2}+ \frac{\epsilon}{2} \approx \frac{1}{2} \] so that \[ \int_0^1 x dx = \frac{1}{2}. \] Again, the integral isn't the sum: they differ by an infinitesimal quantity.
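
The algebra in this example can be checked for ordinary finite \(N\). The sketch below (the function name `riemann_sum` and the values of \(N\) are choices made for illustration) computes the same right-endpoint sum in exact rational arithmetic and confirms that it equals \(\frac{1}{2}+\frac{\epsilon}{2}\) with \(\epsilon = 1/N\). The nonstandard calculation is the same identity with \(N\) an infinite integer.

```python
from fractions import Fraction


def riemann_sum(f, a, b, n):
    """Right-endpoint sum with n equal strips of width (b - a)/n."""
    eps = Fraction(b - a, n)
    return sum(eps * f(a + i * eps) for i in range(1, n + 1))


for n in (10, 100, 10000):
    s = riemann_sum(lambda x: x, 0, 1, n)
    # The closed form from the calculation above, with eps = 1/n.
    assert s == Fraction(1, 2) + Fraction(1, 2 * n)
    print(n, s, float(s - Fraction(1, 2)))
```

For finite \(N\) the excess over \(\frac{1}{2}\) is small but real; for infinite \(N\) it is infinitesimal, and taking the standard part discards it.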

Et cetera

One can then use nonstandard arguments to prove the usual theorems of introductory analysis such as the intermediate value theorem, define partial derivatives and multiple integrals, solve differential equations using an analogue to a finite difference method but now with an infinitesimal step length, and so on.
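
To illustrate the finite difference idea, here is a minimal sketch; the equation \(y'=y\), the function name `euler_exp` and the step counts are all choices made for this example. Euler's method with \(N\) steps of length \(\epsilon = 1/N\) on \([0,1]\), starting from \(y(0)=1\), gives \(y_N = (1+\epsilon)^N\). For finite \(N\) this approaches \(e\) as \(N\) grows; the nonstandard version takes \(N\) to be an infinite integer and then takes the standard part.

```python
import math


def euler_exp(n):
    """Euler's method for y' = y, y(0) = 1 on [0, 1] with n equal steps."""
    eps = 1.0 / n
    y = 1.0
    for _ in range(n):
        y += eps * y  # y_{k+1} = y_k + eps * f(y_k) with f(y) = y
    return y


for n in (10, 1000, 100000):
    y = euler_exp(n)
    print(n, y, math.e - y)
```

Floating-point arithmetic is used here for brevity, so the printed differences include a little rounding error as well as the error of the method.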

The point is that these and many other operations we do in calculus and analysis can be replaced by more intuitive notions that look like ordinary algebra. There is no free lunch, of course: one has to establish properly that the nonstandard reals really do behave themselves, and that the intuitive notions of a derivative as a quotient, or an integral as a sum, really do match up to the standard notions. Which leads to the next question.

Is it worth it?

As long as we understand that what we get for our money is not access to theorems which could not be proven by standard means, but an alternative approach to proving these theorems, then it does seem to be worth it. There's a small but active community of people who use nonstandard analysis to investigate problems in pure and applied mathematics, and who have gained valuable insights from it. It is, admittedly, a matter of efficiency rather than possibility; but the process of setting up the nonstandard framework does a lot of heavy digging once and for all, so that it doesn't have to be repeated on each occasion when it is needed. This can streamline the approach to a problem enough that a previously intractable problem becomes tractable via nonstandard means. The standard interpretation in terms of sequences can then be recovered, and (if one finds it necessary) a standard proof reverse-engineered.

Devilish details

There are, of course, many details which I've glossed over here, and many devils reside in them. The subject has its subtleties, a couple of which I'll try to indicate here.

The first subtlety goes right back to the way of measuring sets of integers which was used to decide when two sequences converge to the same nonstandard number. How do we know it's possible to do this?

The obvious solution would be to simply exhibit an example explicitly. Unfortunately, there is no way of doing this. One can prove the existence of such a measure, but there is no constructive proof. (The standard argument uses Zorn's lemma.) So in practice, we can't actually work with a measure of the required type, only deduce the consequences of having one.

Another subtlety is that the real numbers are well known to be the unique complete ordered field. But the nonstandard reals include the reals, and I said above that theorems about the reals transfer to theorems about the nonstandard reals. So what's going on here?

In fact, the situation is similar to that of the integers in Peano arithmetic: the induction axiom in its usual form makes the integers unique, but when one restricts to sets that can be finitely described, nonstandard models of first-order Peano arithmetic exist. This time, we note that the transfer principle only allows us to talk about sets which do not make use of the notion of standard, and again this permits the existence of nonstandard models. The sets which we can talk about are called internal, and the others are external.

So for example, we can't ask for the least upper bound of the set of all infinitesimals, since the set of all infinitesimals requires us to use the notion of standard to define it, and so is an external set, not an internal one. Thus the nonstandard reals are not complete, in the sense that there are bounded sets without a least upper bound. However, any bounded internal set has a least upper bound.
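
To see that the set of all infinitesimals really has no least upper bound, here is a sketch of the usual argument. The set is certainly bounded above, for example by \(1\). Suppose \(s\) were a least upper bound; it must be positive, since there are positive infinitesimals. If \(s\) is infinitesimal, then so is \(2s\), and \[ 2s \gt s, \] so \(s\) is not an upper bound after all. If \(s\) is not infinitesimal, then \(s \ge a\) for some positive real \(a\), so \[ \frac{s}{2} \ge \frac{a}{2} \gt \epsilon \quad \text{for every infinitesimal } \epsilon, \] which makes \(s/2\) an upper bound smaller than \(s\). Either way we have a contradiction.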

Something I didn't mention up in the section on integration is that infinite integers are really, really big. If \(N\) is an infinite integer, then there are uncountably many nonstandard integers less than \(N\). The reason is that if \(\alpha\) is any real number in \((0,1)\) then there is a nonstandard integer \(M \lt N\) such that \(M/N \approx \alpha\); but then \((M+n)/N \approx \alpha\) for any \(n \in \mathbb{N}\), so there are infinitely many nonstandard integers less than \(N\) for every real number in \((0,1)\). It's not easy to see what is going on here.
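
One way to find such an \(M\) (a sketch, using the floor function as one convenient choice): if the positive infinite integer \(N\) is represented by the sequence of integers \(N_n\), let \(M\) be represented by \(M_n=\lfloor \alpha N_n \rfloor\). Then for every \(n\) with \(N_n \gt 0\), which is almost every \(n\), \[ 0 \le \alpha - \frac{M_n}{N_n} \lt \frac{1}{N_n}, \] so \(0 \le \alpha - M/N \lt 1/N\), and \(1/N\) is infinitesimal because \(N\) is infinite. Hence \(M/N \approx \alpha\), and \(M \lt N\) because \(\alpha \lt 1\).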

Further reading

If you want to fill in some of these details (or rather, see them filled in for you) Keisler's Foundations of Infinitesimal Calculus provides at least as much detail as you might want, and also shows some of what can be done with it. It was written to accompany the same author's Elementary Calculus: an infinitesimal approach, which presents an undergraduate course on calculus and analysis based on nonstandard analysis.

As you may have guessed from the fact that I went to the effort of writing this, I think nonstandard analysis is worth knowing about. Not everybody agrees. For a look at the criticisms and responses, you can start with the wiki page, which will also point you to other approaches to infinitesimals.

2 comments:

  1. My group has started using NSA to revisit some issues at the foundations of statistics. Here is the first result: https://arxiv.org/abs/1612.09305

  2. I wish I knew enough about statistics to appreciate that better; but I'll settle for being pleased to see nonstandard analysis being used fruitfully!
