This is to be expected; it's analogous to the fact that the rationals contain the integers and the integers contain the naturals, in the sense that whatever arithmetic we define on the containing set works in the same way on the contained set as the arithmetic defined only on the contained set does.
We're nearly finished defining the reals, but we still should define an ordering on them, and show a few basic properties. Once we've done this, we'll be able to give them a fully abstract characterisation.
An ordering is easy, now that we have arithmetic. We say, for real numbers X and Y, that X < Y ("X is less than or equal to Y"😉 if and only if the Cauchion class defined by Y - X contains a sequence {y(n)} whose every term is at least 0. For the purposes of intuition, this ordering means exactly the same thing as the one you are familiar with does, since it is inherited directly from the ordering of the rationals.
Note that for any two different reals X and Y, any choice of rational sequence, say {x(n)} in the Cauchion class defined by X is not mutually Cauchious to any sequence (say) {y(n)} in the Cauchion class defined by Y, by the definition of Cauchion classes. Thus there exists an integer m such that d(x(n), y(n)) > 1/m for all n.
Now, suppose X < Y (and they are still not equal). Then consider the rational sequence defined by z(n) = x(n) + 1/(2m). Since {x(n)} has the Cauchion property, it's clear that {z(n)} does too, so it lies in some Cauchion class. But d(z(n), x(n)) > 1/(2m) and thus d(y(n), z(n)) + d(x(n), z(n)) > d(y(n), z(n)) + 1/(2m). On the other hand, d(y(n), z(n)) > |d(x(n), z(n)) - d(y(n), x(n))| by rearranging part 4, so d(y(n), z(n)) > 1/(2m).
Thus {z(n)} is in neither the Cauchion class of X nor Y, but it is Cauchious, so it is in some other Cauchion class Z, and X < Z < Y. This means that, for any two different real numbers, there is another real number between them -- this is exactly the property we saw of the rationals. This makes us wonder what the difference is between the reals and the rationals, in terms of the "size" of the two sets. We saw that there are reals which are not rationals, but we wonder how many of these "irrational" numbers there are. This question is answered, in a powerful way, by the following theorem:
Theorem:
There does not exist a bijection from Q to R.
Proof:
This is a classical theorem, and the classical proof I've seen, by Georg Cantor, uses a very sexy piece of reasoning called a "diagonalisation argument". Unfortunately, the classical diagonalisation argument relies on doing something equivalent to representing reals as decimals, and we don't have that machinery yet. I'm going to try and prove it from first principles, ie from what we've done thus far, using a method which is spiritually similar to the classical diagonalisation argument. There are diagonalisation arguments in other contexts, but Cantor's depends on these decimal representations which we don't know about, but sequences of rationals are intuitively similar to lists of digits, so perhaps the principle can be transported...
First, recall that we proved that Q is countable, ie that there is a bijection from N to Q. Thus we'll have proved the present theorem if we show that no bijection exists between N and R.
Suppose that there is such a bijection. This means that for each natural number n, we can associate it uniquely with a real number in some way. In other words, we can write a numbered list of reals, one element at a time. Suppose our list is X[1], X[2], X[3], X[4], .... Since this list is determined by a bijection, every
real number eventually appears on the list.
We'll build two sequences of reals (ie subsets of our list), called V[k] and W[k], as follows. First, choose your favourite two distinct natural numbers, h and j*****. Let V[0] = X[h] and W[0] = X[j]. Since X[h] and X[j] are different, we may assume that V[0] < W[0] (or vice versa, but it doesn't matter; the point is that one is strictly larger than the other). Then we define the two sequences recursively: if we know V[n] and W[n] for some n, we define V[n+1] to be X[k], where k is the smallest natural number such that X[k] is between V[n] and W[n]. We then define W[n+1] to be X[l], where l is the smallest integer such that X[l] is between V[n+1] and
W[n]. Some reflection shows that:
1. These two sequences contain only elements of our list of reals (which we assume contains all the reals)
2. For all n, V[n] < W[n] (and this inequality is strict; they are not equal)
3. For all n, V[n+1] > V[n] (and they are not equal), and W[n+1] < W[n] (and not equal). We say that V[n] increases strictly, and W[n] decreases strictly.
4. As n grows larger, W(n) - V(n) gets arbitrarily close to 0.
Now suppose that {v(k,n)} is some sequence in the Cauchion class V[n] and {w(k,n)} is a sequence in the Cauchion class W[n]. The double subscripts indicate that, for a fixed n, we've chosen a rational sequence, by letting k vary. Then for all n and k, v(k,n) < w(k,n), so we can find a rational number z(k, n) between them.
Now consider the sequence {z(n,n)}. By 4, this has the Cauchion property, so it defines a Cauchion class associated with a real number we'll call Z. However, it is not mutually Cauchious to {w(k, n)} or {v(k, n)} for any n. Thus it is not the same Cauchion class as V[n] or W[n], for any n. However, by our assumption that all the reals are in the list, there is some j such that X[j] = Z. But then, since Z lies between V[n] and W[n] for all n, the definitions of those sequences imply that X[j] will be added to one of them eventually.
This is a contradiction; thus our assumption that such a list of reals can be constructed is indefensible.
Thus there is no bijection from N to R, and so none from Q to R******.QED.
To make that formally solid would require slightly more technical machinery than I can introduce at this point, but all that's really important to take away from it is that the reals are uncountable and the rationals are countable*******.
To sum up what we've done thus far: we've constructed a field, R, containing the rationals, which is ordered and uncountable. We need to establish two more properties of the reals before we can abstract the necessary ideas to define them without recourse to everything we've built before.
First, we define the "modulus" function, which maps the reals to the set of reals which are either 0 or greater than 0. We denote it |X| for some real number X, and it's value is just X if X is positive or 0, and -X if X is less than 0.
With this in mind, we can formally define a so-called "metric space", which notion we'll use to specify some things about the reals:
Any set S, together with a "distance" function d😕xS --> R+ (R+ is the set of reals which are at least 0), is called a "metric space" if the following hold:
1. For all x,y in S, d(x,y) = d(y,x)
2. d(x,y) = 0 if and only if x = y
3. For all x,y,z in S, d(x,z) < d(x,y) + d(y,z) (less than or equal to)
This looks quite familiar; we constructed such a thing out of the rationals. In fact, the rationals with the distance function we defined on them were a metric space; we just couldn't call them one because the notion of a metric space, you'll note, depends on the real numbers, which we did not yet have.
Accordingly, we can define the reals themselves to be a metric space by stipulating that d(x,y) = |x-y| for any reals x and y. If you'll grant me that, since the rationals sit inside the reals, the modulus function we've defined on the reals works on rationals in the same way the modulus function we defined only on rationals does, then you'll see that, in some sense, the metric space formed in this way from the reals is an extension of the "metric space" we built from the rationals earlier (which we used in DEFINING the reals). How does this extension work?
In the general setting of any metric space, we can define "Cauchy sequences" and "convergent sequences" (the analogies of convergent sequences and sequences "with the Cauchion property"😉 as follows:
In a metric space S, a sequence {x(n)} is a function from N to S, ie an ordered list of some of the elements of the metric space (generally, an infinite list). Suppose you name a positive real number e (generally, a very small one), and I can name an integer N, depending on e, such that for all integers m and n larger than N, d(x(n), x(m)) < e. If this can be done no matter how small you choose e to be, we say the sequence is "Cauchy".
If there is some element x of the metric space such that, given any real e, we can find and integer N so that for any n > N, d(x(n), x) < e, then we say the sequence "converges to x".
Intuitively, these two definitions seem closely related. If we have a sequence which converges, then the definition tells us that its elements get continually closer to some specified value, and therefore to one
another: any convergent sequence must be Cauchy. The only imaginable circumstance in which the converse does
not hold is when our sequence is Cauchy, but there is no element of our metric space to be sandwiched in our sequence. This, we saw, is the case with our metric space of rationals: there are Cauchy sequences which do not converge to anything in the context of the rationals, because they "converge" to something silly like the square root of 2. However, if we step out into our metric space of real numbers, which include silly things like the square root of 2, we find that there is a Cauchy sequence of reals which converges to the square root
of 2. In fact, our construction is such that whatever Cauchy sequence of rationals we take, it will converge to some real number. Even better, if we work in the broadest context we have at the moment, and take any
Cauchy sequence of reals, we can find a real number to which it converges. A metric space in which every Cauchy sequence converges to something in the metric space is called "complete", and while the rationals are not a complete metric space, the reals, with the function d we've defined in a fairly intuitive way, are a complete metric space. In fact, we essentially built the reals by constructing the simplest complete metric space which contains the rationals, using a distance function which behaves like the one we defined on the rationals********.
The final property of the reals is simpler. Given a real number X, we can find an integer N such that |X| > 1/N, since 1/N gets as small as it likes when we increase N. Thus N|X| > 1. This property seems to be a bit of a trivial non sequitur, but it's important enough to have a name: any field with an ordering in which we can do such a thing is called "Archimedean".
At this point, we can forget about everything we've done thus far and say this:
Definition
Any set R, with two functions, + and *, from RxR --> R, such that R with these operations is a field, and an ordering of R which is Archimedean, and a distance function d obeying the metric space rules under which R is a complete metric space, is in every way formally equivalent to the real numbers we have constructed, and is equivalent to the real numbers as constructed in numerous other ways.
That's a total characterisation of the reals: if we're talking about complete ordered Archimedean fields, we may as well just talk about the real numbers. (If you want some technical messing about on this subject, there's a very old (late 2003) thread by Acolyte and I in Posers and Puzzles on the subject.)
If we want to talk about any of the numbers we've spent so much effort to construct, all we have to do is look inside the structure defined above, to Wit (tgenstein): "My propositions are elucidatory in this way: he who understands me finally recognizes them as senseless, when he has climbed out through them, on them, over them. (He must so to speak throw away the ladder, after he has climbed up on it.)"
We could go several places from here (we could, quite easily, construct the complex numbers from the reals, which would be very algebraic and have the flavour of what happened when we tried to solve x*x = 2 in the
rationals, with a slightly more detailed use of the idea of a ring; we could go in another direction, and try to build calculus sensibly with our metric space ideas; we could go all the way back to the recursion we used to build the natural numbers and expand on those ideas about recursion and their application in logic -- we've even built enough examples of axiom-systems here to motivate some talk about Goedel and Turing and decidability and computability). I'd like to expand on any part of this exposition which was unclear, or which caught your fancy and which you'd like to see taken for a tangential ride.
Notes:
*Unfortunately this is not the first time I have made this appalling joke, which you will understand soon.
**This stands for "quod erat demonstrandum", and is traditionally written at the end of proofs. I think it's very badass. However, nobody does it anymore, opting instead to use some pedestrian little square or a "//" or the phrase "we're finished" or something. I say "meh" to that!
***You may have heard of the famous "Fermat's Last Theorem". I have nothing near a good understanding of the proof or all of the technical machinery used in the proof, but it's basically a result in algebraic number theory, while we're engaging in the practice of slotting things into intellectual pigeonholes. Part of the point of this thread is to blur some of the distinctions between intellectual pigeonholes by giving a connecting thread between many of them.
****It seems we're going to build the reals in a mildly nonstandard way. The two formal constructions of the reals I've seen before are the Dedekind cut one I alluded to before, and some Russian-accented dude saying "the reals are the closure of the rationals", which fundamentally depends on knowing what the reals are. My presentation is going to be almost exactly the same as his, except without this circularity, which I didn't even notice at the time.
*****Shut up, Thales2, the natural numbers are well-ordered, so we don't need the Axiom of Choice to do this, not that it would be any trouble at all if we did.
******Thales2 should point out where the Axioim of Choice was implicitly used in this proof.
*******I.e. what I've written here contains the essential ideas of the proof, and the actual proof is just some manipulative w@nk1ng. This result will actually follow very naturally when I tell you about metric spaces, though.
********This is a special case of a more general phenomenon. If we have any metric space which is not complete, we can do something very much the same as we did here, ie we can make it sit inside a complete metric space in which the distance function of the larger complete space, when restricted to the elements of the smaller, incomplete space, is consistent with the original distance function. We call the larger space the "(topological) completion" of the smaller space. Try Googling the phrase "every metric space is isometric to a complete metric space" if you're interested. The proof of this fact is essentially just an abstract version of the construction we've done, although it doesn't depend on us being able to do any arithmetic, since many metric spaces do not have any arithmetic defined on them, and it makes no explicit mention of what distance function is being used (there are some quite weird ones; there are metric spaces whose elements are themselves functions, or arrays of dots, or any number of weird things, and the powerful ideas of metric topology give us a way to treat them equally in some sense).