Alethic Limits: Set Theory, Entscheidungsproblem & Turing Computability
As the great philosopher Immanuel Kant strove to demonstrate in his magnum opus Critique of Pure Reason, the application of reason comes up against its own limitations.
Below I epigrammatically unpack four ways these limitations have been instantiated in the intersecting disciplines of logic, mathematics, and computer science in intervening developments since Kant.
Unlike the Kantian antinomies — statement pairs that appear independently irrefutable yet jointly incompatible — which are, on Kant’s account, faux-undecidable (Kant resolves these ostensible contradictions within his doctrine of transcendental idealism), the limits below represent genuine limits of some operation whether it be denumerability, measure, proof, or computation. The meaning of each of these operations will be explained below.
The only way to dispute the results presented below is to deny the truth of some antecedent axiom and/or give up the law of non-contradiction: the assumption that a statement and its negation cannot be true at the same time.
Infinitely Countable & Uncountable Sets
In late 19th century, German mathematician Georg Cantor formulated two proofs that founded the discipline of set theory, changed the mathematical meaning of infinity, and whose effects continue to ripple in current developments in math and computer science.
Cantor’s proofs demonstrate that some infinite sets are countable, while others uncountable. At the very least this will mean that some infinities have unequal sizes, but more than that, we’ll find out that there’s no ceiling to how large infinities can get. This is the result inscribed in Cantor’s proof for the existence of an infinity of infinities.
The disjunction between countable and uncountable sets, as we will see, is counterintuitive. If it weren’t, the results of Cantor’s proof would be unsurprising.
But what exactly does countable mean? In order to understand Cantor’s proof, let’s start by first explaining what sets are, and gradually scaffold to comparing their cardinalities or their sizes — which is what will eventually yield the proof for infinitely many infinities.
A set is a collection of things. Its members can be concrete or abstract. We can think of a set of elephants as well as a set of numbers. There’s no presumption of order or purity. Members of a set do not have to, unless specified otherwise, be in any specific order or contain exclusively concrete or abstract members.
Sets can contain other sets.
They can also be empty.
We call an empty set the null set.
Their most obvious property, from what we’ve just said, is that sets, at least, have sizes. The technical term for sizes is cardinality. A set of five elephants and the number five has a cardinality of six. And yet some sets are infinite. Consider the set of all natural numbers, that is, the numbers that start either with zero or 1 and go on forever {0,1,2….∞}. Can we say anything meaningful about their cardinality?
What about the set of all integers ℤ{∞…-1,0,1…∞} or the set of all rational numbers ℚ{∞…3/4, 1/2,1/7,1/9,… ∞}?
Turns out you can, and the results are surprising!
Intuitively, we might think that the set of all natural numbers is smaller than the set of all integers, and the set of all integers is smaller than the set of all rational numbers since the integers contain zero, and all the positive and negative numbers, while the rationals contain all the integers as well as all infinitely repeating fractions that cannot be expressed as integers. And yet, there’s a mathematical proof that demonstrates that each of these infinities have the same cardinality, that is, are equal in size!
To construct this proof we need the bijection principle: a function that yields a one-to-one correspondence between the members of two sets such that no unpaired members are left over. Formally, there’s a bijection from a set A to a set B if and only if each element of A is paired with each element of B, and no element of B is left without a paring of an element of A.
We then ask: Is there a bijective function from ℚ to ℕ, and vice versa, or ℤ to ℕ and vice versa? Turns out there is:
𝑓 : ℤ → ℕ
𝑓(n) = { 2n − 1 if n >0 | −2n if n ≤ 0
The principle of bijection is the conjunction of two composing principles: injection and surjection.
An injective function is a function where each element of its codomain map at most one element of its domain. It is therefore possible that elements of its codomain are left unmatched. A surjection, by contrast, is a function that maps every element of its codomain to at least one element of its domain. This implies that elements in its domain need not be unique unlike elements in its codomain. A bijection, therefore, is a special case of injection and surjection in that it is both injective and surjective. If we can demonstrate a bijection we — ipso facto — demonstrate an equality between two cardinalities!
We then define countability as an injection of any set A to ℕ. Since an injection applies to all cardinalities smaller or equal to ℕ, countability then is anything smaller or equal to the natural numbers. ∴ All sets smaller or equal to ℕ are countable.
We define countability exactly as the bijection of any set S with the natural numbers ℕ.
So far so good. We then ask: Is there a bijective function from ℕ to ℝ? And it turns out there isn’t, only an injective one from ℕ to ℝ. This implies that ℝ > ℕ.
Here’s a proof (by no means the only):
Let’s imagine that we can list all of the real numbers between 0 and 1, which would be the cardinality of [0,1]. We can do this as decimal expansions of an infinite series:
Each number in the vertical list represents a unique and infinite decimal expansion of the real numbers. The list itself is infinite. Have we exhausted all the numbers [0,1]? Suppose we take the diagonal expansion in this infinite series and add 2 to each number in the list (mod 10 means that every 8 is converted to zero). Is this number contained in our initial list?
It is not. As such, the real numbers are not listable. We have proven that there’s no way to list the real numbers.
The diagonal method demonstrates that however way we try to list all the real numbers, there’s a real number not contained in the list. This shows that the real numbers are not listable.
A corollary of this proof is that the set of all sets of any infinite set is also not countable! This is because, every set is a proper subset of its power set.
For this to be true we have to accept the power set axiom, which asserts the existence of a power set for any set A. A power set is exactly the set of all subsets of A, which means that it contains A itself, and the null or empty set.[Does the power set also contain itself: this is a gambit for Russell’s paradox, which we won’t go into here].
Power set Axiom:
Given any set A, the set of all subsets of A is called the power set of A, denoted: 𝒫 (A).
The power set of a countably infinite set is uncountable!
Why is the power set of ℕ uncountable? That’s because there’s no bijection between ℕ and 𝒫 (ℕ). How do we know that? We can use the diagonalization above to construct a proof for this statement. We arrange the natural numbers horizontally and all the subsets of natural numbers vertically, with the values being 1 for a natural number contained in the subset and 0 for not. We then take the diagonal in the matrix as a subset that cannot possibly be in the bijection of these cardinalities, proving that the power set of the natural numbers has a larger cardinality than the set of natural numbers.
The inequality between the natural ℕ and real numbers ℝ is therefore a special case of a more general principle known as Cantor’s Theorem:
Cantor’s Theorem asserts that there is no greatest cardinality because for every infinite set the power set of that set has a greater cardinality! In fact, there’s an infinity of infinities, since there’s an infinite regress of power sets: 𝒫 (A), 𝒫(𝒫 (A)), 𝒫(𝒫(𝒫 (A)))…∞
Measurable & Immeasurable Sets
This one is another strange result. To understand it, we need the notion of the length of an interval, which is simple enough.
It turns out that certain sets are not measurable. First, just what do we mean by a measure? We’re all familiar with the notion of distance or the length of an interval.
For example the length of the set ([x,y]) = y−x. The interval [0,1], for example has the measure of 1–0 =1 and so on and so forth.
To demonstrate non-measurability, we need a special type of set called a Borel set and a special type of measure called a Lebesgue measure. As we’ll see, the Lebesgue measure is the only type of measure that applies to a Borel set. What is a Borel set?
A Borel set is a set generated by performing finitely many operations of countable additivity and complementation on certain length segments. Let’s define these operations:
Countable Additivity: an operation that adds finite or countably infinite sets together. Another way of phrasing it is that the operation takes as input finite or countably infinite sets and yields as output their union: {S₁, S₂, S₃…} → U{S₁, S₂, S₃…}
Complementation: an operation that takes any set S to its complement (the set of all non-S). Usually for any set S, this amounts to R−S, where R is the set real numbers. For a line segment [0,1], complementation yields R−[0,1].
In terms of probability, countable additivity gives us the union of disjoint events over finite or infinite sample spaces. Meanwhile, complementation provides us with what is left over in the sample space.
For a closed set [0,1] or an open set [0,1) (meaning we don’t count 1) or (0,1] (we don’t count zero), Borel sets include all the line segments of each of these sets, including all their unions and complements.
A Borel set satisfies all of the following conditions: it contains a) all of its line-segments, b) every complement of all of its sets, and c) the union of all countable families of sets.
For example, the set of irrational numbers can be generated by R−Q, where R stands for the real numbers and Q for the rationals. Since the rationals are countable, we can generate a countable union of all its subsets. We can also generate its complement, which is the set of irrationals. Any countably infinite or finite set and its complement are, therefore, Borel sets.
The Lebesgue measure is the only generalization of measure across all Borel sets. The Lebesgue measure must satisfy three conditions: a) the union must be between non-intersecting or disjoint sets, b) the output of the measure cannot be negative, and c) the length of segments is measured through 𝛌 ([x,y]) = y-x for all x,y ∋ R (where lambda stands for a function that outputs the Lebesgue measure).
It is important to keep in mind that the Lebesgue measure amounts to the sum of all disjoint sets.
The problem arises with the application of the Lebesgue measure beyond Borel sets. Some such sets are called Vitali sets, after the Italian mathematician Giuseppe Vitali who conceived them.
To generate Vitali sets, we separate an interval, say [0,1], into equivalence classes. All this means is that we bin members by some property such that bins are mutually exclusive and exhaustive. We’re going to bin the sets by whether their measure is a rational or irrational number.
In other words, for each real number in [0,1], there’s exactly one number that’s a member of A (Vitali set), such that r-a = a rational number, q. In this group, each set has the form of Q + r for some r in the real numbers, forming a quotient group of Q|R that intersects with [0,1]. This subset of [0,1] turns out to be uncountably infinite.
In order to assume the existence of the above subset, we need to assume the controversial axiom of choice:
Axiom of Choice: Every set of non-empty, non-overlapping sets has as choice set.
What makes the axiom of choice puzzling is that for infinitely countable sets of sets, there are sets for which we cannot satisfy a criterion for uniformly selecting a choice set from the initial infinite set. In other words, the axiom of choice asserts the existence of the choice set in lieu of such a selection criterion. While we’re epistemically barred from constructing the set in a principled way, some find it reasonable to assume that this choice set nonetheless exists.
With the axiom of choice we then generate a set A, which is a subset of [0,1], that contains a member from each equivalence class. This set turns out to have no measure. And here is why:
If the measure of each set is 0, then by countable/sigma additivity the measure of A is also 0.
If the measure of each is a positive number, then by countable/sigma additivity the measure of A is infinite.
These results further assume the uniform distribution of Lebesgue measures, known as translational invariance. That is to say, the sets in each equivalence class differ in position, but not in measure— they must all possess the same measure.
Since our interval is the countably infinite interval between 0 and 1, our measure should’ve been 1. But Vitali sets prove that there can be no countably infinite subset of a measure 1 that has a measure, because if it had a measure, that measure would have to be uniform. Without giving up countable additivity and complementation then, Vitali sets cannot have a measure.
Therefore, the axiom of choice is a necessary condition for the existence of non-measurable sets.
Decidable & Undecidable Statements
We saw with Cantor’s theorem that we can obtain set-theoretical statements that are surprising yet provable, that is, demonstrated to be axiomatically true!
Now the question arises: can all the truths of an axiomatic system be proven? If all the truths can be proven, then all the false statements can be refuted. We call this property decidability.
Decidability: a sentence or statement is decidable if and only if it or its negation is either provable or refutable (exclusive or).
By corollary, we can define its inverse, undecidability, as a sentence or statement that’s neither provable nor refutable.
If you think about it, undecidability is a rather strange property because it predicates of sentences (in an axiomatic system) that they cannot be determined to be true or false.
Now the question is, are there formally undecidable sentences, that is, is the property of undecidability instantiated?
David Hilbert called this Entscheidungsproblem or the decision problem. His hope was that all sentences of a sufficiently powerful axiomatic system could be sorted into provable and refutable, thereby setting arithmetic on an apodictic (logically certain) foundation.
Another way of formulating Hilbert’s question: is there an algorithm that can decide whether a set of axioms are universally consistent, that is provable for every mathematical structure satisfying the axioms (Rayo, 2020)?.
Again, this means that every statement derived from the axioms ought to be provable. Yet this is precisely the outcome that is ruled out by Gödel’s Incompleteness Theorems.
In order to explicate Gödel’s proof we need a system of signs from which we can express formulas and proofs. A set of formulas will be taken as primitives, called the axioms, and all subsequent formulas will be deductively derived from the axioms according to a set of inference rules. Formulas will be used to construct proofs, which consist of a set of finite formulas.
Gödel’s ingenuity involved the innovation of a function called Gödel numbering, a mapping function that assigns a unique natural number to every sign, formula and proof in the formal system. Mapping essentially means that abstract relations between one domain of objects, say geometry, are mirrored or preserved in another domain of objects, say algebra. The mapping function is crucial to Gödel’s proof, because as we will see, he uses it to express meta-mathematical statements through arithmetical statements. In other words, the system of signs and axioms with which we begin, one that is capable of expressing basic arithmetic, will be used to express meta-mathematical statements, including a very particular statement that will unravel the very notion of formal completeness. Meta-mathematical statements are statements about the system itself, such as invoking the notion of proof.
The formal system itself can be erected on different sets of primitives. But all systems will consist of three types of signs: sings that represent constants, such as the the logical operators and, not, existential and universal quantification, parentheses etc, and signs that represent variables, which come in three types: numerical, predicates, and sentential (that is, variables that stand for logical statements themselves).
Each sign will be assigned some unique Gödel number according to some rule. For example, the constants will each be assigned a positive integer, the numerical variables a prime greater than ten, the sentential variables the square of those primes associated with the numerical variables, and the predicate variables the cube of those primes.
Because each sign has a unique Gödel number, each formula of the system will likewise have a unique Gödel number. How we achieve this is a matter of convention. For example, for the ten character formula below we can assign as a unique Gödel number the first ten primes raised to the power of the Gödel number of each corresponding sign, as seen below:
(∃x) (x=sy) (Gödel number ➔ 2⁸ · 3⁴ · 5¹¹ · 7⁹ · 11⁸ · 13¹ · 17⁵ · 19⁷ · 23¹³ · 29⁹)
We can use this rule furthermore to assign a unique Gödel number to every formula. We can also use this technique to assign Gödel numbers to systems of formulas such as proofs.
We noted that the mapping will be used to formally smuggle meta-mathematical statements within the formal system. This can be illustrated by Richard’s paradox, where a number of arithmetical definitions are ordered and assigned unique integers. Some statements like ‘not divisible by any integer other than 1 and itself’, that is, the property of being a prime, may turn out to be the property of the corresponding integer, say 19. Some properties will not be mirrored by its unique integer, say if the property of being a prime corresponded with the number 8. When this is the case, we call that property Richardian. Now the property of being Richardian will have its own unique integer n, so the question will arise whether n is itself Richardian or not, leading to a paradox: n is Richardian if and only if n is not Richardian.
This manner of constructing the paradox is, however, improper because the property of being Richardian is not an arithmetical statement but requires reference to the system of signs in which the arithmetical statements are expressed, in this case Arabic numerals and English. This level-scrambling between purely arithmetical statements and statements about those arithmetical statements is what Gödel achieved without committing the fallacy embodied in Richard’s paradox.
Gödel circumvents the informality of Richard’s paradox through the mapping function we called Gödel numbering that assigns a unique integer to every expression in the calculus. A property of Gödel numbering is that it retroactively allows us to verify it as such and furthermore reconstruct the expression or system of expressions that it uniquely names. A Gödel number, therefore, is a scrutable address of the larger expression/s and its constituting primitives.
Perhaps, then, you can see where this is going. Gödel numbering will enable the arithmetization of metamathematics without violating the calculus. Specifically, it will allow us to arithmetize self-reference. It will do this by incorporating corresponding Gödel numbers and their arithmetical relations as expressions about the expressions.
Let’s look at the example below:
(∃x)(x=sy) has Gödel number m — m just being a way of substituting a symbol for the real Gödel number for simplicity’s sake.
Then, what if the Gödel number that designates the above formula is substituted for y within the very formula:
(∃x)(x=sm)
Which states: there is a number x, such that the immediate successor of x is m.
Now while this substitution designates its own unique Gödel number, meta-mathematically we can express this Gödel number as: the Gödel number obtained from the formula with Gödel number m by substituting for the variable with Gödel number 13 (i.e. the designation of y) the numeral for m.
Even though this is not expressed directly in the statement, the substitution has expressed something metamathematical by drawing a distinction between a numeral and a number.
Now that we have, more or less, seen how Gödel realized metamathematical statements through Gödel numbering, let’s see how he constructed undecidable statements, namely statements that are neither provable nor refutable.
(x) ~Dem(x, z) reads as: ‘for every x, the sequence of formulas with Gödel number x is not a proof for the formula with Gödel number z’
This formula expresses the non-demonstrability of z.
Next we replace sub(y, 13, y) for z: (x) ~Dem(x, sub(y, 13, y)), where sub(y, 13, y) designates the Gödel number of the formula obtained from the formula with Gödel number y, by substituting the variable with Gödel number 13 for the numeral for y.
As such, we now metamathematically assert the non-demonstrability of the formula with Gödel number sub(y, 13, y). This metamathematical formula has a Gödel number, that we shall designate as θ.
Now if we substitute θ for 13, we obtain a formula that we will call G:
G: (x) ~Dem(x, sub(θ, 13, θ))
which asserts its own non-demonstrability. The very formula is designated by the Gödel number sub(θ, 13, θ), which the formula states is non-demonstrable.
It follows from this that G is demonstrable, if and only if, G is not demonstrable, leading to a reductio ad absurdum.
Deriving a formula and its negation implies the inconsistency of a formal system. Vice versa, if consistent, then G proves to be undecidable: neither provable nor refutable.
While G is undecidable, it can be established as true outside of the system: namely that it expresses a numerical property through Gödel numbering true of all integers. Truth and undecidability constitute formal incompleteness.
Computability & Uncomputability
We saw that the decision problem is inherently unsolvable because Godel’s Incompleteness Theorems demonstrate that formal systems cannot be both complete (all expressions are provable) and consistent (not both statements and their negations are provable) given a certain threshold of expressibility.
That sufficiency threshold is an axiomatic system capable of expressing arithmetic. Namely, not every true statement of arithmetic will be provable within such a system, or, in other words, derivable from the axioms alone.
Godel’s incompleteness theorems have consequences for computation. But what is computation?
Turns out this is a complicated question and people disagree about the precise answer (for wider applications of the notion see natural computation and pancomputationalism). But there’s wide consensus on the narrow definition: a series of well-defined procedures, such as arithmetical and logical operations, that can be implemented mechanistically.
You see where this is going: some functions will turn out to be non-computable.
Specifically, they are Turing Machine non-computable. To understand this result, first let’s define a Turing machine (we’ll see that a Turing Machine can in principle implement any algorithm — except, of course, non-computable ones).
A Turing Machine is a simple type of machine that computes a function from the natural numbers to the natural numbers. In other words, 𝑓 computes as output 𝑓(𝐧) for every 𝐧 input, where 𝐧 is a member of the natural numbers (𝐧 ∈ ℕ).
A Turing Machine starts at zero and takes an input any sequence of 𝐧-s for a 𝐧>0, and halts only if it yields as output 𝑓(𝐧), that is a sequence of 𝑓(𝐧)-s.
To make this clearer it helps to visualize the hardware as a tape divided into cells, and in principle, the tape can be infinite. Furthermore, the tape has a reader that can do four distinct things: read current symbol on cell, write a new symbol, move to the left or to the right.
These components comprise a machine that can implement a procedure of four command lines on every input starting from 0 (consider this the software). Those command lines are its current state, current symbol, direction left or right, new symbol, and new state. It performs the following procedure: if in current state and current symbol and reads current symbol, replace it with new symbol and move in a direction to go to new state. Repeat this procedure indefinitely.
To simplify this, think of the Turing Machine as comprising of a finite set of states Q, a finite set of symbols S, an initial state s, where (s ∈ Q), and a transition function δ (represented by delta). The Turing Machine then computes the following transition function starting from its current state:
δ : (Q × S) → (S ×{L,R} ×Q)
where {L,R} are the possible states of its direction. This states that, for every state and symbol as input, it computes a new symbol, direction, and state as output.
In other words, a Turing Machine is an abstract machine that can simulate any algorithm or program that a physically and logically possible computer can. Which is to say that it can carry out functions that may be not be physically possible (because the matter and energy in the universe may be finite), but are logically possible assuming infinite computational power. The purpose of the Turing Machine is to delimit an abstract formalism that implements computation universally, and therefore captures all formally conceivable computations.
This is also known as the Fundamental Theorem of Turing Machines, or equivalently, the Church-Turing Thesis, which says that a function is Turing-computable if there’s a finite way of specifying the output values or computing it algorithmically. Are there limits to Turing-computability?
One way to demonstrate the computational limits of Universal Turing Machines is through the halting problem.
Let’s say we have a program represented by H(a,b), where a represents our program listing and b our input. If with input b our program halts, then H prints yes. If with input b, our program doesn’t halt, H prints no.
We then run H(a,b) as a subroutine of a program P that stipulates the following:
Now if P(x) is given to itself as input: P(p), then we get the following contradiction:
P halts if and only if it loops forever, and loops forever if and only if it halts.
The halting problem is the equivalent of the implementation of a reductio ad absurdum or a contradiction in a physical system. Namely, the logically impossible cannot be implemented by a Universal Turing Machine. It is a limit broader than any physical limit, it is a formal limit. This fundamentally undercuts Hilbert’s program that all functions are computable, or equivalently, that all decision problems are computationally decidable.
While Gödel incompleteness and uncomputability share similarities, it is worth distinguishing that the former concerns decidability of statements within an axiomatic system, whereas the latter concerns the computability of functions in the absolute sense.