Rice's theorem

In computability theory, Rice's theorem states that all non-trivial semantic properties of programs are undecidable. A semantic property is one about the program's behavior (for instance, "does the program terminate for all inputs?"), unlike a syntactic property (for instance, "does the program contain an if-then-else statement?"). A non-trivial property is one which is neither true for every program, nor false for every program.

The theorem generalizes the undecidability of the halting problem. It has far-reaching implications on the feasibility of static analysis of programs. It implies that it is impossible, for example, to implement a tool that checks whether any given program is correct, or even executes without error (it is possible to implement a tool that always overestimates or always underestimates e.g. the correctness of a program, so in practice one has to decide what is less of a problem).

The theorem is named after Henry Gordon Rice, who proved it in his doctoral dissertation of 1951 at Syracuse University.

Introduction

Rice's theorem puts a theoretical bound on which types of static analysis can be performed automatically. One can distinguish between the syntax of a program, and its semantics. The syntax is the detail of how the program is written, or its "intension", and the semantics is how the program behaves when run, or its "extension". Rice's theorem asserts that it is impossible to decide a property of programs which depends only on the semantics and not on the syntax, unless the property is trivial (true of all programs, or false of all programs).

By Rice's theorem, it is impossible to write a program that automatically verifies for the absence of bugs in other programs, taking a program and a specification as input, and checking whether the program satisfies the specification.

This does not imply an impossibility to prevent certain types of bugs. For example, Rice's theorem implies that in dynamically typed programming languages which are Turing-complete, it is impossible to verify the absence of type errors. On the other hand, statically typed programming languages feature a type system which statically prevents type errors. In essence, this should be understood as a feature of the syntax (taken in a broad sense) of those languages. In order to type check a program, its source code must be inspected; the operation does not depend merely on the hypothetical semantics of the program.

In terms of general software verification, this means that although one cannot algorithmically check whether any given program satisfies a given specification, one can require programs to be annotated with extra information that proves the program is correct, or to be written in a particular restricted form that makes the verification possible, and only accept programs which are verified in this way. In the case of type safety, the former corresponds to type annotations, and the latter corresponds to type inference. Taken beyond type safety, this idea leads to correctness proofs of programs through proof annotations such as in Hoare logic.

Another way of working around Rice's theorem is to search for methods which catch many bugs, without being complete. This is the theory of abstract interpretation.

Yet another direction for verification is model checking, which can only apply to finite-state programs, not to Turing-complete languages.

Formal statement

Let φ be an admissible numbering of partial computable functions. Let P be a subset of $\mathbb {N}$ . Suppose that:

P is non-trivial: P is neither empty nor $\mathbb {N}$ itself.
P is extensional: for all integers m and n, if φ_m = φ_n, then m ∈ P ⟺ n ∈ P.

Then P is undecidable.

A more concise statement can be made in terms of index sets: The only decidable index sets are ∅ and $\mathbb {N}$ .

Examples

Given a program P which takes a natural number n and returns a natural number P(n), the following questions are undecidable:

Does P terminate on a given n? (This is the halting problem.)
Does P terminate on 0?
Does P terminate on all n (i.e., is P total)?
Does P terminate and return 0 on every input?
Does P terminate and return 0 on some input?
Does P terminate and return the same value for all inputs?
Is P equivalent to a given program Q?

Proof by Kleene's recursion theorem

Assume for contradiction that $P$ is a non-trivial, extensional and computable set of natural numbers. There is a natural number $a\in P$ and a natural number $b\notin P$ . Define a function $Q$ by $Q_{e}(x)=\varphi _{b}(x)$ when $e\in P$ and $Q_{e}(x)=\varphi _{a}(x)$ when $e\notin P$ . By Kleene's recursion theorem, there exists $e$ such that $\varphi _{e}=Q_{e}$ . Then, if $e\in P$ , we have $\varphi _{e}=\varphi _{b}$ , contradicting the extensionality of $P$ since $b\notin P$ , and conversely, if $e\notin P$ , we have $\varphi _{e}=\varphi _{a}$ , which again contradicts extensionality since $a\in P$ .

Proof by reduction from the halting problem

Proof sketch

Suppose, for concreteness, that we have an algorithm for examining a program p and determining infallibly whether p is an implementation of the squaring function, which takes an integer d and returns d². The proof works just as well if we have an algorithm for deciding any other non-trivial property of program behavior (i.e. a semantic and non-trivial property), and is given in general below.

The claim is that we can convert our algorithm for identifying squaring programs into one that identifies functions that halt. We will describe an algorithm that takes inputs a and i and determines whether program a halts when given input i.

The algorithm for deciding this is conceptually simple: it constructs (the description of) a new program t taking an argument n, which (1) first executes program a on input i (both a and i being hard-coded into the definition of t), and (2) then returns the square of n. If a(i) runs forever, then t never gets to step (2), regardless of n. Then clearly, t is a function for computing squares if and only if step (1) terminates. Since we have assumed that we can infallibly identify programs for computing squares, we can determine whether t, which depends on a and i, is such a program; thus we have obtained a program that decides whether program a halts on input i. Note that our halting-decision algorithm never executes t, but only passes its description to the squaring-identification program, which by assumption always terminates; since the construction of the description of t can also be done in a way that always terminates, the halting-decision cannot fail to halt either.

 halts (a,i) {
   define t(n) {
     a(i)
     return n×n
   }
   return is_a_squaring_function(t)
 }

This method does not depend specifically on being able to recognize functions that compute squares; as long as some program can do what we are trying to recognize, we can add a call to a to obtain our t. We could have had a method for recognizing programs for computing square roots, or programs for computing the monthly payroll, or programs that halt when given the input "Abraxas"; in each case, we would be able to solve the halting problem similarly.

Formal proof

For the formal proof, algorithms are presumed to define partial functions over strings and are themselves represented by strings. The partial function computed by the algorithm represented by a string a is denoted F_a. This proof proceeds by reductio ad absurdum: we assume that there is a non-trivial property that is decided by an algorithm, and then show that it follows that we can decide the halting problem, which is not possible, and therefore a contradiction.

Let us now assume that P(a) is an algorithm that decides some non-trivial property of F_a. Without loss of generality we may assume that P(no-halt) = "no", with no-halt being the representation of an algorithm that never halts. If this is not true, then this holds for the algorithm P that computes the negation of the property P. Now, since P decides a non-trivial property, it follows that there is a string b that represents an algorithm F_b and P(b) = "yes". We can then define an algorithm H(a, i) as follows:

1. construct a string t that represents an algorithm T(j) such that

T first simulates the computation of F_a(i),
then T simulates the computation of F_b(j) and returns its result.

2. return P(t).

We can now show that H decides the halting problem:

Assume that the algorithm represented by a halts on input i. In this case F_t = F_b and, because P(b) = "yes" and the output of P(x) depends only on F_x, it follows that P(t) = "yes" and, therefore H(a, i) = "yes".
Assume that the algorithm represented by a does not halt on input i. In this case F_t = F_no-halt, i.e., the partial function that is never defined. Since P(no-halt) = "no" and the output of P(x) depends only on F_x, it follows that P(t) = "no" and, therefore H(a, i) = "no".

Since the halting problem is known to be undecidable, this is a contradiction and the assumption that there is an algorithm P(a) that decides a non-trivial property for the function represented by a must be false.

References

Hopcroft, John E.; Ullman, Jeffrey D. (1979), Introduction to Automata Theory, Languages, and Computation, Addison-Wesley, pp. 185–192
Rice, H. G. (1953), "Classes of recursively enumerable sets and their decision problems", Transactions of the American Mathematical Society, 74 (2): 358–366, doi:10.1090/s0002-9947-1953-0053041-6, JSTOR 1990888
Rogers, Hartley Jr. (1987), Theory of Recursive Functions and Effective Computability (2nd ed.), McGraw-Hill, §14.8