Let us consider some examples of machines and programs to which we may ascribe belief and goal structures.
4.1. Thermostats
Ascribing beliefs to simple thermostats is unnecessary for the study of thermostats, because their operation can be well understood without it. However, their very simplicity makes it clearer what is involved in the ascription, and we maintain (partly as a provocation to those who regard attribution of beliefs to machines as mere intellectual sloppiness) that the ascription is legitimate.
First consider a simple thermostat that turns off the heat when the temperature is a degree above the temperature set on the thermostat, turns on the heat when the temperature is a degree below the desired temperature, and leaves the heat as is when the temperature is in the two degree range around the desired temperature. The simplest belief predicate B(s,p) ascribes belief to only three sentences: ``The room is too cold'', ``The room is too hot'', and ``The room is OK''--the beliefs being assigned to states of the thermostat in the obvious way. We ascribe to it the goal, ``The room should be OK''. When the thermostat believes the room is too cold or too hot, it sends a message saying so to the furnace. A slightly more complex belief predicate could also be used in which the thermostat has a belief about what the temperature should be and another belief about what it is. It is not clear which is better, but if we wished to consider possible errors in the thermometer, then we would ascribe beliefs about what the temperature is. We do not ascribe to it any other beliefs; it has no opinion even about whether the heat is on or off or about the weather or about who won the battle of Waterloo. Moreover, it has no introspective beliefs; i.e. it doesn't believe that it believes the room is too hot.
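As a minimal sketch of what this ascription amounts to, the belief predicate B(s,p) and the goal-directed action of the simple thermostat can be written out explicitly. The state names, the one-degree margins, and the form of the furnace messages below are assumptions introduced only for the example.

    # A minimal sketch (in Python) of B(s,p) for the simple thermostat described above.
    # State names, thresholds, and message forms are illustrative assumptions.

    def state(temperature, setting):
        # The thermostat's state depends on the measured temperature
        # relative to the set point, with a one degree margin on each side.
        if temperature < setting - 1:
            return "cold"
        if temperature > setting + 1:
            return "hot"
        return "ok"

    def B(s, p):
        # B(s,p) holds when state s is ascribed belief in sentence p.
        belief = {"cold": "The room is too cold",
                  "hot": "The room is too hot",
                  "ok": "The room is OK"}
        return belief[s] == p

    def message_to_furnace(s):
        # The ascribed goal "The room should be OK" is served by sending a
        # message whenever the ascribed belief is "too cold" or "too hot".
        if B(s, "The room is too cold"):
            return "turn the heat on"
        if B(s, "The room is too hot"):
            return "turn the heat off"
        return None  # leave the heat as it is

Nothing in this sketch gives the thermostat beliefs about the weather or about its own beliefs; the three sentences exhaust its ascribed opinions.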
Let us compare the above B(s,p) with the criteria of the previous section. The belief structure is consistent (because all the beliefs are independent of one another), the beliefs arise from observation, and they result in action in accordance with the ascribed goal. There is no reasoning, and only commands (which we have not included in our discussion) are communicated. Clearly, assigning beliefs is of modest intellectual benefit in this case. However, if we consider the class of possible thermostats, then the ascribed belief structure has greater constancy than the mechanisms for actually measuring and representing the temperature.
The temperature control system in my house may be described as follows: Thermostats upstairs and downstairs tell the central system to turn on or shut off hot water flow to these areas. A central water-temperature thermostat tells the furnace to turn on or off thus keeping the central hot water reservoir at the right temperature. Recently it was too hot upstairs, and the question arose as to whether the upstairs thermostat mistakenly believed it was too cold upstairs or whether the furnace thermostat mistakenly believed the water was too cold. It turned out that neither mistake was made; the downstairs controller tried to turn off the flow of water but couldn't, because the valve was stuck. The plumber came once and found the trouble, and came again when a replacement valve was ordered. Since the services of plumbers are increasingly expensive, and microcomputers are increasingly cheap, one is led to design a temperature control system that would know a lot more about the thermal state of the house and its own state of health.
In the first place, while the present system couldn't turn off the flow of hot water upstairs, there is no reason to ascribe to it the knowledge that it couldn't, and a fortiori it had no ability to communicate this fact or to take it into account in controlling the system. A more advanced system would know whether the actions it attempted succeeded, and it would communicate failures and adapt to them. (We adapted to the failure by turning off the whole system until the whole house cooled off and then letting the two parts warm up together. The present system has the physical capability of doing this even if it hasn't the knowledge or the will.)
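A sketch of what failure detection might look like in such a more advanced controller follows. The particular readings, the diagnosis rule, and the message it produces are assumptions made for illustration, not a description of any actual system.

    # Illustrative sketch of a controller that checks whether a commanded
    # action succeeded and communicates the failure.  All details are assumed.

    def check_valve(commanded_closed, temp_before, temp_after, boiler_supplying):
        # If the valve was commanded closed but the zone kept warming while the
        # boiler was supplying hot water, ascribe the belief "the valve is stuck".
        if commanded_closed and boiler_supplying and temp_after > temp_before:
            return "valve appears stuck open; requesting service"
        return "valve behaving as commanded"

    # Example corresponding to the incident described above:
    print(check_valve(commanded_closed=True, temp_before=22.0,
                      temp_after=23.5, boiler_supplying=True))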
While the thermostat believes ``The room is too cold'', there is no need to say that it understands the concept of ``too cold''. The internal structure of ``The room is too cold'' is a part of our language, not its.
Consider a thermostat whose wires to the furnace have been cut. Shall we still say that it knows whether the room is too cold? Since fixing the thermostat might well be aided by ascribing this knowledge, we would like to do so. Our excuse is that we are entitled to distinguish--in our language--the concept of a broken temperature control system from the concept of a certain collection of parts, i.e. to make intensional characterizations of physical objects.
4.2. Self-Reproducing Intelligent Configurations in a Cellular Automaton World
A cellular automaton system assigns a finite automaton to each point of the plane with integer co-ordinates. The state of each automaton at time t+1 depends on its state at time t and the states of its neighbors at time t. An early use of cellular automata was by von Neumann, who found a 29 state automaton whose cells could be initialized into a self-reproducing configuration that was also a universal computer. The basic automaton in von Neumann's system had a ``resting'' state 0, and a point in state 0 whose four neighbors were also in that state would remain in state 0. The initial configurations considered had all but a finite number of cells in state 0, and, of course, this property would persist although the number of non-zero cells might grow indefinitely with time.
The self-reproducing system used the states of a long strip of non-zero cells as a ``tape'' containing instructions to a ``universal constructor'' configuration that would construct a copy of the configuration to be reproduced but with each cell in a passive state that would persist as long as its neighbors were also in passive states. After the construction phase, the tape would be copied to make the tape for the new machine, and then the new system would be set in motion by activating one of its cells. The new system would then move away from its mother, and the process would start over. The purpose of the design was to demonstrate that arbitrarily complex configurations could be self-reproducing--the complexity being assured by also requiring that they be universal computers.
Since von Neumann's time, simpler basic cells admitting self-reproducing universal computers have been discovered. The simplest so far is the two state Life automaton of John Conway, described in (Gosper 1976) and, in rather full detail, in (Poundstone 1984). The state of a cell at time t+1 is determined by its state at time t and the states of its eight neighbors at time t. Namely, a point whose state is 0 will change to state 1 if exactly three of its neighbors are in state 1. A point whose state is 1 will remain in state 1 if two or three of its neighbors are in state 1. In all other cases the state becomes or remains 0.
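The rule just stated is short enough to write down directly. The following sketch represents a configuration by the set of coordinates of its cells in state 1, which is merely a convenience assumed for the example.

    # One generation of the Life automaton under the rule stated above.
    # Representing a configuration as the set of live cell coordinates is an
    # assumption of convenience for this sketch.

    def life_step(live):
        # Count the live neighbors of every cell adjacent to a live cell.
        counts = {}
        for (x, y) in live:
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx or dy:
                        cell = (x + dx, y + dy)
                        counts[cell] = counts.get(cell, 0) + 1
        # A cell in state 0 with exactly three live neighbors changes to state 1;
        # a cell in state 1 with two or three live neighbors remains in state 1.
        return {cell for cell, n in counts.items()
                if n == 3 or (n == 2 and cell in live)}

    # Example: a glider configuration repeats its shape every four generations,
    # displaced one cell diagonally.
    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    for _ in range(4):
        glider = life_step(glider)
    print(sorted(glider))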
Although this was not Conway's reason for introducing them, Conway and Gosper have shown that self-reproducing universal computers could be built up as Life configurations. Poundstone (1984) gives a full description of the Life automaton including the universal computers and self-reproducing systems.
Consider a number of such self-reproducing universal computers operating in the Life plane, and suppose that they have been programmed to study the properties of their world and to communicate among themselves about it and pursue various goals co-operatively and competitively. Call these configurations Life robots. In some respects their intellectual and scientific problems will be like ours, but in one major respect they live in a simpler world than ours seems to be. Namely, the fundamental physics of their world is that of the Life automaton, and there is no obstacle to each robot knowing this physics, and being able to simulate the evolution of a Life configuration given the initial state. Moreover, if the initial state of the robot world is finite, it can have been recorded in each robot in the beginning or else recorded on a strip of cells that the robots can read. (The infinite regress of having to describe the description is avoided by providing that the description is not separately described, but can be read both as a description of the world and as a description of itself.)
Since these robots know the initial state of their world and its laws of motion, they can simulate as much of its history as they want, assuming that each can grow into unoccupied space so as to have memory to store the states of the world being simulated. This simulation is necessarily slower than real time, so they can never catch up with the present--let alone predict the future. This is obvious if the simulation is carried out straightforwardly by updating a list of currently active cells in the simulated world according to the Life rule, but it also applies to any clever mathematical method that might predict millions of steps ahead so long as it is supposed to be applicable to all Life configurations. (Some Life configurations, e.g. static ones or ones containing single gliders or cannon, can have their distant futures predicted with little computing.) Namely, if there were an algorithm for such prediction, a robot could be made that would predict its own future and then disobey the prediction. The detailed proof would be analogous to the proof of unsolvability of the halting problem for Turing machines.
Now we come to the point of this long disquisition. Suppose we wish to program a robot to be successful in the Life world in competition or co-operation with the others. Without any idea of how to give a mathematical proof, I will claim that our robot will need programs that ascribe purposes and beliefs to its fellow robots and predict how they will react to its own actions by assuming that they will act in ways that they believe will achieve their goals. Our robot might acquire these mental theories in several ways: First, we might design the universal machine so that they are present in the initial configuration of the world. Second, we might program it to acquire these ideas by induction from its experience and even transmit them to others through an ``educational system''. Third, it might derive the psychological laws from the fundamental physics of the world and its knowledge of the initial configuration. Finally, it might discover how robots are built from Life cells by doing experimental ``biology''.
Knowing the Life physics without some information about the initial configuration is insufficient to derive the psychological laws, because robots can be constructed in the Life world in an infinity of ways. This follows from the ``folk theorem'' that the Life automaton is universal in the sense that any cellular automaton can be constructed by taking sufficiently large squares of Life cells as the basic cell of the other automaton.
Men are in a more difficult intellectual position than Life robots. We don't know the fundamental physics of our world, and we can't even be sure that its fundamental physics is describable in finite terms. Even if we knew the physical laws, they seem to preclude precise knowledge of an initial state and precise calculation of its future both for quantum mechanical reasons and because the continuous functions needed to represent fields seem to involve an infinite amount of information.
This example suggests that much of human mental structure is not an accident of evolution or even of the physics of our world, but is required for successful problem solving behavior and must be designed into or evolved by any system that exhibits such behavior.
4.3. Computer Time-Sharing Systems
These complicated computer programs allocate computer time and other resources among users. They allow each user of the computer to behave as though he had a computer of his own, but they also allow users to share files of data and programs and to communicate with each other. They are often used for many years with continual small changes, and the people making the changes and correcting errors are often different from the original authors of the system. A person confronted with the task of correcting a malfunction or making a change in a time-sharing system often can conveniently use a mentalistic model of the system.
Thus suppose a user complains that the system will not run his program. Perhaps the system believes that he doesn't want to run, perhaps it persistently believes that he has just run, perhaps it believes that his quota of computer resources is exhausted, or perhaps it believes that his program requires a resource that is unavailable. Testing these hypotheses can often be done with surprisingly little understanding of the internal workings of the program.
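How such hypotheses might be tested from outside can be sketched as follows. The status queries and field names here are hypothetical, invented for the example rather than taken from any particular time-sharing system.

    # Sketch of testing mentalistic hypotheses about why the system won't run
    # a user's program.  `job_status` and `resource_available` are hypothetical
    # stand-ins for whatever status queries a real system provides.

    def diagnose(user, job_status, resource_available):
        status = job_status(user)
        if not status["run_requested"]:
            return "the system believes the user does not want to run"
        if status["recently_run"]:
            return "the system believes the user has just run"
        if status["quota_used"] >= status["quota_limit"]:
            return "the system believes the user's quota is exhausted"
        for resource in status["resources_needed"]:
            if not resource_available(resource):
                return "the system believes a needed resource (%s) is unavailable" % resource
        return "none of the ascribed beliefs explains the refusal"

Each hypothesis is checked against the system's own records, without reference to how the scheduler or the accounting routines are actually implemented.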
4.4. Programs Designed to Reason
Suppose we explicitly design a program that represents information by sentences in a certain language stored in the memory of the computer, and that decides what to do by making inferences and doing what it concludes will advance its goals. Naturally, we would hope that our previous second order definition of belief will ``approve of'' a B(s,p) that ascribed to the program believing the sentences explicitly built in. We would be somewhat embarrassed if someone were to show that our second order definition approved as well or better of an entirely different set of beliefs.
Such a program was first proposed in (McCarthy 1959), and here is how it might work:
Information about the world is stored in a wide variety of data structures. For example, a visual scene received by a TV camera may be represented by an array of numbers representing the intensities of three colors at the points of the visual field. At another level, the same scene may be represented by a list of regions, and at a further level there may be a list of physical objects and their parts together with other information about these objects obtained from non-visual sources. Moreover, information about how to solve various kinds of problems may be represented by programs in some programming language.
However, all the above representations are subordinate to a collection of sentences in a suitable first order language that includes set theory. By subordinate, we mean that there are sentences that tell what the data structures represent and what the programs do. New sentences can arise by a variety of processes: by inference from sentences already present, by computation from the data structures representing observations, and by interpreting certain inputs as communications in one or more languages.
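A toy sketch of this subordination is given below. The data structures and the form of the sentences are assumptions made for the illustration; in a real program the sentences would be formulas of the first order language rather than strings.

    # Toy sketch: subordinate data structures together with sentences saying
    # what they represent.  All names and contents are illustrative assumptions.

    scene = [[(12, 40, 200)] * 4 for _ in range(3)]        # 3x4 array of color intensities
    regions = [{"id": "r1", "pixels": [(0, 0), (0, 1)]},   # regions segmented from the scene
               {"id": "r2", "pixels": [(2, 3)]}]
    objects = [{"id": "cup1", "regions": ["r1"]}]          # hypothesized physical objects

    sentences = [
        "scene represents the color intensities of the visual field at time t0",
        "regions lists the regions segmented from scene",
        "objects lists the physical objects hypothesized from regions",
        "iscup(cup1)",
    ]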
The construction of such a program is one of the major approaches to achieving high level artificial intelligence, and, like every other approach, it faces numerous obstacles. These obstacles can be divided into two classes--epistemological and heuristic. The epistemological problem is to determine what information about the world is to be represented in the sentences and other data structures, and the heuristic problem is to decide how the information can be used effectively to solve problems. Naturally, the problems interact, but the epistemological problem is more basic and also more relevant to our present concerns. We could regard it as solved if we knew how to express the information needed for intelligent behavior so that the solution to problems logically followed from the data. The heuristic problem of actually obtaining the solutions would remain.
The information to be represented can be roughly divided into general information about the world and information about particular situations. The formalism used to represent information about the world must be epistemologically adequate, i.e. it must be capable of representing the information that is actually available to the program from its sensory apparatus or can be deduced. Thus it couldn't handle available information about a cup of hot coffee if its only way of representing information about fluids was in terms of the positions and velocities of the molecules. Even the hydrodynamicist's Eulerian distributions of density, velocity, temperature and pressure would be useless for representing the information actually obtainable from a television camera. These considerations are further discussed in (McCarthy and Hayes 1969).
Here are some of the kinds of general information that will have to be represented:
4.4.1. Narrative. Events occur in space and time. Some events are extended in time. Partial information must be expressible about what events begin or end during, before and after others. Partial information about places and their spatial relations must be expressible. Sometimes dynamic information, such as velocities, is better known than the space-time facts in terms of which it is defined.
4.4.2. Partial information about causal systems. Quantities have values and later have different values. Causal laws relate these values.
4.4.3. Some changes are results of actions by the program and other actors. Information about the effects of actions can be used to determine what goals can be achieved in given circumstances.
4.4.4. Objects and substances have locations in space. It may be that temporal and causal facts are prior to spatial facts in the formalism.
4.4.5. Some objects are actors with beliefs, purposes and intentions.
Of course, the above English description is no substitute for an axiomatized formalism, not even for philosophy, but a fortiori when computer programs must be written. The main difficulties in designing such a formalism involve deciding how to express partial information. (McCarthy and Hayes 1969) uses a notion of situation wherein the situation is never known--only facts about situations are known. Unfortunately, the formalism is not suitable for expressing what might be known when events are taking place in parallel with unknown temporal relations. It also only treats the case in which the result of an action is a definite new situation and therefore isn't suitable for describing continuous processes.
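As an illustration of the kind of formula involved, the effect of an action in that style of formalism might be written

    holds(at(x,l), result(move(x,l),s))

asserting that x is at location l in the situation that results from moving x to l in situation s; the predicate and function names here are chosen for illustration and are not quoted from (McCarthy and Hayes 1969). It is precisely the assumption that result(move(x,l),s) denotes a single definite new situation that makes the formalism awkward for events occurring in parallel and for continuous processes.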