% This preamble comes from /u/jmc/let/latex-95.tex which is copied into
% latex sources by the macro latex-include-95 which is itself called
% by the macro mklatex-95 executed in /u/jmc/let/files
\documentclass[12pt]{article}
% BEGIN stuff for times and dates
\newcount\hours
\newcount\minutes
\newcount\temp
\newtoks\ampm
% set the real time of day
\def\setdaytime{%
\temp\time
\divide\temp 60
\hours\temp
\multiply\temp -60
\minutes\time
\advance\minutes \temp
\ifnum\hours =12 \ampm={p.m.} \else
\ifnum\hours >12 \advance\hours -12 \ampm={p.m.} \else \ampm={a.m.} \fi
\fi
}
\setdaytime
\def\theMonth{\relax
\ifcase\month\or
Jan\or Feb\or Mar\or Apr\or May\or Jun\or
Jul\or Aug\or Sep\or Oct\or Nov\or Dec\fi}
\def\theTime{\the\hours:\ifnum \minutes < 10 0\fi \the\minutes\ \the\ampm}
\def\jmcdate{ \number\year\space\theMonth\space\number\day}
% END stuff for times and dates
%begin stuff for version to be read on-line
%output-xwindow %\textheight 8.0in
%output-xwindow %\textwidth 6.5in
%output-powerbook %\textheight 4.1in % This height setting is for reading on the
%%Powerbook.
%output-xwindow %\textheight 7.5in % This is for reading on an X-window.
%output-xwindow %\textwidth 5.5in
%output-xwindow %\oddsidemargin 0.0in
%output-xwindow %\topmargin -0.5in
%output-xwindow %\headheight 0.0in
% end stuff for version to be read on-line, but note
% that the whole text up to the bibliography
% is bracketed to be printed large.
%******************************
%******************************
%******************************
%******************************
\begin{document}
\bibliographystyle{alpha}

\title{A LOGICAL AI APPROACH TO CONTEXT}
\author{
{\Large\bf John McCarthy} \\
Computer Science Department \\
Stanford University \\
Stanford, CA 94305 \\
{\tt jmc@cs.stanford.edu} \\
{\tt http://www-formal.stanford.edu/jmc/}}
\date{\jmcdate ,\ \theTime}
\maketitle
%For reading on-line
%output-xwindow % {\Large % On-line requires \Large
% Van Benthem wants /u/jmc/RMAIL.F95==120
\begin{abstract}
Logical AI develops computer programs that represent what they know
about the world primarily by logical formulas and decide what to do
primarily by logical reasoning---including nonmonotonic logical
reasoning. It is convenient to use logical sentences and terms whose
meaning depends on context. The reasons for this are similar to those
that lead human language to use context-dependent meanings.

This note gives elements of some of the formalisms to which we have
been led. Fuller treatments are in \cite{McC93}, \cite{guha-thesis}
and \cite{McCBuvac94} and the references cited in the Web page
\cite{Buvac95}.

The first main idea is to make contexts first class objects in the
logic and use the formula $ist(c,p)$ to assert that the
\emph{proposition} $p$ is true in the \emph{context} $c$. A second
idea is to formalize how propositions true in one context transform
when they are moved to different but related contexts. An ability to
\emph{transcend} the outermost context is needed to give computer
programs the ability to reason about the totality of all they have
thought about so far \cite{McC96}.
\end{abstract}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}

As requested by Johan van Benthem, this is a brief introduction to
the logical formalism for context being explored by John McCarthy and
Sa\v{s}a Buva\v{c} at Stanford University.
It is motivated by the need to use contexts as first order objects
for artificial intelligence. I hope the description is suitable for
comparison with other approaches to context that often have other
motivations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Features of the Formalism}

Here are some features of our formalizations.
% begin here
\begin{enumerate}
\item We offer no \emph{definition} of context. There are
mathematical context structures with different properties, some of
which are useful. Asking what a context is is like asking what a
group element is. See section \ref{sec:abstract} for more on this.

\item Sentences about propositions and contexts are built up from a
formula $ist(c,p)$ which is to be understood as asserting that the
\emph{proposition} $p$ is true in the \emph{context} $c$. When we
have entered the context $c$, we can write
\begin{equation}
c:\quad\quad p.
\end{equation}

\item Once a program has inferred a sentence $q$ from $p$, it can
\emph{leave} the context $c$ and have $ist(c,q)$. This generalizes
natural deduction.

\item Reasoning and communicating in context permits taking only
limited phenomena into account. Treating contexts as objects permits
stating the limitations explicitly within the formalism.

\item Statements about contexts are themselves in contexts.

\item There is no universal context. This is a fact of epistemology
(both of the physical world and the mathematical world). It is always
possible to generalize the concepts one has used up to the present.
Attempts at ultimate definitions always fail---and usually in
uninteresting ways. Humans and machines must start at middle levels
of the conceptual world and both specialize and generalize.

\item We can deal with this phenomenon in our formalism by ensuring
that it is always possible to \emph{transcend} the outermost context
used so far. Thus a robot designed in this way is not stuck with the
concepts it has been given.

\item Because of the possibility of transcendence, the use of
contexts as objects is not just a matter of efficiency. Any given set
of sentences including contexts can always be \emph{flattened} (at
the cost of lengthening) to eliminate explicit contexts. However, the
resulting flat theory can no longer be transcended within the
formalism, because it is not an object that can be referred to as a
whole.

\item There is often a theory associated with a context---the set of
sentences true in the context. However, two contexts with the same
theory need not be the same, because they may have different
relations with other contexts. Not all useful contexts will be closed
under logical inference.

\item We advocate using \emph{propositions} as discussed in
\cite{McC79b} for the objects true in contexts rather than logical or
natural language sentences. This has the advantage that the set of
propositions true in a context may be finite even when the set of
sentences that can express these propositions is infinite. However,
our present applications of context would work equally well if
sentences were used. Buva\v{c} and Mason \cite{buvac-buvac-mason-95}
treat $ist(c,p)$ as a modal logic formula in a propositional theory.

\item Besides the truth of propositions in contexts, we consider the
value $value(c,exp)$ of a term $exp$ representing an
\emph{individual concept} in a context $c$ as discussed in
\cite{McC79b}. This presents problems beyond those presented by
propositions, because in general the space of values of individual
concepts will depend on some outer context.
\end{enumerate}

\section{Applications}

Here are some applications of the logical theory of contexts.
\begin{enumerate}
\item Conventional linguistic applications like the referents of
pronouns can be treated using contexts as objects, but formalized
contexts are also useful for more complex anaphora. For example, we
need to relate the surgeon's ``Scalpel'' to the sentence ``Please
hand me a number 3 scalpel''.
See \cite{Buvac96}. These applications require associating contexts
with sentences or parts of sentences.

\item Defining a theory in a narrow context in a way that permits it
to be \emph{lifted} to a richer outer context and applied.
\cite{McC93} discusses lifting a simple theory of $above(x,y)$ as the
transitive closure of $on(x,y)$ to an outer situation calculus
context that uses $on(x,y,s)$ and $above(x,y,s)$. A key formula of
that paper is
\begin{equation}
c:\hspace{0.3in} (\forall x y s)(on(x,y,s) \equiv ist(context\hbox{-}of\hbox{-}situation(s),on(x,y))),\label{on1}
\end{equation}
which relates the three argument situation calculus predicate
$on(x,y,s)$ and the two argument predicate $on(x,y)$ of the
specialized theory of $on$ and $above$.

The use of contexts to implement ``microtheories'' in Cyc is
described in \cite{guha-thesis}. This allowed people entering
knowledge about some phenomenon, e.g. automobiles, to do it in a
limited context, while leaving open the ability to use the knowledge
in a larger context.

\item Defining a narrow context for a problem and importing facts
that permit the problem to be solved by considering only a small set
of possibilities. For example, in formulating the missionaries and
cannibals problem a person or program must take a number of common
sense facts into account, but ends up with a 32-state space, because
all that is relevant in this context is the numbers of missionaries,
cannibals and boats on each bank of the river.

\item Relating databases with different conventions
\cite{McCBuvac94}. Imagine that the Air Force and the General
Electric Company have databases both of which include prices for the
jet engines that the company sells the Air Force. However, suppose
the databases don't agree on what the price covers, e.g. spare parts.
We use one context $c_{AF}$ for the Air Force database, another
$c_{GE}$ for the GE database, and a third context $c0$ that needs to
relate information from both.
\emph{Lifting} formulas true in the context $c0$ relate information
in the different databases to the context in which reasoning is done,
e.g. they tell about the relation of the prices listed in $c_{AF}$
and $c_{GE}$ to the inclusion or not of spare parts.

\item Buva\v{c} and McCarthy have also discussed using context to
combine aspects of plans generated by different planners not
originally designed to work together---or plans originally intended
to work together but which have drifted apart in the course of
separate development.
\end{enumerate}

\section{Desiderata for a Mathematical Logic of Context}
\label{sec:abstract}

The simplest approach to a logic of context is to treat $ist(c,p)$ as
a modal operator with $p$ quantifier free. Sa\v{s}a Buva\v{c} and Ian
Mason \cite{buvac-buvac-mason-95} did this. However, the applications
to natural language, to databases and to formalizing common sense
knowledge and reasoning require a lot more. Here are some desiderata
for a formal theory.\footnote{Just so Johan doesn't get off too
easily in keeping his promise to make one.}
\begin{itemize}
\item $truths(c)$ is the set of $p$ such that $ist(c,p)$. In some
formalizations it will be a first class object. In any case we can
think about it in the metatheory.

\item The simplest possibility for $truths(c)$ for a particular
context $c$ is that it is an arbitrary set of propositions, i.e. not
required to be closed under some logical operations.

\item The second possibility is that $truths(c)$ is closed under
deduction in some logical system---perhaps the theory of contexts.

\item $truths(c)$ may be the set of propositions true about some
subject matter. We can assert propositions about this set of
propositions without knowing what sentences are in it.

\item Associated with at least some contexts is a domain $domain(c)$.
As with $truths(c)$, $domain(c)$ may be an object, presumably in a
higher level context, or it may be only in the metalanguage.
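\par
The first three items of this list can be sketched in symbols. This
is only an illustration: the set variable $\Gamma$ and the
derivability relation $\vdash_L$ of a chosen logical system $L$ are
assumptions introduced here, as is the decision to let $\vdash_L$
range over propositions rather than sentences.
\[
truths(c) = \{\, p \mid ist(c,p) \,\}
\]
\[
\mbox{if } \Gamma \subseteq truths(c) \mbox{ and } \Gamma \vdash_L q,
\mbox{ then } q \in truths(c).
\]
The first line restates the definition of $truths(c)$. The second is
the closure condition of the second possibility; a context whose
$truths(c)$ is an arbitrary set of propositions need not satisfy it.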
\end{itemize}

The variety of potential applications of contexts as objects suggests
looking at contexts as mathematics looks at group elements. Groups
were first identified as sets of transformations closed under certain
operations. However, it was noticed that the integers with addition
as an operation, the non-zero rationals with multiplication as an
operation and many others had the same algebraic properties. This
motivated the definition of an abstract group around the turn of the
century. In such an abstract theory of contexts, formulas expressing
relations among contexts would be primary rather than the
propositions true in the contexts. Thus the theory would emphasize
$specializes(c1,c2,time)$ rather than $ist(c,p)$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Remarks}

Johan van Benthem asked for the following in soliciting this essay
and John Perry's.
\begin{quotation}
\emph{My proposal is the following. I would like to invite the two
Johns to send me a rough outline of their contribution. It would be
good if you could bring out (1) what the notion of context
\textbf{is} and what it \textbf{does} according to you: in both
cases, I think you want it to achieve `efficiency' and `portability'
of information, (2) what is involved in the dynamics of
\textbf{changing contexts}, perhaps with attendant changes in
linguistic formulation (add or drop variables, etcetera). I would
then like to comment on this, adding some thoughts on possible
logical formalizations, emphasizing the interplay between what is
said in a formula and what remains implicit in the models where it
gets evaluated.}
\end{quotation}

I have rejected the idea of defining what a context \textbf{is}, but
I hope I have given some idea of what contexts do. The example
relating the three argument $on$ and the two argument $on$ should
provide a basis for comments.
In the formulation of the ideas, the ability to combine formulas
arising in different contexts has been more important than
computational efficiency.
%output-xwindow %} % This is the right bracket ending \Large
%\bibliography{/u/jmc/1/biblio} % Its full name is /u/jmc/1/biblio.bib

\cite{McC93} and \cite{McCBuvac94} have additional references. Also
Sa\v{s}a Buva\v{c} has several other papers on context on his Web
page http://www-formal.stanford.edu/buvac/.

\begin{thebibliography}{BBM95}

\bibitem[BBM95]{buvac-buvac-mason-95}
Sa\v{s}a Buva\v{c}, Vanja Buva\v{c}, and Ian~A. Mason.
\newblock Metamathematics of contexts.
\newblock {\em Fundamenta Informaticae}, 23(3), 1995.

\bibitem[Buv95]{Buvac95}
Sa\v{s}a Buva\v{c}.
\newblock Sa\v{s}a Buva\v{c}'s web page, 1995.
\newblock http://www-formal.stanford.edu/buvac/.

\bibitem[Buv96]{Buvac96}
Sa\v{s}a Buva\v{c}.
\newblock Resolving lexical ambiguity using a formal theory of context.
\newblock In {\em Semantic Ambiguity and Underspecification}. CSLI Lecture
Notes, Center for Studies in Language and Information, Stanford, CA, 1996.

\bibitem[Guh91]{guha-thesis}
R.~V. Guha.
\newblock {\em Contexts: A Formalization and Some Applications}.
\newblock PhD thesis, Stanford University, 1991.
\newblock Also published as technical report STAN-CS-91-1399-Thesis, and MCC
Technical Report Number ACT-CYC-423-91.

\bibitem[MB94]{McCBuvac94}
John McCarthy and Sa\v{s}a Buva\v{c}.
\newblock {Formalizing Context (Expanded Notes)}.
\newblock Technical Note STAN-CS-TN-94-13, Stanford University, 1994.

\bibitem[McC79]{McC79b}
John McCarthy.
\newblock First order theories of individual concepts and propositions.
\newblock In Donald Michie, editor, {\em Machine Intelligence}, volume~9.
Edinburgh University Press, Edinburgh, 1979.
\newblock Reprinted in \cite{mccarthy-book}.

\bibitem[McC90]{mccarthy-book}
John McCarthy.
\newblock {\em Formalizing Common Sense: Papers by John McCarthy}.
\newblock Ablex Publishing Corporation, 355 Chestnut Street, Norwood, NJ
07648, 1990.

\bibitem[McC93]{McC93}
John McCarthy.
\newblock Notes on formalizing context.
\newblock In {\em IJCAI-93}, 1993.
\newblock Available on http://www-formal.stanford.edu/jmc/.

\bibitem[McC96]{McC96}
John McCarthy.
\newblock Making robots conscious of their mental states.
\newblock In Stephen Muggleton, editor, {\em Machine Intelligence 15}. Oxford
University Press, 1996.
\newblock to appear, available on http://www-formal.stanford.edu/jmc/.

\end{thebibliography}

\vfill
{\tiny\rm\noindent /@steam.stanford.edu:/u/jmc/f95/context.tex: begun 1995 Sep 22, latexed \jmcdate\ at \theTime}
\end{document}