% This preamble comes from /u/jmc/let/latex-95.tex which is copied into
% latex sources by the macro latex-include-95 which is itself called
% by the macro mklatex-95 executed in /u/jmc/let/files

\documentclass[12pt]{article}

% BEGIN stuff for times and dates
\newcount\hours
\newcount\minutes
\newcount\temp
\newtoks\ampm

% set the real time of day

\def\setdaytime{%
   \temp\time 
   \divide\temp 60
   \hours\temp 
   \multiply\temp -60
   \minutes\time 
   \advance\minutes \temp 
   \ifnum\hours =12 \ampm={p.m.} \else
   \ifnum\hours >12 \advance\hours -12 \ampm={p.m.} \else \ampm={a.m.} \fi \fi
}

\setdaytime

\def\theMonth{\relax
  \ifcase\month\or
    Jan\or Feb\or Mar\or Apr\or May\or Jun\or
    Jul\or Aug\or Sep\or Oct\or Nov\or Dec\fi}

\def\theTime{\the\hours:\ifnum \minutes < 10 0\fi \the\minutes\ \the\ampm}

\def\jmcdate{\number\year\space\theMonth\space\number\day}
% END stuff for times and dates

%begin stuff for version to be read on-line

%output-xwindow
%\textheight 8.0in
%output-xwindow
%\textwidth 6.5in
%output-powerbook
%\textheight 4.1in  % This height setting is for reading on the
%%Powerbook.
%output-xwindow
%\textheight 7.5in   % This is for reading on an X-window.
%output-xwindow
%\textwidth 5.5in
%output-xwindow
%\oddsidemargin 0.0in
%output-xwindow
%\topmargin -0.5in
%output-xwindow
%\headheight 0.0in


% end stuff for version to be read on-line, but note
% that the whole text up to the bibliography
%  is bracketed to be printed large.
%******************************
%******************************

%******************************
%******************************

\begin{document}
\bibliographystyle{alpha}

\title{A LOGICAL AI APPROACH TO CONTEXT}
\author{
{\Large\bf John McCarthy} \\
Computer Science Department \\
Stanford University \\
Stanford, CA 94305 \\ 
{\tt jmc@cs.stanford.edu} \\
{\tt http://www-formal.stanford.edu/jmc/}}
\date{\jmcdate ,\ \theTime}
\maketitle
%For reading on-line
%output-xwindow
% {\Large % On-line requires \Large
% Van Benthem wants /u/jmc/RMAIL.F95==120

\begin{abstract}
  Logical AI develops computer programs that represent what they know
  about the world primarily by logical formulas and decide what to do
  primarily by logical reasoning---including nonmonotonic logical
  reasoning.  It is convenient to use logical sentences and terms
  whose meaning depends on context.  The reasons for this are similar
  to those that lead human language to use context-dependent meanings.
  This note gives elements of some of the formalisms to which we have
  been led.  Fuller treatments are in \cite{McC93}, \cite{guha-thesis}
  and \cite{McCBuvac94}, and in the references cited on the Web page
  \cite{Buvac95}.  The first main idea is to make contexts first-class
  objects in the logic and use the formula $ist(c,p)$ to assert that
  the \emph{proposition} $p$ is true in the \emph{context} $c$.  A
  second idea is to formalize how propositions true in one context
  transform when they are moved to different but related contexts.
  An ability to \emph{transcend} the outermost context is needed to
  give computer programs the ability to reason about the totality of
  all they have thought about so far \cite{McC96}.
\end{abstract}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}

As requested by Johan van Benthem, this is a brief introduction to the
logical formalism for context being explored by John McCarthy and
Sa\v{s}a Buva\v{c} at Stanford University.  It is motivated by the
need to use contexts as first order objects for artificial
intelligence.  I hope the description is suitable for comparison with
other approaches to context that often have other motivations.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Features of the Formalism}

Here are some features of our formalizations.

% begin here
\begin{enumerate}

\item We offer no \emph{definition} of context.  There are mathematical
  context structures with different properties, some of which are
  useful.  Asking what a context is resembles asking what a group
  element is.  See Section~\ref{sec:abstract} for more on this.

\item Sentences about propositions and contexts are built up from a
  formula $ist(c,p)$ which is to be understood as asserting that the
  {\em proposition} $p$ is true in the {\em context} $c$.  When we
  have entered the context $c$, we can write
  \begin{equation}
    c:\quad\quad p.
  \end{equation}

\item
  Once a program has inferred a sentence $q$ from $p$, it can
  \emph{leave} the context $c$ and have $ist(c,q)$.  This generalizes
  natural deduction.

\item Reasoning and communicating in context permits taking only
  limited phenomena into account.  Treating contexts as objects
  permits stating the limitations explicitly within the formalism.

\item Statements about contexts are themselves in contexts.  

\item There is no universal context.  This is a fact of epistemology
  (both of the physical world and the mathematical world).  It is
  always possible to generalize the concepts one has used up to the
  present.  Attempts at ultimate definitions always fail---and usually
  in uninteresting ways.  Humans and machines must start at middle
  levels of the conceptual world and both specialize and
  generalize.

\item We can deal with this phenomenon in our formalism by
  ensuring that it is always possible to \emph{transcend} the
  outermost context used so far.  Thus a robot designed in this way is
  not stuck with the concepts it has been given.

\item Because of the possibility of transcendence, the use of contexts
  as objects is not just a matter of efficiency.  Any given set of
  sentences including contexts can always be \emph{flattened} (at the
  cost of lengthening) to eliminate explicit contexts.  However, the
  resulting flat theory can no longer be transcended within the
  formalism, because it is not an object that can be referred to as a
  whole.

\item There is often a theory associated with a context---the set of
  sentences true in the context.  However, two contexts with the same
  theory need not be the same, because they may have different
  relations with other contexts.  Not all useful contexts will be
  closed under logical inference.

\item We advocate using \emph{propositions} as discussed in
  \cite{McC79b} for the objects true in contexts rather than logical
  or natural language sentences.  This has the advantage that the set
  of propositions true in a context may be finite when the set of
  sentences that can express these propositions will be infinite.
  However, our present applications of context would work equally well
  if sentences were used.  Buva\v{c} and Mason
  \cite{buvac-buvac-mason-95} treat $ist(c,p)$ as a modal logic
  formula in a propositional theory.

\item Besides the truth of propositions in contexts, we consider the
  value $value(c,exp)$ of a term $exp$ representing an
  \emph{individual concept} in a context $c$ as discussed in
  \cite{McC79b}.  This presents problems beyond those presented by
  propositions, because in general the space of values of individual
  concepts will depend on some outer context.

\end{enumerate}
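
The steps of entering a context, reasoning inside it and leaving it,
as described in the items above, can be laid out as a short derivation.
This is only an illustrative sketch.  The propositions $p$,
$p \supset q$ and $q$ are placeholders, and the line-by-line layout is
our assumption about how such a derivation might be displayed, not a
rule system taken from the cited papers.
\begin{displaymath}
\begin{array}{lll}
1. & ist(c,p) & \mbox{given in the outer context} \\
2. & ist(c,p \supset q) & \mbox{given in the outer context} \\
3. & c:\quad p & \mbox{enter $c$, from 1} \\
4. & c:\quad p \supset q & \mbox{inside $c$, from 2} \\
5. & c:\quad q & \mbox{modus ponens inside $c$, from 3 and 4} \\
6. & ist(c,q) & \mbox{leave $c$, from 5}
\end{array}
\end{displaymath}
Lines 3 to 5 are asserted inside $c$ and play the role of reasoning
under an assumption in natural deduction.  Lines 1, 2 and 6 are
assertions of the outer context.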


\section{Applications}


Here are some applications of the logical theory of contexts.
\begin{enumerate}

\item Conventional linguistic applications like the referents of
  pronouns can be treated using contexts as objects, but
  formalized contexts are also useful for more complex anaphora.  For
  example, we need to relate the surgeon's ``Scalpel'' to the sentence
  ``Please hand me a number 3 scalpel''.  See \cite{Buvac96}.  These
  applications require associating contexts with sentences or parts of
  sentences. 

\item Defining a theory in a narrow context in a way that permits it
  to be \emph{lifted} to a richer outer context and applied.
  \cite{McC93} discusses lifting a simple theory of $above(x,y)$ as the
  transitive closure of $on(x,y)$ to an outer situation calculus
  context that uses $on(x,y,s)$ and $above(x,y,s)$.  A key formula of
  that paper is
\begin{equation}
c:\hspace{0.3in}
 (\forall x y s)(on(x,y,s)
 \equiv ist(context\hbox{-}of\hbox{-}situation(s),on(x,y))),\label{on1}
\end{equation}
which relates the three-argument situation calculus predicate
$on(x,y,s)$
and the two-argument predicate
$on(x,y)$ of the specialized
theory of $on$ and $above$.  The use of contexts to implement
``microtheories'' in Cyc is described in \cite{guha-thesis}.
This allowed people entering knowledge about some phenomenon,
e.g. automobiles, to do it in a limited context, but leave open
the ability to use the knowledge in a larger context.

\item Defining a narrow context for a problem and importing facts that
  permit the problem to be solved by considering only a small set of
  possibilities.  For example, in formulating the missionaries and
  cannibals problem a person or program must take a number of common
  sense facts into account, but ends up with a 32-state space, because
  all that is relevant in this context is the numbers of missionaries,
  cannibals and boats on each bank of the river.

\item Relating databases with different conventions \cite{McCBuvac94}.
  Imagine that the Air Force and the General Electric Company have
  databases both of which include prices for the jet engines that the
  company sells the Air Force.  However, suppose the databases don't
  agree on what the price covers, e.g. spare parts.  We use one
  context $c_{AF}$ for the Air Force database, another $c_{GE}$ for
  the GE database, and a third context $c0$ that needs to relate
  information from both.  \emph{Lifting} formulas true in the context
  $c0$ relate information in the different databases to the context
  in which reasoning is done, e.g. they tell how the prices listed
  in $c_{AF}$ and $c_{GE}$ are related to the inclusion or exclusion
  of spare parts.

\item Buva\v{c} and McCarthy have also discussed using context to
  combine aspects of plans generated by different planners not
  originally designed to work together---or plans originally intended
  to work together but which have drifted apart in the course of
  separate development.

\end{enumerate}
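
Two of the applications above can be made slightly more concrete.
Both illustrations rest on assumptions that are not stated in the list
itself.

For the missionaries and cannibals problem, assume the usual version
with three missionaries, three cannibals and one boat.  Write $m$ and
$k$ for the numbers of missionaries and cannibals on the starting bank
and $b$ for the number of boats there, so that $m,k \in \{0,1,2,3\}$
and $b \in \{0,1\}$.  The numbers on the other bank are then
determined, and the size of the state space is
\begin{displaymath}
4 \times 4 \times 2 = 32 .
\end{displaymath}

For the database example, a lifting formula might take the following
form.  It is a hypothetical sketch.  We assume, purely for
illustration, that the Air Force price includes spare parts while the
GE price does not, and the terms $price(x)$ and $spares(x)$ are names
invented here rather than taken from \cite{McCBuvac94}.
\begin{displaymath}
c0:\quad (\forall x)(value(c_{AF},price(x)) =
  value(c_{GE},price(x)) + value(c_{GE},spares(x)))
\end{displaymath}
The formula is asserted in $c0$ and mentions both database contexts,
which is what lets a reasoner in $c0$ compare the two price lists.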


\section{Desiderata for a Mathematical Logic of Context}
\label{sec:abstract}

The simplest approach to a logic of context is to treat $ist(c,p)$ as
a modal operator with $p$ quantifier-free.  Sa\v{s}a Buva\v{c} and Ian
Mason \cite{buvac-buvac-mason-95} did this.  However, the applications to
natural language, to databases and to formalizing common sense
knowledge and reasoning require a lot more.  Here are some desiderata
for a formal theory.\footnote{Just so Johan  doesn't get off
  too easily in keeping his promise to make one.}
\begin{itemize}

\item $truths(c)$ is the set of $p$ such that $ist(c,p)$.  In some
  formalizations it will be a first class object.  In any case we can
  think about it in the metatheory.

\item The simplest possibility for $truths(c)$ for a particular
  context $c$ is that it is an arbitrary set of propositions, i.e. not
  required to be closed under some logical operations.

\item The second possibility is that $truths(c)$ is closed under
  deduction in some logical system---perhaps the theory of contexts.

\item $truths(c)$ may be the set of propositions true about some
  subject matter.  We can assert propositions about this set of
  propositions without knowing which propositions are in it.

\item Associated with at least some contexts is a domain $domain(c)$.
  As with $truths(c)$, $domain(c)$ may be an object, presumably in a
  higher level context, or it may be only in the metalanguage.


\end{itemize}
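
The first three desiderata can be written out as follows.  This is a
sketch in the metatheory.  The symbol $\vdash$ stands for derivability
in whatever logical system is chosen, which the list above leaves open.
\begin{displaymath}
truths(c) = \{\, p \mid ist(c,p) \,\}
\end{displaymath}
Under the first possibility nothing more is required, so a context
with $truths(c) = \{p,\ p \supset q\}$ that lacks $q$ is admissible.
Under the second possibility $truths(c)$ must satisfy the closure
condition
\begin{displaymath}
\mbox{if } p_1,\ldots,p_n \in truths(c) \mbox{ and }
p_1,\ldots,p_n \vdash q, \mbox{ then } q \in truths(c),
\end{displaymath}
which rules that context out when $\vdash$ includes modus ponens.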


The variety of potential applications of contexts as objects suggests
looking at contexts as mathematics looks at group elements.  Groups
were first identified as sets of transformations closed under certain
operations.  However, it was noticed that the integers with addition
as an operation, the non-zero rationals with multiplication as an
operation and many others had the same algebraic property.  This
motivated the definition of an abstract group around the turn of the
twentieth century.  In an analogous abstract theory of contexts,
formulas expressing relations among contexts
would be primary rather than the propositions true in the contexts.
Thus the theory would emphasize $specializes(c1,c2,time)$ rather than
$ist(c,p)$.
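
As an illustration of what a relation-first axiom might look like,
here is one sample law.  It is hypothetical.  This note does not fix
the meaning of $specializes(c1,c2,time)$, and we assume here only that
it says that $c1$ specializes $c2$ at $time$.
\begin{displaymath}
(\forall c1\, c2\, c3\, t)(specializes(c1,c2,t) \wedge
  specializes(c2,c3,t) \supset specializes(c1,c3,t))
\end{displaymath}
Like a group axiom, the law mentions no proposition and no $ist$
formula.  It constrains only how contexts are related to one another.
Whether transitivity should actually hold is a design decision that
the note leaves open.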


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Remarks}


Johan van Benthem asked for the following in soliciting this essay and
John Perry's.

\begin{quotation}
  \emph{My proposal is the following. I would like to invite the two Johns
  to send me a rough outline of their contribution. It would be good
  if you could bring out (1) what the notion of context \textbf{is} and what
  it \textbf{does} according to you: in both cases, I think you want it to
  achieve `efficiency' and `portability' of information, (2) what is
  involved in the dynamics of \textbf{changing contexts}, perhaps with
  attendant changes in linguistic formulation (add or drop variables,
  etcetera). I would then like to comment on this, adding some
  thoughts on possible logical formalizations, emphasizing the
  interplay between what is said in a formula and what remains
  implicit in the models where it gets evaluated.}
\end{quotation}

I have rejected the idea of defining what a context \textbf{is}, but I
hope I have given some idea of what contexts do.  The example relating the
three-argument $on$ and the two-argument $on$ should provide a basis
for comments.  In the formulation of the ideas, the ability to combine
formulas arising in different contexts has been more important than
computational efficiency.

%output-xwindow
%} % This is the right bracket ending \Large 

%\bibliography{/u/jmc/1/biblio}
% Its full name is /u/jmc/1/biblio.bib
\cite{McC93} and \cite{McCBuvac94} have additional references.  Also
Sa\v{s}a Buva\v{c} has several other papers on context on his Web page
{\tt http://www-formal.stanford.edu/buvac/}.

\begin{thebibliography}{BBM95}

\bibitem[BBM95]{buvac-buvac-mason-95}
Sa\v{s}a Buva\v{c}, Vanja Buva\v{c}, and Ian~A. Mason.
\newblock Metamathematics of contexts.
\newblock {\em Fundamenta Informaticae}, 23(3), 1995.

\bibitem[Buv95]{Buvac95}
Sa\v{s}a Buva\v{c}.
\newblock Sa\v{s}a Buva\v{c}'s Web page, 1995.
\newblock http://www-formal.stanford.edu/buvac/.

\bibitem[Buv96]{Buvac96}
Sa\v{s}a Buva\v{c}.
\newblock Resolving lexical ambiguity using a formal theory of context.
\newblock In {\em Semantic Ambiguity and Underspecification}. CSLI Lecture
  Notes, Center for the Study of Language and Information, Stanford, CA, 1996.

\bibitem[Guh91]{guha-thesis}
R.~V. Guha.
\newblock {\em Contexts: A Formalization and Some Applications}.
\newblock PhD thesis, Stanford University, 1991.
\newblock Also published as technical report STAN-CS-91-1399-Thesis, and MCC
  Technical Report Number ACT-CYC-423-91.

\bibitem[MB94]{McCBuvac94}
John McCarthy and Sa\v{s}a Buva\v{c}.
\newblock {Formalizing Context (Expanded Notes)}.
\newblock Technical Note STAN-CS-TN-94-13, Stanford University, 1994.

\bibitem[McC79]{McC79b}
John McCarthy.
\newblock First order theories of individual concepts and propositions.
\newblock In Donald Michie, editor, {\em Machine Intelligence}, volume~9.
  Edinburgh University Press, Edinburgh, 1979.
\newblock Reprinted in \cite{mccarthy-book}.

\bibitem[McC90]{mccarthy-book}
John McCarthy.
\newblock {\em Formalizing Common Sense: Papers by John McCarthy}.
\newblock Ablex Publishing Corporation, 355 Chestnut Street, Norwood, NJ 07648,
  1990.

\bibitem[McC93]{McC93}
John McCarthy.
\newblock Notes on formalizing context.
\newblock In {\em IJCAI-93}, 1993.
\newblock Available on http://www-formal.stanford.edu/jmc/.

\bibitem[McC96]{McC96}
John McCarthy.
\newblock Making robots conscious of their mental states.
\newblock In Stephen Muggleton, editor, {\em Machine Intelligence 15}. Oxford
  University Press, 1996.
\newblock to appear, available on http://www-formal.stanford.edu/jmc/.

\end{thebibliography}


\vfill

{\tiny\rm\noindent /@steam.stanford.edu:/u/jmc/f95/context.tex: begun 1995 Sep 22, latexed
\jmcdate\ at \theTime}
\end{document}