
Debugging ILU Programs

This document describes some of the common errors that occur with the use of ILU, and some techniques for dealing with them.

C++ static instance initialization

Our support for C++ currently depends on having the constructors for all static instances run before main() is called. If your compiler or interpreter doesn't support that, you will experience odd behavior. The C++ language does not strictly mandate that this initialization will be performed, but most compilers seem to arrange things that way. We'd like to know which compilers do not; if yours doesn't, please send a note to ilu-bugs@parc.xerox.com telling us what the compiler is.

ILU uses the static-object-with-constructor trick to effect per-compilation-unit startup code. In certain cases you'll want to ensure that one compilation unit's initialization is run before another's. While C++ defines no standard way to do this, most compilers work like this: compilation units are initialized (static object constructors run) in the order in which they are given to the link-editor. We (ilu-bugs@parc.xerox.com) want to hear about any exceptions to this rule.
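The static-object-with-constructor trick can be illustrated with a minimal sketch. This is not ILU's own code: the names startup_log and UnitInitializer are hypothetical, and both "units" are shown in one file, where C++ does guarantee top-to-bottom initialization order. Across separate files the order depends on the compiler and link-editor, as described above.

```cpp
#include <string>
#include <vector>

// Hypothetical registry standing in for whatever per-unit state a
// runtime keeps; it is not part of ILU. A function-local static is
// constructed on first use, so the registry exists no matter which
// unit's initializer happens to run first.
inline std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// The trick itself: an object whose constructor does the startup work.
struct UnitInitializer {
    explicit UnitInitializer(const char* unit_name) {
        startup_log().push_back(unit_name);
    }
};

// One static instance per compilation unit. Both appear in one file
// here, so they are constructed in the order written.
static UnitInitializer init_first("first_unit");
static UnitInitializer init_second("second_unit");
```

With a compiler that runs static constructors before main(), startup_log() already holds both unit names, in order, by the time main() begins.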

ILU trace debugging

ILU contains a number of trace statements that allow you to observe the progress of certain operations within the ILU kernel. To enable these, you can set the environment variable ILU_DEBUG with the command setenv ILU_DEBUG "xxx:yyy:zzz:..." where xxx, yyy, and zzz are the names of various trace classes. The classes are (as of December 1995) packet, connection, incoming, export, authentication, object, sunrpc, courier, dcerpc, call, tcp, udp, xnsspp, gc, lock, server, malloc, mainloop, iiop, error, sunrpcrm, inmem, and binding. The special class ALL will enable all trace statements: setenv ILU_DEBUG ALL. The function ilu_SetDebugLevelViaString(char *trace_classes) may also be called from an application program or debugger, to enable tracing. The argument trace_classes should be formatted as described above.
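As a small illustration of the colon-separated format, the sketch below builds a trace specification from a list of class names. The helper join_trace_classes is hypothetical and is not an ILU function; only the string format comes from the description above.

```cpp
#include <string>
#include <vector>

// Join trace class names with ':' to form a value in the format
// expected by ILU_DEBUG. Hypothetical helper, not part of ILU.
std::string join_trace_classes(const std::vector<std::string>& classes) {
    std::string spec;
    for (const std::string& name : classes) {
        if (!spec.empty()) {
            spec += ':';  // separator goes between names only
        }
        spec += name;
    }
    return spec;
}
```

For example, join_trace_classes({"packet", "connection", "tcp"}) yields the string packet:connection:tcp. In a program linked with ILU, a string like this could be handed to ilu_SetDebugLevelViaString; since that function is documented as taking a char *, a writable copy of the string may be needed.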

ILU_DEBUG may also be set to an unsigned integer value, where each bit set in the binary version of the number corresponds to one of the above trace classes. For a list of the various bit values, see the file `ILUHOME/include/iludebug.h'. Again, you can also enable the tracing from a program or from a debugger, by calling the routine ilu_SetDebugLevel(unsigned long trace_bits) with an unsigned integer argument.

The routine ilu_SetDebugMessageHandler allows an application to specify an alternate routine to be called when an error or debugging message is to be printed.

[ILU kernel]: void ilu_SetDebugMessageHandler (void (*handler) (char *formatSpec, va_list args))

Locking: unconstrained

Registers handler with the ILU kernel to be called whenever a debugging or error message is output via ilu_DebugPrintf, instead of the default handler, which simply prints the message to stderr, using vfprintf. Two special constant values for handler are defined, ILU_DEFAULT_DEBUG_MESSAGE_HANDLER, which will cause the default behavior to be resumed, and ILU_NIL_DEBUG_MESSAGE_HANDLER, which will cause debugging and error messages to be simply, silently, discarded.
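A handler with the documented signature might look like the sketch below, which captures each message in memory instead of printing it. The names capture_debug_message, last_debug_message, and fake_debug_printf are hypothetical; fake_debug_printf merely stands in for the kernel's ilu_DebugPrintf so that the handler can be exercised without linking against ILU.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Most recent message captured by the handler (hypothetical storage).
static std::string last_debug_message;

// Has the handler signature documented above.
void capture_debug_message(char* formatSpec, va_list args) {
    char buffer[1024];
    // vsnprintf truncates instead of overflowing when a message is long.
    std::vsnprintf(buffer, sizeof buffer, formatSpec, args);
    last_debug_message = buffer;
}

// Stand-in for ilu_DebugPrintf, used only to drive the handler here.
void fake_debug_printf(const char* formatSpec, ...) {
    va_list args;
    va_start(args, formatSpec);
    capture_debug_message(const_cast<char*>(formatSpec), args);
    va_end(args);
}
```

In an application linked with ILU, the handler would presumably be installed with ilu_SetDebugMessageHandler(capture_debug_message), and the default behavior restored later with ILU_DEFAULT_DEBUG_MESSAGE_HANDLER.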

Debugging ISL

Use of islscan

The islscan program is supplied as part of the ILU release. It runs the ISL parser against a file containing an interface, and prints a "report" on the interface to standard output. It can therefore be used to check the syntax of an interface before running any language stubbers.

The ISLDEBUG environment variable

Setting the environment variable ISLDEBUG to any value (say, "t"), before running any ILU stubber or the program islscan, will cause ILU's parser to print out its state transitions as it parses the ISL file. If you're having a serious problem finding a bug in your ISL file, this might help.

Bug Reporting and Comments

Report bugs (nah! -- couldn't be!) to the Internet address ilu-bugs.parc@xerox.com, or to the XNS address ILU-bugs:PARC:Xerox. Bug reports are more helpful with some information about the activity. General comments and suggestions can be sent to either ILU@parc.xerox.com or ILU-bugs.

Often our first reply to a bug report is a request for a typescript that shows the bug occurring, with all trace debugging turned on. If that doesn't make it clear to us, our second reply may be a request for a stack trace, with printouts of relevant variables and data structures. Including these things in your bug report may speed the cycle of interactions.

Use of gdb

When using ILU with C++ or C or even Common Lisp, running under the GNU debugger gdb can be helpful for finding segmentation violations and other system errors.

ILU provides a debugging trace feature which can be set from gdb with the following command:

(gdb) p ilu_SetDebugLevel(0xXXX)
ilu_SetDebugLevel:  setting debug mask from 0x0 to 0xXXX
$1 = void
(gdb) 

The value XXX is an unsigned integer as discussed in section 3. The debugger dbx should also work.

We are in the midst of installing a consistent new way of handling runtime failures into the ILU runtime kernel. This new way involves the kernel reporting the failure to its caller; the old way involves combinations of panicking, reporting to the user (not the caller) via a printed message, and fragmentary reporting to the caller. Every time a runtime failure is noted the new way, the procedure _ilu_NoteRaise in `ILUSRC/runtime/kernel/error.c' is called; this procedure thus makes a good place to set a breakpoint when debugging. Most runtime failures occur due to genuine problems; some occur during normal processing (e.g., end-of-file detection).

Error handling

Ideally, the ILU runtime would report all failures to the application, in the way most appropriate for the application's programming language. Sadly, this is not yet the case.

The ILU runtime kernel has three kinds of runtime failures:

  1. memory allocation failures from which the kernel cannot proceed;
  2. internal consistency check failures, from which the kernel cannot proceed; and
  3. internal consistency check failures, which the kernel is prepared to report to the ILU language-specific runtime veneer (which, we hope, would in turn report the failure to the application).

The second kind is being eliminated. The first kind is being reduced, and might also be eliminated.

The application can specify how each of these three kinds of runtime failures is to be handled. The choices are:

  1. Print an explanatory message and then explicitly trigger a SEGV signal by attempting to write to protected memory. This is useful for generating core dumps for later study of the error.
  2. Print an explanatory message and then exit the program with an application-specified exit code.
  3. Print an explanatory message and then enter an endless loop, which calls sleep(3) repeatedly. This option is useful for keeping the process alive but dormant, so that a debugger can attach to it and examine its "live" state. This is the default action for all three kinds of failures.
  4. Invoke an application-supplied procedure (without printing anything first).
  5. Report the failure out of the kernel, without printing anything first (this option is available only for the third kind of failure).

An application can change the action taken on memory failures by calling ilu_SetMemFailureAction or ilu_SetMemFailureConsumer.

[ILU kernel]: void ilu_SetMemFailureAction ( int mfa )

Locking: unconstrained

Calling this tells the ILU kernel which drastic action is to be performed when ilu_must_malloc fails. -2 means to print an explanatory message on stderr and then coredump; -1 means to print an explanatory message on stderr and then loop forever in repeated calls to sleep(3); positive numbers mean to print an explanatory message on stderr and then exit(mfa). The default is -1.

[ILU kernel]: typedef void (*) (const char *file, int line) ilu_FailureConsumer

A procedure that is called when the ILU kernel can't proceed. This procedure must not return.

[ILU kernel]: void ilu_SetMemFailureConsumer ( ilu_FailureConsumer mfc )

Locking: unconstrained

An alternative to ilu_SetMemFailureAction: this causes mfc to be called when ilu_must_malloc fails.
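A failure consumer could be sketched as follows. The names describe_failure and abort_on_failure are hypothetical, and the message wording is an assumption. The only requirements taken from the description above are the (file, line) parameters and that the procedure must not return.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Build the text to report; kept separate so it can be checked alone.
std::string describe_failure(const char* file, int line) {
    return std::string("ILU kernel cannot proceed at ") + file + ":" +
           std::to_string(line);
}

// Has the shape described for ilu_FailureConsumer. It must not
// return, so it ends by calling std::abort().
[[noreturn]] void abort_on_failure(const char* file, int line) {
    std::fprintf(stderr, "%s\n", describe_failure(file, line).c_str());
    std::abort();
}
```

In an application linked with ILU, this would presumably be registered with ilu_SetMemFailureConsumer(abort_on_failure).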

Similarly, an application specifies how unrecoverable runtime consistency check failures are to be handled by calling ilu_SetAssertionFailureAction or ilu_SetAssertionFailConsumer, which are exactly analogous to the procedures for memory failure handling. For recoverable consistency check failures, an application can call ilu_SetCheckFailureAction or ilu_SetCheckFailureConsumer.

[ILU kernel]: void ilu_SetCheckFailureAction ( int cfa )

Locking: unconstrained

Calling this tells the runtime which action is to be performed when an internal consistency check fails. -3 means to raise an error from the kernel (without necessarily printing anything); -2 means to print an explanatory message to stderr and then coredump; -1 means to print and then loop forever; non-negative numbers mean to print and then exit(cfa); other numbers are reserved. The default is -1.

[ILU kernel]: typedef void (*) (const char *file, int line) ilu_CheckFailureConsumer

A procedure for handling an internal consistency check failure. If this procedure returns, the consistency check failure will be raised as an error from the kernel.

[ILU kernel]: void ilu_SetCheckFailureConsumer ( ilu_CheckFailureConsumer cfc )

Locking: unconstrained

An alternative to ilu_SetCheckFailureAction: this causes cfc to be called (and no printing); if cfc returns, an error will be raised from the kernel.
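Unlike a failure consumer for unrecoverable failures, a check failure consumer may return. The sketch below records where the failure occurred and then returns, which, per the description above, lets the kernel raise the failure as an error. The CheckFailureConsumer typedef is a local mirror of the documented type so that the sketch compiles without the ILU headers. All other names are hypothetical.

```cpp
#include <string>

// Local mirror of the documented consumer type (an assumption made so
// that this sketch does not need the ILU headers).
typedef void (*CheckFailureConsumer)(const char* file, int line);

// Hypothetical bookkeeping for the most recent failure.
static int check_failure_count = 0;
static std::string last_failure_file;
static int last_failure_line = 0;

// Records the failure location and returns to the caller.
void record_check_failure(const char* file, int line) {
    ++check_failure_count;
    last_failure_file = file;  // copied, so the pointer need not outlive the call
    last_failure_line = line;
}

// The consumer as it would be passed to the registration procedure.
static CheckFailureConsumer check_consumer = record_check_failure;
```

In an application linked with ILU, the consumer would presumably be installed with ilu_SetCheckFailureConsumer(record_check_failure).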

Decoding reportable consistency check failures

For language mappings consistent with CORBA, the third kind of failure is reported as an occurrence of the CORBA system exception internal, with a minor code that encodes the filename and line number where the consistency check occurs. The coding is this: 10,000*hash(filename, 32771) + linenum + 1,000. The directory part, if any, is stripped from the filename before hashing. To aid in decoding these minor codes, ILU includes the program decoderr, which is used like this:

% decoderr 269211234
269211234 = line 234, file $ILUSRC/runtime/kernel/call.c

If a reportable consistency check failure occurs in a file not anticipated in the construction of decoderr, you'll see something like this:

% decoderr 60612345
60612345 = line 1345 in unknown file (that hashes to 6061)
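The arithmetic behind these decodings can be sketched directly from the coding formula. The names below are hypothetical and this is not the source of decoderr. The sketch only splits a minor code into its hash and line number, and does not map the hash back to a filename. It assumes that linenum + 1,000 is less than 10,000, since otherwise the two parts overlap.

```cpp
// Parts recovered from a minor code.
struct MinorCodeParts {
    unsigned long file_hash;  // hash(filename, 32771)
    long line;                // source line number
};

// Inverts minor = 10000 * hash + linenum + 1000.
MinorCodeParts decode_minor_code(unsigned long minor) {
    MinorCodeParts parts;
    parts.file_hash = minor / 10000;
    parts.line = static_cast<long>(minor % 10000) - 1000;
    return parts;
}

// Forward direction, given an already computed filename hash.
unsigned long encode_minor_code(unsigned long file_hash, unsigned long line) {
    return 10000 * file_hash + line + 1000;
}
```

Applied to the two codes shown above, decode_minor_code gives hash 26921 with line 234 for 269211234, and hash 6061 with line 1345 for 60612345.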

The program iluhashm can be used to hash given filenames, so you can search a set of candidates for the mysterious hash code:

% iluhashm 32771 ../cpp/foobar.cpp ../cpp/barfoo.cpp
/* Generated at Mon Dec 11 22:44:47 1995
   with modulus 32771 */
{      6061, "../cpp/foobar.cpp"},
{     13273, "../cpp/barfoo.cpp"},
