19. How Lisp Objects Are Represented in C

Lisp objects are represented in C using a single 32-bit or 64-bit machine word, depending on the processor. The word packs a pointer or a fixnum together with a tag; the 32-bit layout is as follows:

 [ 3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 ]
 [ 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 ]

   <---------------------------------------------------------> <->
            a pointer to a structure, or a fixnum              tag

A tag of 00 is used for all pointer object types, a tag of 10 is used for characters, and the remaining two tags, 01 and 11, together form the fixnum object type (that is, any word whose lowest bit is 1 is a fixnum). On a 32-bit word this gives 31-bit fixnums and 30-bit characters, while pointers are represented directly, without any bit masking or shifting. The representation does assume, however, that pointers to structs are always aligned to multiples of 4, so that their lower 2 bits are always zero.
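The tag tests described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the sketch_ names and SKETCH_ constants are hypothetical and are not the XEmacs definitions, and intptr_t stands in for whatever integral type the build actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; these are not the XEmacs definitions. */
typedef intptr_t sketch_object;

#define SKETCH_TAG_MASK    3  /* the low two bits hold the tag */
#define SKETCH_TAG_POINTER 0  /* binary 00 */
#define SKETCH_TAG_CHAR    2  /* binary 10 */

/* Tags 01 and 11 both mean fixnum, so testing the lowest bit suffices. */
int
sketch_fixnump (sketch_object obj)
{
  return (obj & 1) != 0;
}

int
sketch_charp (sketch_object obj)
{
  return (obj & SKETCH_TAG_MASK) == SKETCH_TAG_CHAR;
}

int
sketch_pointerp (sketch_object obj)
{
  return (obj & SKETCH_TAG_MASK) == SKETCH_TAG_POINTER;
}

/* A pointer is stored unchanged.  That only works if it is aligned to a
   multiple of 4, which the assertion checks. */
sketch_object
sketch_from_pointer (void *ptr)
{
  assert (((uintptr_t) ptr & SKETCH_TAG_MASK) == 0);
  return (sketch_object) (intptr_t) ptr;
}
```

Because the pointer tag is 00, converting such an object back to a pointer is a plain cast, with no masking or shifting.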

Lisp objects use the typedef Lisp_Object, but the actual C type behind it can vary. It can be either a simple integral type (generally long) or a structure whose fields are bit fields that line up properly (actually, a union of structures is used). Generally the simple integral type is preferable because it ensures that the compiler will actually use a machine word to represent the object (some compilers will use more general and less efficient code for unions and structs even if they can fit in a machine word). The union type, however, has the advantage of stricter type checking: if you accidentally pass an integer where a Lisp object is desired, you get a compile error. The choice of which type to use is determined by the preprocessor constant USE_UNION_TYPE, which is defined via the --use-union-type option to configure.
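The type-checking difference between the two representations can be seen in a small sketch. The names below are hypothetical, and the union is reduced to a single member for brevity; the real union of bit-field structures is more elaborate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two possible definitions of the
   object type; neither is the actual XEmacs declaration. */
typedef intptr_t sketch_word_object;                /* simple integral type */
typedef union { intptr_t word; } sketch_union_object; /* wrapped in a union */

/* With the integral type, any integer is silently accepted. */
intptr_t
sketch_word_bits (sketch_word_object obj)
{
  return obj;
}

/* With the union type, a call such as sketch_union_bits (42) is
   rejected by the compiler, because an integer is not a union. */
intptr_t
sketch_union_bits (sketch_union_object obj)
{
  return obj.word;
}

/* Values therefore have to be wrapped explicitly. */
sketch_union_object
sketch_union_wrap (intptr_t bits)
{
  sketch_union_object obj;
  obj.word = bits;
  return obj;
}
```

The cost of the stricter checking is the explicit wrapping step, and on some compilers less efficient code for passing and returning the union.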

Various macros convert between Lisp_Objects and the corresponding C types. Macros of the form XFIXNUM(), XCHAR(), XSTRING() and XSYMBOL() do any required bit shifting and/or masking and cast the result to the appropriate type. XFIXNUM() needs to be a bit tricky so that negative numbers are properly sign-extended. Since fixnums are stored left-shifted, the shift that removes the tag bit is enough, provided the right-shift operator does an arithmetic shift (i.e. it leaves the most-significant bit as-is rather than shifting in a zero, so that it mimics a divide-by-two even for negative numbers). This is the case on all the systems we support.
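As a rough illustration of the shifting involved, the following sketch encodes and decodes fixnums and characters. The function names are hypothetical and are not the actual macros; the decoding step assumes, as the text does, that right-shifting a negative signed value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not the actual XEmacs macros. */
typedef intptr_t sketch_object;

/* Encode a fixnum: shift left by one bit and set the low bit as the tag.
   The shift is done on the unsigned type because left-shifting a negative
   signed value is undefined in C; converting the result back to the
   signed type is implementation-defined but wraps on common platforms. */
sketch_object
sketch_make_fixnum (intptr_t n)
{
  return (sketch_object) (((uintptr_t) n << 1) | 1);
}

/* Decode a fixnum: one right shift drops the tag bit.  This relies on
   the shift being arithmetic so that the sign bit is preserved. */
intptr_t
sketch_fixnum_value (sketch_object obj)
{
  return obj >> 1;
}

/* Encode a character: the code goes above the two-bit tag 10. */
sketch_object
sketch_make_char (uint32_t code)
{
  return (sketch_object) (((uintptr_t) code << 2) | 2);
}

/* Decode a character: codes are never negative, so an unsigned
   (logical) shift is sufficient. */
uint32_t
sketch_char_value (sketch_object obj)
{
  return (uint32_t) ((uintptr_t) obj >> 2);
}
```

Round-tripping a negative number through sketch_make_fixnum() and sketch_fixnum_value() is the case that would break on a system whose right shift filled in zeros.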

Note that when ERROR_CHECK_TYPES is defined, the converter macros become more complicated: they check the tag bits and/or the type field in the first four bytes of a record type to ensure that the object is really of the correct type. This is great for catching places where an object is accessed as the wrong type, which typically results in a pointer being dereferenced as the wrong type of structure, with unpredictable (and sometimes not easily traceable) results.
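A checked converter of this kind might look roughly like the sketch below. Everything here is hypothetical (the structures, the SKETCH_ERROR_CHECK_TYPES switch and the SKETCH_XSTRING macro are invented for illustration); the only idea taken from the text is that a type field at the start of the record is compared before the cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layout: every record starts with a type field. */
enum sketch_record_type { SKETCH_TYPE_CONS, SKETCH_TYPE_STRING };

struct sketch_header
{
  enum sketch_record_type type;
};

struct sketch_string
{
  struct sketch_header header;  /* must be the first member */
  const char *data;
};

/* Checked conversion: verify the type field before casting. */
struct sketch_string *
sketch_checked_string (void *ptr)
{
  assert (((struct sketch_header *) ptr)->type == SKETCH_TYPE_STRING);
  return (struct sketch_string *) ptr;
}

/* The converter is a bare cast unless checking is switched on. */
#ifdef SKETCH_ERROR_CHECK_TYPES
# define SKETCH_XSTRING(ptr) (sketch_checked_string (ptr))
#else
# define SKETCH_XSTRING(ptr) ((struct sketch_string *) (ptr))
#endif
```

With checking enabled, handing a cons to SKETCH_XSTRING() stops at the assertion instead of silently reading the wrong structure.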

There are similar XSETTYPE() macros that construct a Lisp object. These macros are of the form XSETTYPE (lvalue, result), i.e. they must be used as statements and cannot appear inside an expression. The reason for this is that standard C doesn’t let you “construct” a structure (but GCC does). Granted, this sometimes isn’t too convenient; for the case of fixnums, at least, you can use the function make_fixnum(), which constructs and returns an integer Lisp object. Note that the XSETTYPE() macros are also affected by ERROR_CHECK_TYPES and, in the case of record types, where the type is contained in the structure, make sure that the structure is of the right type.
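The difference between the statement form and the function form can be sketched as follows, assuming a union-typed object. SKETCH_XSETFIXNUM and sketch_make_fixnum are hypothetical names that mirror the pattern described here; they are not the actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union-typed object, reduced to a single member. */
typedef union { intptr_t word; } sketch_object;

/* Statement form: assigns into an existing lvalue, because the union
   cannot be built inside an expression in standard (pre-C99) C. */
#define SKETCH_XSETFIXNUM(lvalue, n)                               \
  do                                                               \
    {                                                              \
      (lvalue).word = (intptr_t) (((uintptr_t) (n) << 1) | 1);     \
    }                                                              \
  while (0)

/* Function form: hides the temporary, so the result can be used
   directly inside an expression. */
sketch_object
sketch_make_fixnum (intptr_t n)
{
  sketch_object obj;
  SKETCH_XSETFIXNUM (obj, n);
  return obj;
}
```

A caller that only needs a value, for example as a function argument, would use the function form; the statement form fits when a variable is being filled in anyway.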

The C programmer is responsible for guaranteeing that a Lisp_Object is the correct type before using the XTYPE macros. This is especially important in the case of lists. Use XCAR and XCDR if a Lisp_Object is certainly a cons cell; otherwise use Fcar() and Fcdr(). Trust other C code, but not Lisp code. On the other hand, if XEmacs has an internal logic error, it’s better to crash immediately, so sprinkle assert()s and “unreachable” abort()s liberally about the source code. Where performance is an issue, use type_checking_assert, bufpos_checking_assert, gc_checking_assert, and the like, which do nothing unless the corresponding configure error-checking flag was specified.
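The division of labour between the unchecked and checked accessors, and an assertion that disappears in normal builds, might be sketched like this. All names are hypothetical, and returning NULL is only a stand-in for whatever error the real checked function reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object with an explicit kind field. */
enum sketch_kind { SKETCH_KIND_CONS, SKETCH_KIND_OTHER };

struct sketch_obj
{
  enum sketch_kind kind;
  struct sketch_obj *car;
  struct sketch_obj *cdr;
};

/* An assertion that costs nothing unless checking is configured in. */
#ifdef SKETCH_ERROR_CHECK_TYPES
# define sketch_type_checking_assert(cond) assert (cond)
#else
# define sketch_type_checking_assert(cond) ((void) 0)
#endif

/* Unchecked accessor in the style of XCAR: the caller guarantees that
   OBJ is a cons, so the check exists only in checking builds. */
struct sketch_obj *
sketch_xcar (struct sketch_obj *obj)
{
  sketch_type_checking_assert (obj->kind == SKETCH_KIND_CONS);
  return obj->car;
}

/* Checked accessor for values that may come from Lisp code: the kind is
   always tested.  NULL here stands in for reporting a type error. */
struct sketch_obj *
sketch_safe_car (struct sketch_obj *obj)
{
  if (obj == NULL || obj->kind != SKETCH_KIND_CONS)
    return NULL;
  return obj->car;
}
```

The unchecked accessor is appropriate after the code has already established the type; anything that arrives from Lisp should go through the checked one.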

This document was generated by Aidan Kehoe on December 27, 2016 using texi2html 1.82.