[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

42. Dumping


42.1 Dumping Justification

The C code of XEmacs is just a Lisp engine with a lot of built-in primitives useful for writing an editor. The editor itself is written mostly in Lisp, and represents around 100K lines of code. Loading and executing the initialization of all this code takes a bit of time (five to ten times the usual startup time of current xemacs) and requires having all the lisp source files around. Having to reload them each time the editor is started would not be acceptable.

The traditional solution to this problem is called dumping: the build process first creates the lisp engine under the name ‘temacs’, then runs it until it has finished loading and initializing all the lisp code, and finally creates a new executable called ‘xemacs’ that includes both the object code in ‘temacs’ and all the contents of the memory after the initialization.

This solution, while working, has a huge problem: creating the new executable from the actual contents of memory is an extremely system-specific and error-prone process that interferes with a lot of system libraries (like malloc). It is getting even worse nowadays with libraries that use constructors, which are called automatically when the program is started (even before main()) and tend to crash when they are called multiple times, once before dumping and once after (IRIX 6.x ‘libz.so’ pulls in some C++ image libraries through dependencies which have this problem). Writing the dumper is also one of the most difficult parts of porting XEmacs to a new operating system. Basically, ‘dumping’ is an operation that is just not officially supported on many operating systems.

The aim of the portable dumper is to solve the same problem as the system-specific dumper, namely to quickly reload the fully initialized lisp part of the editor, using only a small number of files and without any system-specific hacks.


42.2 Overview

The portable dumping system has to:

  1. At dump time, write all initialized, non-quickly-rebuildable data to a file [Note: currently named ‘xemacs.dmp’, but the name will change], along with all information needed for the reloading.
  2. When starting xemacs, reload the dump file, relocate it to its new starting address if needed, and reinitialize all pointers to this data. Also, rebuild all the quickly rebuildable data.

Note: As of 21.5.18, the dump file has been moved inside of the executable, although there are still problems with this on some systems.


42.3 Data descriptions

The most complex task of the dumper is to write memory blocks on the heap (lisp objects, i.e. lrecords, and C-allocated memory, such as structs and arrays) to disk and reload them at a different address, updating all the pointers they include in the process. This is done by using external data descriptions that give information about the layout of the blocks in memory.

The specification of these descriptions is in lrecord.h. A description of an lrecord is an array of struct memory_description. Each of these structs includes a type, an offset in the block and some optional parameters depending on the type. For instance, here is the string description:

 
static const struct memory_description string_description[] = {
  { XD_BYTECOUNT,         offsetof (Lisp_String, size) },
  { XD_OPAQUE_DATA_PTR,   offsetof (Lisp_String, data), XD_INDIRECT(0, 1) },
  { XD_LISP_OBJECT,       offsetof (Lisp_String, plist) },
  { XD_END }
};

The first line indicates a member of type Bytecount, which is used by the next, indirect directive. The second means "there is a pointer to some opaque data in the field data". The length of said data is given by the expression XD_INDIRECT(0, 1), which means "the value in the 0th line of the description (welcome to C) plus one". The third line means "there is a Lisp_Object member plist in the Lisp_String structure". XD_END then ends the description.

This gives us all the information we need to move around what is pointed to by a memory block (C or lrecord) and, by transitivity, everything that it points to. The only missing information for dumping is the size of the block. For lrecords, this is part of the lrecord_implementation, so we don’t need to duplicate it. For C blocks we use a struct sized_memory_description, which includes a size field and a pointer to an associated array of memory_description.
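
The way an indirect count such as XD_INDIRECT(0, 1) is resolved can be illustrated with a minimal sketch. Everything below is hypothetical: the toy_ names, the simplified description struct and the assumption that the referenced member is a long are stand-ins chosen for illustration, not the definitions found in lrecord.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real description types. */
enum toy_type { TOY_BYTECOUNT, TOY_OPAQUE_DATA_PTR, TOY_LISP_OBJECT, TOY_END };

struct toy_description {
  enum toy_type type;
  size_t offset;        /* offset of the member within the block */
  int indirect_line;    /* description line holding the count, -1 if unused */
  long indirect_delta;  /* constant added to that count */
};

struct toy_string {
  long size;            /* plays the role of the Bytecount member */
  const char *data;     /* size + 1 bytes of opaque data */
  void *plist;
};

static const struct toy_description toy_string_description[] = {
  { TOY_BYTECOUNT,       offsetof (struct toy_string, size),  -1, 0 },
  { TOY_OPAQUE_DATA_PTR, offsetof (struct toy_string, data),   0, 1 },
  { TOY_LISP_OBJECT,     offsetof (struct toy_string, plist), -1, 0 },
  { TOY_END,             0,                                   -1, 0 }
};

/* Length of the opaque data referenced by description line LINE of the
   block OBJ: the value stored at the indirect line, plus the delta.
   Assumes the member at the indirect line is a long. */
static long
toy_opaque_length (const void *obj, const struct toy_description *desc,
                   int line)
{
  const struct toy_description *src;

  assert (desc[line].type == TOY_OPAQUE_DATA_PTR);
  assert (desc[line].indirect_line >= 0);
  src = &desc[desc[line].indirect_line];
  return *(const long *) ((const char *) obj + src->offset)
    + desc[line].indirect_delta;
}
```

For a toy_string whose size member is 5, toy_opaque_length on line 1 yields 6, i.e. the five bytes of text plus one.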


42.4 Dumping phase

Dumping is done by calling the function pdump() (in ‘dumper.c’) which is invoked from Fdump_emacs (in ‘emacs.c’). This function performs a number of tasks.


42.4.1 Object inventory

The first task is to build the list of the objects to dump. This includes lisp objects and other memory blocks (C structures, arrays, etc.).

We end up with one pdump_block_list_elt per object group (arrays of C structs are kept together) which includes a pointer to the first object of the group, the per-object size and the count of objects in the group, along with some other information which is initialized later.

These entries are linked together in pdump_block_list structures and can be enumerated through either:

  1. the pdump_object_table, an array of pdump_block_list, one per lrecord type, indexed by type number.
  2. the pdump_opaque_data_list, used for the opaque data which does not include pointers, and hence does not need descriptions.
  3. the pdump_desc_table, which is a vector of memory_description/pdump_block_list pairs, used for non-opaque C memory blocks.

This uses a marking strategy similar to that of the garbage collector. There are some differences, though:

  1. We do not use the mark bit (which does not exist for generic memory blocks anyway); we use a big hash table instead.
  2. We do not use the mark function of lrecords but instead rely on the external descriptions. This happens essentially because we need to follow pointers to generic memory blocks and opaque data in addition to Lisp_Object members.

This is done by pdump_register_object(), which handles Lisp_Object variables, and pdump_register_block(), which handles generic memory blocks (C structures, arrays, etc.). Both delegate the description management to pdump_register_sub().

The hash table doubles as a map from objects to pdump_block_list_elt entries (i.e. it allows us to look up the pdump_block_list_elt for a given object pointer). Entries are added with pdump_add_block() and looked up with pdump_get_block(). There is no need for entry removal. The hash value is computed quite simply from the object pointer by pdump_make_hash().
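
A pointer-keyed table of this kind can be sketched as follows. This is only an illustration under stated assumptions: the toy_ names, the fixed table size, the hash formula and the use of linear probing are all hypothetical choices, not the code in ‘dumper.c’.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed size; must stay larger than the number of blocks. */
#define TOY_HASH_SIZE 1024

struct toy_block_elt {
  const void *obj;   /* first object of the group */
  size_t size;       /* per-object size */
  int count;         /* number of objects in the group */
};

static struct toy_block_elt *toy_hash[TOY_HASH_SIZE];

/* Hash an object pointer; the low bits of aligned pointers carry
   little information, so they are shifted out first. */
static size_t
toy_make_hash (const void *obj)
{
  return (size_t) (((uintptr_t) obj >> 3) % TOY_HASH_SIZE);
}

/* Return the entry registered for OBJ, or NULL if OBJ has not been
   seen yet.  A non-NULL result plays the role of a set mark bit. */
static struct toy_block_elt *
toy_get_block (const void *obj)
{
  size_t pos = toy_make_hash (obj);

  while (toy_hash[pos] != NULL)
    {
      if (toy_hash[pos]->obj == obj)
        return toy_hash[pos];
      pos = (pos + 1) % TOY_HASH_SIZE;   /* linear probing */
    }
  return NULL;
}

/* Register ELT under its object pointer.  Entries are never removed. */
static void
toy_add_block (struct toy_block_elt *elt)
{
  size_t pos = toy_make_hash (elt->obj);

  assert (toy_get_block (elt->obj) == NULL);   /* register each object once */
  while (toy_hash[pos] != NULL)
    pos = (pos + 1) % TOY_HASH_SIZE;
  toy_hash[pos] = elt;
}
```

Because nothing is ever removed, open addressing needs no tombstones, which keeps both functions to a few lines.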

The roots for the marking are:

  1. the staticpro’ed variables (there is a special staticpro_nodump() call for protected variables we do not want to dump).
  2. the Lisp_Object variables registered via dump_add_root_lisp_object (staticpro() is equivalent to staticpro_nodump() + dump_add_root_lisp_object()).
  3. the data-segment memory blocks registered via dump_add_root_block (for blocks with relocatable pointers), or dump_add_opaque (for "opaque" blocks with no relocatable pointers; this is just a shortcut for calling dump_add_root_block with a NULL description).
  4. the pointer variables registered via dump_add_root_block_ptr, each of which points to a block of heap memory (generally a C structure or array). Note that dump_add_root_block_ptr is not technically necessary, as a pointer variable can be seen as a special case of a data-segment memory block and registered using dump_add_root_block. Doing it this way, however, would require declaring another level of static structures. Since pointer variables are quite common, dump_add_root_block_ptr is provided for convenience. Note also that internally we have to treat it separately from dump_add_root_block rather than writing the former as a call to the latter, since we don’t have support for creating and using memory descriptions on the fly: they must all be statically declared in the data segment.

This does not include the GCPRO’ed variables, the specbinds, the catchtags, the backlist, the redisplay or the profiling info, since we do not want to rebuild the actual chain of lisp calls which lead up to the dump-emacs call, only the global variables.

Weak lists and weak hash tables are dumped as if they were their non-weak equivalent (without changing their type, of course). This has not yet been a problem.


42.4.2 Address allocation

The next step is to allocate the offsets of each of the objects in the final dump file. This is done by pdump_allocate_offset() which is called indirectly by pdump_scan_by_alignment().

The strategy to deal with alignment problems uses these facts:

  1. real world alignment requirements are powers of two.
  2. the C compiler is required to adjust the size of a struct so that you can have an array of them next to each other. This means you can obtain an upper bound on the alignment requirement of a given structure by finding the largest power of two of which its size is a multiple.
  3. the non-variant part of variable size lrecords has an alignment requirement of 4.

Hence, for each lrecord type, C struct type or opaque data block the alignment requirement is computed as a power of two, with a minimum of 2^2 for lrecords. pdump_scan_by_alignment() then scans all the pdump_block_list_elt entries, the ones with the highest requirements first. This ensures the best packing.

The maximum alignment requirement we take into account is 2^8.

pdump_allocate_offset() only has to do a linear allocation, starting at offset 256 (this leaves room for the header and keeps the alignments happy).
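
The alignment rule and the linear allocation can be sketched together. This is a simplified illustration, not the code of pdump_scan_by_alignment(): the toy_ names are hypothetical, blocks are represented only by their sizes, and the 2^2 minimum for lrecords is left out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SHIFT   8     /* alignments are capped at 2^8 */
#define TOY_BASE_OFFSET 256   /* first offset after the header */

/* Exponent of the largest power of two dividing SIZE, capped at 8.
   This is the upper bound on the alignment requirement. */
static int
toy_align_shift (size_t size)
{
  int shift = 0;

  while (shift < TOY_MAX_SHIFT && size != 0 && (size & 1) == 0)
    {
      size >>= 1;
      shift++;
    }
  return shift;
}

/* Assign a file offset to each of the N blocks, highest alignment
   first.  SIZES[i] is the size of block i, OFFSETS[i] receives its
   offset.  Returns the offset just past the last block. */
static size_t
toy_allocate_offsets (const size_t *sizes, size_t *offsets, size_t n)
{
  size_t cur = TOY_BASE_OFFSET;
  size_t i;
  int shift;

  for (shift = TOY_MAX_SHIFT; shift >= 0; shift--)
    for (i = 0; i < n; i++)
      if (toy_align_shift (sizes[i]) == shift)
        {
          /* With this ordering no padding is ever needed. */
          assert (cur % ((size_t) 1 << shift) == 0);
          offsets[i] = cur;
          cur += sizes[i];
        }
  return cur;
}
```

Since every size is a multiple of its own alignment, and blocks are placed in decreasing alignment order starting from a 256-aligned offset, the running offset is always suitably aligned for the next block; that is why a plain linear allocation suffices.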


42.4.3 The header

The next step creates the file and writes a header with a signature and some miscellaneous information in it. The reloc_address field, which indicates at which address the file should be loaded if we want to avoid post-reload relocation, is set to 0. It then seeks to offset 256 (base offset for the objects).
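
As a rough illustration of such a header, here is a minimal sketch that writes a header into a buffer standing in for the file and checks it again. The layout, the signature string, the build_id field and all toy_ names are assumptions made for this example; the actual header is defined in ‘dumper.c’.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIGNATURE     "ToyDump1"   /* hypothetical 8-byte signature */
#define TOY_SIGNATURE_LEN 8
#define TOY_BASE_OFFSET   256          /* objects start here */

struct toy_header {
  char signature[TOY_SIGNATURE_LEN];
  uint32_t build_id;        /* assumed: identifies the matching executable */
  uint64_t reloc_address;   /* 0: pointers are stored as file offsets */
};

/* Write the header at the start of FILE, a buffer of at least
   TOY_BASE_OFFSET bytes standing in for the dump file. */
static void
toy_write_header (unsigned char *file, uint32_t build_id)
{
  struct toy_header h;

  assert (sizeof h <= TOY_BASE_OFFSET);
  memset (&h, 0, sizeof h);
  memcpy (h.signature, TOY_SIGNATURE, TOY_SIGNATURE_LEN);
  h.build_id = build_id;
  h.reloc_address = 0;
  memcpy (file, &h, sizeof h);
}

/* Return nonzero if FILE starts with a header written for BUILD_ID. */
static int
toy_check_header (const unsigned char *file, uint32_t build_id)
{
  struct toy_header h;

  memcpy (&h, file, sizeof h);
  return memcmp (h.signature, TOY_SIGNATURE, TOY_SIGNATURE_LEN) == 0
    && h.build_id == build_id;
}
```

Reserving a fixed 256 bytes for the header means the header can grow without disturbing the alignment of the objects that follow it.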


42.4.4 Data dumping

The data is dumped by pdump_dump_data(), called from pdump_scan_by_alignment(), in the same order as the addresses were allocated. This function copies the data to a temporary buffer, relocates all pointers in the object to the addresses allocated in the Address allocation step, and writes it to the file. Using the same order means that, if we are careful with lrecords whose size is not a multiple of 4, we are assured that the object is always written at the offset in the file allocated in the Address allocation step.
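
The copy, relocate and write sequence can be sketched on a toy linked list. This is an illustration only: the toy_ names are hypothetical, a byte buffer stands in for the dump file, offsets are looked up by a linear search rather than a hash table, and the single pointer member is handled by hand instead of through a description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_node {
  int value;
  struct toy_node *next;   /* the only pointer member */
};

/* One registered object and the file offset allocated for it. */
struct toy_entry {
  const struct toy_node *obj;
  size_t offset;
};

/* File offset allocated for OBJ, or 0 for a null or unknown pointer.
   Offset 0 lies inside the header, so it never names an object. */
static size_t
toy_offset_of (const struct toy_entry *table, size_t n,
               const struct toy_node *obj)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (table[i].obj == obj)
      return table[i].offset;
  return 0;
}

/* Copy every node to OUT at its allocated offset, with its next
   pointer replaced by the offset of the node it points to.  The
   original objects are left untouched. */
static void
toy_dump_nodes (const struct toy_entry *table, size_t n, unsigned char *out)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct toy_node tmp;

      assert (table[i].offset >= 256);                   /* past the header */
      memcpy (&tmp, table[i].obj, sizeof tmp);           /* temporary copy */
      tmp.next = (struct toy_node *) (uintptr_t)
        toy_offset_of (table, n, tmp.next);              /* relocate */
      memcpy (out + table[i].offset, &tmp, sizeof tmp);  /* "write" */
    }
}
```

Working on a temporary copy matters: the running process keeps using the original objects, so their pointers must stay valid while the relocated version goes to the file.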


42.4.5 Pointers dumping

A bunch of tables needed to properly reassign the global pointers are then written. They are:

  1. the pdump_root_block_ptrs dynarr
  2. the pdump_opaques dynarr
  3. a vector of all the offsets to the objects in the file that include a description (for faster relocation at reload time)
  4. the pdump_root_objects and pdump_weak_object_chains dynarrs.

For each of the dynarrs we write both the pointer to the variables and the relocated offset of the object they point to. Since these variables are global, the pointers are still valid when restarting the program and are used to regenerate the global pointers.

The pdump_weak_object_chains dynarr is a special case. The variables it points to are the heads of weak linked lists of lisp objects of the same type. Not all objects of such a list are dumped, so the relocated pointer we associate with each variable points to the first dumped object of the list, or Qnil if none is available. This is also the reason why they are not used as roots for the purpose of object enumeration.
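
Finding the object that a weak chain head should point to after reload amounts to skipping the leading objects that were not dumped. A minimal sketch, assuming a hypothetical dumped flag per object (the toy_ names are invented for this example, and NULL stands in for Qnil):

```c
#include <assert.h>
#include <stddef.h>

struct toy_weak_obj {
  int dumped;                       /* assumed flag: was this object dumped? */
  struct toy_weak_obj *next_weak;   /* link in the weak chain */
};

/* Return the first dumped object of the chain starting at HEAD, or
   NULL (standing in for Qnil) if no object of the chain was dumped. */
static struct toy_weak_obj *
toy_first_dumped (struct toy_weak_obj *head)
{
  while (head != NULL && !head->dumped)
    head = head->next_weak;
  assert (head == NULL || head->dumped);
  return head;
}
```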

Some very important information, like the staticpros and lrecord_implementations_table, is handled indirectly using dump_add_opaque or dump_add_root_block_ptr.

This is the end of the dumping part.


42.5 Reloading phase


42.5.1 File loading

The file is mmap’ed into memory (which ensures a PAGESIZE alignment, at least 4096) or, if mmap is unavailable or fails, a 256-byte-aligned block is malloc’ed and the file is loaded into it.

Some variables are reinitialized from the values found in the header.

The difference between the actual loading address and the reloc_address is computed and will be used for all the relocations.


42.5.2 Putting back the pdump_opaques

The memory contents are restored in the obvious and trivial way.


42.5.3 Putting back the pdump_root_block_ptrs

The variables pointed to by pdump_root_block_ptrs in the dump phase are reset to the right relocated object addresses.


42.5.4 Object relocation

All the objects are relocated using their description and their offset by pdump_reloc_one. This step is unnecessary if the reloc_address is equal to the file loading address.
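
Relocating one object boils down to adding the load delta to each pointer it contains. The sketch below shows this on a toy structure with a single pointer member; the toy_ names are hypothetical and the pointer is patched by hand, whereas the real code walks the object's description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_node {
  int value;
  struct toy_node *next;   /* stored in the file as reloc_address + offset */
};

/* Relocate the node stored OFFSET bytes into the dump loaded at BASE.
   RELOC_ADDRESS is the address the dump was prepared for (0 when
   pointers were written as plain file offsets). */
static void
toy_reloc_one (unsigned char *base, size_t offset, uintptr_t reloc_address)
{
  uintptr_t delta = (uintptr_t) base - reloc_address;
  struct toy_node node;

  assert (base != NULL);
  if (delta == 0)
    return;   /* loaded at the expected address: nothing to patch */

  /* memcpy avoids any alignment assumption about BASE + OFFSET. */
  memcpy (&node, base + offset, sizeof node);
  if (node.next != NULL)
    node.next = (struct toy_node *) ((uintptr_t) node.next + delta);
  memcpy (base + offset, &node, sizeof node);
}
```

The early return corresponds to the case where the file is loaded at its reloc_address: the stored pointers are already correct and the data never has to be written to.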


42.5.5 Putting back the pdump_root_objects and pdump_weak_object_chains

Same as Putting back the pdump_root_block_ptrs.


42.5.6 Reorganize the hash tables

Since some of the hash values in the lisp hash tables are address-dependent, their layout is now wrong. So we go through each of them and have them reorganized by calling pdump_reorganize_hash_table.
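
Why the layout goes stale, and how reinserting every key repairs it, can be shown on a tiny address-keyed table. This is a sketch under stated assumptions: the toy_ names, the eight-slot table and the probing scheme are invented for the example and say nothing about how lisp hash tables are actually implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 8   /* assumed tiny table; must never fill up */

struct toy_table {
  const void *keys[TOY_SLOTS];   /* NULL marks an empty slot */
};

/* The slot depends on the key's address, hence on where it was loaded. */
static size_t
toy_slot (const void *key)
{
  return (size_t) (((uintptr_t) key >> 3) % TOY_SLOTS);
}

static void
toy_put (struct toy_table *t, const void *key)
{
  size_t pos = toy_slot (key);

  assert (key != NULL);
  while (t->keys[pos] != NULL)
    pos = (pos + 1) % TOY_SLOTS;
  t->keys[pos] = key;
}

static int
toy_has (const struct toy_table *t, const void *key)
{
  size_t pos = toy_slot (key);

  while (t->keys[pos] != NULL)
    {
      if (t->keys[pos] == key)
        return 1;
      pos = (pos + 1) % TOY_SLOTS;
    }
  return 0;
}

/* Rebuild the table by reinserting every key at the slot that
   matches its current address. */
static void
toy_reorganize (struct toy_table *t)
{
  struct toy_table fresh = { { NULL } };
  size_t i;

  for (i = 0; i < TOY_SLOTS; i++)
    if (t->keys[i] != NULL)
      toy_put (&fresh, t->keys[i]);
  *t = fresh;
}
```

A key that sits in the slot computed from its old address is simply not found by a lookup that hashes its new address, even though it is present; after the rebuild the lookup succeeds again.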


42.6 Remaining issues

The build process will have to start a post-dump xemacs, ask it for its loading address (which will, hopefully, always be the same between different xemacs invocations) [[unfortunately, not true on Linux with the ExecShield feature]] and relocate the file to that address. This way the object relocation phase will not have to be done, which means no writes to the objects and, because of the use of mmap, the dumped data will be shared between all the xemacs processes running on the computer.

Some executable signature will be necessary to ensure that a given dump file is really associated with a given executable, or random crashes will occur. This could be a random number set at compile or configure time through a define. It would also allow for having differently-compiled xemacsen on the same system (mule and no-mule come to mind).

The DOC file contents should probably end up in the dump file.


This document was generated by Aidan Kehoe on December 27, 2016 using texi2html 1.82.