
2. An overview of Mule-UCS

After the installation of Mule-UCS into your Emacs, you will be able to access Unicode files transparently. All that is needed is to load the `un-define' library. Mule-UCS implements rather low-level functions, and once loaded, the user should never notice that coding systems implemented via Mule-UCS are any different from those implemented in C or CCL.

Mule-UCS contains large tables, and takes about 4 seconds to load on a 450MHz Pentium III notebook. Thus, if you use Unicode at all regularly, it is recommended that you load the Mule-UCS Unicode coding systems at startup by including

(require 'un-define)

in your init file. Otherwise, you must load `un-define' by hand, using load-library. Also, by default XEmacs does not autodetect Unicode. For the most common case, UTF-8, include

(set-coding-priority-list '(utf-8))
(set-coding-category-system 'utf-8 'utf-8)

in your init file. UTF-8 has a very characteristic signature; false negatives and positives should be very rare.
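The two fragments above can be combined into a single guarded init-file sketch. This is only an illustration, assuming an XEmacs built with Mule support; the `featurep', `locate-library', and `fboundp' guards are additions not found in the text above, and are there only to keep the init file from signalling an error on a machine where Mule-UCS is not installed.

```elisp
;; Init-file sketch (assumption: XEmacs with Mule support).
;; The guards are illustrative additions; the three inner forms are
;; the ones recommended in the text above.
(when (and (featurep 'mule)
           (locate-library "un-define"))
  ;; Load the Mule-UCS Unicode coding systems at startup.
  (require 'un-define)
  ;; `set-coding-priority-list' is XEmacs-specific, so check for it.
  (when (fboundp 'set-coding-priority-list)
    (set-coding-priority-list '(utf-8))
    (set-coding-category-system 'utf-8 'utf-8)))
```

With this in place the load-time cost is paid once at startup rather than on the first visit to a Unicode file.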

Autodetecting 16-bit wide-char versions of Unicode is not currently implemented in XEmacs itself. Mule-UCS provides some utilities in the `un-tools' library, but these are of unknown reliability.

Since Mule-UCS uses regular Mule code internally, and does not create an internal Mule charset for UCS, your normal input methods should work with Unicode files without any change in your setup or habits. This holds whether the input method is native (Wnn), Lisp plus a backend (new Tamago), all in Lisp (Quail), or XIM-based (kinput2). Input methods supported by terminals (cxterm, localized keyboards) should also work, provided they already work with the native encoding (for example, the native Chinese encoding in the case of cxterm) and the terminal coding system is set properly by `set-terminal-coding-system'.
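As an illustration of the last point, a terminal such as cxterm that sends and receives Big5 could be matched as follows. This is a sketch only: the coding system name `big5' and the `find-coding-system' guard are assumptions about your XEmacs/Mule setup, so substitute whatever encoding your terminal really uses.

```elisp
;; Sketch: tell XEmacs which encoding the terminal itself speaks.
;; Assumption: the terminal uses Big5 and a coding system named
;; `big5' is defined; adjust the name to match your terminal.
(when (and (fboundp 'find-coding-system)
           (find-coding-system 'big5))
  (set-terminal-coding-system 'big5))
```

The terminal coding system describes the terminal, not the file, so it stays the same whether or not the file you are visiting is encoded in Unicode.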

Mule-UCS was written by a Japanese developer and thus gives priority to Japanese by default. This means that Unicode characters that are unified from various Asian character sets (e.g., the single horizontal stroke meaning "one", which is present in all of them) will be represented in the Mule buffer as Japanese characters, and displayed with a Japanese font. No information will be lost or corrupted as long as you save back to Unicode. (That's what "unification" means.)

However, if you wish to use Mule-UCS to translate Unicode to national subsets other than ASCII, Latin-1, and Japanese, you must change the priorities. This also allows you to satisfy cultural preferences for glyph styles by defaulting to an appropriate font. Use `un-define-change-charset-order'. For the common case of the Latin character sets, where by international standard as well as common practice characters common to more than one character set are considered identical (not "unified" as for the Han characters in Unicode), the `latin-unity' package may be of use.

(Mule-UCS does not understand Plane 14 tags. Therefore, translating multilingual texts into non-Unicode encodings such as ISO 2022 will have to be done by hand.)

That is all that most users of Mule-UCS need to know.

Mule-UCS is still under development, and any problems you encounter, trivial or major, should be reported to the Mule-UCS developers. Use the standard package bug address, mule-ucs-bugs@xemacs.org.

Behind the scenes

This section tries to explain what goes on behind the scenes when you visit a file encoded in Unicode with Mule-UCS.

#### to be written


This document was generated by XEmacs Webmaster on June, 14 2002 using texi2html