
Emacs and Unicode Tips


Xah Lee, 2006-07, …, 2010-10-27, 2011-01-28

This page gives some tips about using emacs and unicode. If you work in 2 languages, or type a lot of math symbols, you'll find this page useful.


A screenshot of emacs window showing unicode chars. You can download this text here: unicode.txt.

This page covers Emacs version 22 (released in 2007) and emacs 23 (released in 2009-07). You should use emacs 23 if unicode is important to you, because emacs 23 uses unicode as its internal encoding and also supports OS fonts. (See: New Features in Emacs 23)

Typing Unicode Characters

How to type this character é ?

Here's a table on how to type these chars:

Character    Key Press
é            Ctrl+x 8 ' e
à            Ctrl+x 8 ` a
î            Ctrl+x 8 ^ i
ñ            Ctrl+x 8 ~ n
ü            Ctrl+x 8 " u

To see all characters you can type this way, press 【Ctrl+x 8 Ctrl+h】. Examples: ¿ ¡ ¢ £ ¥ ¤ § ¶ ® © ª «» × ÷ ¬ ° ± µ ÀÁÂÃÄÅÆ Ç ÈÉÊË ÌÍÎÏ ÐÑ ÒÓÔÕÖ ØÙÚÛÜÝÞß àáâãäåæç èéêë ìíîï ðñòóôõö øùúûüýþÿ.

If you need to type these chars often, you can set your input method to “latin-9-prefix” (type 【Alt+x set-input-method】). That will allow you to type these chars without typing 【Ctrl+x 8】 first.

(Emacs's “latin-9-prefix” corresponds to the char set ISO/IEC 8859-15, also known as Latin-9.)
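
If you'd rather not pick the input method by hand in every session, you can name it in your emacs init file. This is a minimal sketch, assuming “latin-9-prefix” is the method you want 【Ctrl+\】 to switch on:

```elisp
;; Assumption: latin-9-prefix is the input method you want by default.
;; This only names the method that toggle-input-method (C-\) normally
;; turns on; it does not activate the input method by itself.
(setq default-input-method "latin-9-prefix")
```

With this in place, pressing 【Ctrl+\】 in a buffer should turn the prefix input method on, and pressing it again turns it off.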

If you are on a Mac, you can type these characters by holding down the Option key or by using the Character Palette. On Windows, you can use Windows Alt keycodes or Charmap.


Mac OS X Keyboard Viewer

How to insert a unicode character by name?

With emacs version 23, type 【Ctrl+x 8 Enter】 (ucs-insert), then the name of the unicode character. For example, try inserting “→”. Its name is “RIGHTWARDS ARROW”.

You can use asterisk * to match chars. For example, call “ucs-insert”, then type *arrow then Tab, then emacs will show all chars with “arrow” in their names.

How to insert a unicode character by its hex value?

Type 【Ctrl+x 8 Enter】 (ucs-insert), then the hex value of the unicode character. For example, try inserting “→”. Its hex value is 2192.
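
The same insertion can be done from Lisp code. Here is a minimal sketch, assuming Emacs 22 or 23, where “decode-char” with the 'ucs charset turns a unicode code point into an emacs character. The command name “my-insert-right-arrow” is made up for illustration.

```elisp
(defun my-insert-right-arrow ()
  "Insert → (RIGHTWARDS ARROW, hex 2192) at point."
  (interactive)
  ;; #x2192 is Emacs Lisp read syntax for the hex number 2192.
  (insert (decode-char 'ucs #x2192)))
```

After evaluating it, 【Alt+x my-insert-right-arrow】 inserts the arrow at the cursor.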

Alternatively, press 【Alt+x set-input-method】 and give the value “ucs”. Once you are in the ucs input method, you can type “u” followed by a hex value, and emacs will insert the unicode char with that hex value. For example, try typing the Greek lower case alpha “α” by its hex value 03B1: type “u03B1”.

To turn off the input method, press 【Ctrl+\】 (toggle-input-method).

If you have the decimal value of a unicode char, you can first find its hex value using the built-in calculator. Suppose your character in decimal is 945. Type 【Alt+x calc】 to start calc, then type “945” and press Enter. Now type “d6”, which puts calc in hex mode. You can read off the screen that the hex value is 3B1 (calc shows it as 16#3B1). To put calc back in decimal mode, type “d0”. To quit calc, type “q”.
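
If you prefer not to start calc, the conversion can also be done by evaluating a Lisp expression, for example with 【Alt+x eval-expression】. This is a small sketch using the standard “format” and “string-to-number” functions:

```elisp
(format "%X" 945)            ; decimal 945 as a hex string: "3B1"
(string-to-number "3B1" 16)  ; hex string back to decimal: 945
```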

How to open a unicode character palette?

You can put frequently used unicode chars into a file and save it, and define a keystroke to open this file, so that you can copy and paste the chars you want. Here's how you can define a keystroke to open a file. Put the following in your emacs init file.

; open my unicode template with F8 key
(global-set-key (kbd "<f8>")
  (lambda () (interactive) (find-file "~/my_unicode_template.txt")))

Here's an example of a template: unicode.txt.

You can also install the xub Unicode Browser mode. It lets you easily find the char you want.

How to set a keystroke to insert a unicode char?

If you have some characters that you use often, you can make emacs insert them with a single keypress. For example, put the following code in your emacs init file; then, each time you press the 6 key on the number pad, an arrow will be inserted.

(global-set-key (kbd "<kp-6>") "→") ; the 6 key on numeric keypad

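You can also set a shortcut as a key sequence. Because 【Alt+i】 is bound to a command by default, unbind it first so that it can serve as a prefix key. Like this:

(global-unset-key (kbd "M-i")) ; free M-i so it can start a key sequence
(global-set-key (kbd "M-i a") "α")
(global-set-key (kbd "M-i b") "β")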

With the above, typing 【Alt+i a】 will insert α. This way you can set a whole collection of unicode chars.
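
For a larger collection, writing one line per character gets tedious. The following is a sketch of one way to define the bindings from a table. The table contents are only an illustration, and it assumes you are willing to give up the default command on M-i.

```elisp
;; Assumption: you are willing to give up the default command on M-i
;; so that it can act as a prefix key.
(global-unset-key (kbd "M-i"))

;; Hypothetical table: (key . character-to-insert). Extend as needed.
(dolist (pair '(("a" . "α") ("b" . "β") ("g" . "γ") ("r" . "→")))
  (global-set-key (kbd (concat "M-i " (car pair))) (cdr pair)))
```

Each entry binds 【Alt+i】 followed by the given key to insert the given character, so adding a symbol is a matter of adding one pair to the list.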

Alternatively, you can use key-translation-map:

(define-key key-translation-map (kbd "M-i a") (kbd "α"))
(define-key key-translation-map (kbd "M-i b") (kbd "β"))

For the difference, see: Emacs: Remapping Keys Using key-translation-map.

How to use abbrev to input unicode chars?

Put the following in your emacs init file:

(define-abbrev-table 'global-abbrev-table '(
    ("alpha" "α" nil 0)
    ("beta" "β" nil 0)
    ("gamma" "γ" nil 0)
    ("theta" "θ" nil 0)
    ("inf" "∞" nil 0)

    ("ar1" "→" nil 0)
    ("ar2" "⇒" nil 0)
    ))

(setq-default abbrev-mode t) ; turn on abbrev mode in all buffers

Select the code above and type 【Alt+x eval-region】.

Now type “alpha” followed by a space. It will become “α ”.

See: Using Emacs's Abbrev Mode for Abbreviation.

A System for Inputting Hundreds of Math Symbols

If you do math a lot, use Emacs Math Symbols Input Mode (xmsi-mode).

Typing Chinese or Non-Latin Languages

How to type Chinese?

Regardless of what text editor you are using, you need to do two things: (1) Set your editor's character encoding to one that supports your language. (2) Set your input method to a system suitable for your language.

Character encoding tells your computer how to map characters into binary code. An input method allows you to type languages that are not based on the Latin alphabet. (For example, in Chinese, you cannot type a character by pressing a single key. Instead, you must use an input method.) For English and most European languages, you don't need to worry about the input method.
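
To make the idea of an encoding concrete, here is a small sketch that asks emacs for the bytes of one character under UTF-8. It assumes Emacs 23 and uses the standard “encode-coding-string” function; you can evaluate it in the *scratch* buffer.

```elisp
;; Encode the one-character string "★" as UTF-8 and show each byte in hex.
(mapcar (lambda (byte) (format "%02X" byte))
        (encode-coding-string "★" 'utf-8))
;; result: ("E2" "98" "85")
```

Under a different encoding, such as UTF-16, the same character maps to a different byte sequence, which is why the editor and the file must agree on the encoding.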

To set your file encoding in emacs, use the menu 〖Options▸Mule (Multilingual Environment)▸Set Language Environment〗.

To set your input method, use the menu 〖Options▸Mule (Multilingual Environment)▸Select Input Method…〗.

After you've made your choices in the menu, be sure to also choose the menu command 〖Options▸Save Options〗 so that emacs remembers your settings.

I type Chinese sometimes. There are several encoding systems that support Chinese, for example GB 18030 (used in China), Big5 (popular in Taiwan), UTF-8 and UTF-16. I use the UTF-8 encoding system. Among the Chinese input methods, I use the Pinyin method. Here's how to set them in emacs without using the menu: 【Alt+x set-language-environment UTF-8】 and 【Alt+x set-input-method chinese-py】.
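
To have these choices in every session, the same settings can go in your emacs init file. This is a minimal sketch, assuming you want UTF-8 as the language environment and Pinyin as the input method that 【Ctrl+\】 switches on:

```elisp
;; Assumption: you want UTF-8 and Pinyin as your defaults.
(set-language-environment "UTF-8")
;; Set this after set-language-environment, which may choose its own
;; default input method. C-\ then turns Pinyin on and off.
(setq default-input-method "chinese-py")
```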

Here's an example of actually typing the Chinese char 美 (meaning beautiful). Type 【Alt+x set-input-method Enter chinese-py】, then type “mei”. Emacs will show you a list of characters pronounced mei. Type “2” to pick the correct character, and emacs will insert it. To turn off the input method, press 【Ctrl+\】.

An in-depth tutorial on using a Mac with Chinese is at: http://www.yale.edu/chinesemac/. It includes comprehensive info and resources on Chinese fonts, complete tutorials on several Chinese input methods, etc.

How to find out what's the current input method?

Type 【Ctrl+h v】 (describe-variable) then “current-input-method”.
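
If you check this often, a small command can show the value in the echo area. This is a sketch only; the command name “my-show-input-method” is made up for illustration.

```elisp
(defun my-show-input-method ()
  "Show the name of the current input method in the echo area."
  (interactive)
  ;; current-input-method is nil when no input method is active.
  (message "Input method: %s" (or current-input-method "none")))
```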

Finding Info About a Character

I have this character α on the screen. How to find out its unicode's hex value or name?

You can find out a char's info by placing your cursor on the character and then typing 【Alt+x describe-char】.

Following is the output of “describe-char” on char “★” in Emacs 23:

        character: ★ (9733, #o23005, #x2605)
preferred charset: unicode (Unicode (ISO10646))
       code point: 0x2605
           syntax: _ 	which means: symbol
         category: .:Base, c:Chinese, h:Korean, j:Japanese
      buffer code: #xE2 #x98 #x85
        file code: #xE2 #x98 #x85 (encoded by coding system utf-8-unix)
          display: by this font (glyph code)
    uniscribe:-outline-BatangChe-normal-normal-normal-mono-13-*-*-*-c-*-gb2312.1980*-* (#xB18)

Character code properties: customize what to show
  name: BLACK STAR
  general-category: So (Symbol, Other)
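
When all you need is the code point and the name, a shorter command can be sketched. This assumes Emacs 23, where “get-char-code-property” is available. The command name “my-char-info” is made up for illustration.

```elisp
(defun my-char-info ()
  "Show the code point and unicode name of the character at point."
  (interactive)
  (let ((ch (char-after)))
    (if ch
        ;; In Emacs 23 a character's integer value is its code point.
        (message "U+%04X %s" ch
                 (or (get-char-code-property ch 'name) "(name unknown)"))
      (message "No character at point"))))
```

Place the cursor on a character and type 【Alt+x my-char-info】 to see a one-line summary in the echo area.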

In emacs 22, “describe-char” won't display the char's name. Also, Unicode version 6 (released in 2010-10) added about 1k more symbols, and Emacs 23.x does not have info on these new symbols (for example, 😸 GRINNING CAT FACE WITH SMILING EYES). (See: Unicode 6 Emoticons)

You can get this info by downloading a unicode data file and telling emacs where it is. Download it at: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt, then place the following code in your “.emacs”, changing the path to wherever you saved the file.

;; set unicode data file location. (used by what-cursor-position and describe-char)
(let ((x "~/web/xahlee_org/emacs/UnicodeData.txt"))
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))

Select the above code, then call “eval-region”. Then, you will have full unicode char info when calling “describe-char”.

See also: xub Unicode Browser mode for Emacs.
