
Character Sets and Encoding in HTML


Xah Lee, 2005-12, 2011-01, 2011-03-27

In HTML, you can declare the character set for the file, like this:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

If you don't understand what Character Set and Encoding are, see: UNICODE Basics: What's Character Encoding, UTF-8, and All That?
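To see why the declaration matters, here is a minimal Python sketch. It assumes a file saved as UTF-8 and a reader that guesses Windows-1252 instead; the sample text and variable names are made up for illustration. The same bytes decode to different characters depending on which character set the reader assumes.

```python
# Hypothetical sample text; any non-ASCII characters would do.
text = "• café"

# The bytes a file saved as UTF-8 would contain for this text.
raw = text.encode("utf-8")

# Decoding with the declared character set recovers the text.
as_utf8 = raw.decode("utf-8")

# Decoding with a different character set (here Windows-1252) garbles it.
as_cp1252 = raw.decode("cp1252")

print(raw)        # b'\xe2\x80\xa2 caf\xc3\xa9'
print(as_utf8)    # • café
print(as_cp1252)  # â€¢ cafÃ©
```

The meta tag tells the browser which decoding to use, so it does not have to guess.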

Once you have declared your character set, you can use characters from that character set in your HTML file. There is a character set standard called Unicode, which contains the characters of nearly all the world's languages, including the thousands of Chinese characters. Here is a sample of characters from Unicode:

€£¥ ©®™¶ † ‡“”—‘’ éåøèü θπαβγλ →←↑↓↔↗ ■□•‣♥★☆ ±≤≥≠≈ ∞∆° ℂℝℚℙℤ ∀∃ ∫∑∏≔⊂⊃⊆⊇∈ ⊕⊗ 한국어 ひらがな カタカナ العربية русский 李杀网

For more examples, see: Unicode Characters Example.

Using Character Entity

Another way to show special characters in your file is with a so-called “character entity” (the HTML specification calls the numeric forms “character references”). For example, the bullet symbol • is Unicode character number 8226. In HTML, you can write it as &#8226;. Here's what your browser shows: •

The number 8226 in hexadecimal is 2022. Sometimes you only know the hexadecimal form. You can write the character using hexadecimal like this: &#x2022;. Here's what your browser shows: •
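The decimal and hexadecimal forms can be checked with a short Python sketch. It uses the standard library's html.unescape as a stand-in for what a browser does when it decodes these references; the variable names are only for illustration.

```python
import html

code_point = 8226                    # decimal number of the bullet character
hex_form = format(code_point, "x")   # the same number in hexadecimal: "2022"

decimal_ref = f"&#{code_point};"     # "&#8226;"
hex_ref = f"&#x{hex_form};"          # "&#x2022;"

# Both references decode to the same single character.
decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)

print(hex_form)                      # 2022
print(decoded_decimal == decoded_hex == chr(code_point))  # True
```

Going the other way, int("2022", 16) gives 8226, which is handy when a reference table lists only the hexadecimal value.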

For some commonly used characters, HTML provides named entities. For example, the bullet character can be written as &bull;. Here's what your browser shows: •
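As a rough check, Python's standard html.entities module carries a table of the HTML 4 named entities, so the link between a name and its number can be looked up in both directions. This is a sketch for exploring the mapping, not part of HTML itself.

```python
import html
from html.entities import codepoint2name, name2codepoint

# Look up the Unicode number behind the name "bull".
bull_code_point = name2codepoint["bull"]

# Look up the entity name for Unicode number 8226.
name_for_8226 = codepoint2name[8226]

# Decode the named entity the way a browser would.
named = html.unescape("&bull;")

print(bull_code_point)               # 8226
print(name_for_8226)                 # bull
print(named == chr(8226))            # True
```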

For a complete list of named entities, see: HTML/XML Entities (Character/Unicode/Symbol) List.

