What is Codepoint (Character ID)

By Xah Lee.

What is Codepoint

Each Unicode character is given a unique ID. This ID is a number (integer), starting at 0, called the character's codepoint. (It's not called “character ID”, because some “characters” are not really characters, such as space, line return, tab, etc.)

A codepoint is written in either decimal or hexadecimal.
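A minimal Python sketch of the same idea: the codepoint is one integer, and decimal and hexadecimal are just two ways of writing it. The variable names here are illustrative only.

```python
# ord() gives the codepoint of a character as an integer.
cp = ord("α")
print(cp)                      # 945 (decimal)

# The same number written in hexadecimal.
hex_text = format(cp, "x")
print(hex_text)                # 3b1

# Reading the hexadecimal text back gives the same integer.
print(int(hex_text, 16))       # 945

# chr() goes the other way: codepoint -> character.
print(chr(945) == chr(0x3B1))  # True
```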

Example:

Unicode Codepoint Example

| char | name | codepoint | codepoint in hexadecimal | UTF-8 encoding | UTF-16 encoding |
|---|---|---|---|---|---|
| a | LATIN SMALL LETTER A | 97 | 61 | 61 | 00 61 |
| α | GREEK SMALL LETTER ALPHA | 945 | 3b1 | CE B1 | 03 B1 |
| 😂 | FACE WITH TEARS OF JOY | 128514 | 1f602 | F0 9F 98 82 | D8 3D DE 02 |
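The values in the table above can be reproduced with Python's standard library. This is a sketch, not the author's method. The helper names `hex_bytes` and `describe` are made up for illustration. UTF-16 is shown in big-endian byte order without a byte order mark, so each code unit takes two bytes and "a" comes out as 00 61.

```python
import unicodedata

def hex_bytes(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

def describe(char):
    """Return (name, codepoint, hex codepoint, UTF-8 bytes, UTF-16 bytes)."""
    cp = ord(char)
    return (
        unicodedata.name(char),
        cp,
        format(cp, "x"),
        hex_bytes(char.encode("utf-8")),
        hex_bytes(char.encode("utf-16-be")),  # big-endian, no BOM
    )

for ch in ["a", "α", "😂"]:
    print(describe(ch))
```

The last character needs two 16-bit units in UTF-16 (a surrogate pair), which is why its UTF-16 column has four bytes.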

Standard Notation for Codepoint

The standard notation for a codepoint is “U+” followed by the codepoint in hexadecimal, padded with leading zeros to at least four digits. For example, GREEK SMALL LETTER ALPHA is:

U+03B1
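A short Python sketch of converting between a character and its “U+” notation. The function names `to_u_plus` and `from_u_plus` are hypothetical. Padding to at least four hexadecimal digits follows the usual Unicode convention. The parser also accepts shorter forms such as U+3B1.

```python
def to_u_plus(char):
    """Character -> 'U+XXXX' notation (uppercase hex, at least 4 digits)."""
    return f"U+{ord(char):04X}"

def from_u_plus(notation):
    """'U+XXXX' notation -> character. Leading zeros are optional."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"not a U+ notation: {notation!r}")
    return chr(int(notation[2:], 16))

print(to_u_plus("a"))          # U+0061
print(to_u_plus("α"))          # U+03B1
print(to_u_plus("😂"))         # U+1F602
print(from_u_plus("U+3B1") == from_u_plus("U+03B1"))  # True
```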

How to Find a Character's Codepoint

How to Find a Character, Given Its Codepoint

Unicode and Encoding Explained