What is Codepoint (Character ID)
What is Codepoint
Each Unicode character is given a unique ID. This ID is a non-negative integer, starting at 0, called the character's codepoint. (It is not called a “character ID” because some “characters” are not really characters, such as space, line return, and tab.)
A codepoint is written in either decimal or hexadecimal.
Example:
char | name | codepoint | codepoint in Hexadecimal | UTF-8 Encoding | UTF-16 Encoding |
---|---|---|---|---|---|
a | LATIN SMALL LETTER A | 97 | 61 | 61 | 00 61 |
α | GREEK SMALL LETTER ALPHA | 945 | 3b1 | CE B1 | 03 B1 |
😂 | FACE WITH TEARS OF JOY | 128514 | 1f602 | F0 9F 98 82 | D8 3D DE 02 |
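The rows above can be reproduced programmatically. The following is a minimal Python sketch, not a definitive tool: the helper name `describe` is made up for illustration, and the UTF-16 column is assumed to be big-endian without a byte order mark, which matches the byte order shown in the table.

```python
def describe(ch: str) -> dict:
    """Return the codepoint and encodings of a single character."""
    cp = ord(ch)  # the codepoint as a decimal integer
    return {
        "codepoint": cp,
        "hex": format(cp, "x"),
        # bytes.hex(sep) needs Python 3.8 or later
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        # assumption: big-endian UTF-16, no byte order mark
        "utf16": ch.encode("utf-16-be").hex(" ").upper(),
    }

# The three characters from the table, written as escapes
for ch in ("a", "\u03b1", "\U0001F602"):
    print(describe(ch))
```

Note that a character above U+FFFF, such as the last one, takes two 16-bit units (a surrogate pair) in UTF-16, which is why its UTF-16 column has four bytes.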
Standard Notation for Codepoint
The standard notation for a codepoint is “U+” followed by the codepoint in hexadecimal, padded with leading zeros to at least four digits. For example:
U+03B1
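As a rough illustration, the notation can be produced in Python with a format spec. The function name `u_notation` is hypothetical; the `04X` spec pads to at least four uppercase hex digits, following the usual Unicode convention, and leaves longer codepoints unpadded.

```python
def u_notation(ch: str) -> str:
    """Format a single character's codepoint as U+XXXX (four or more hex digits)."""
    return f"U+{ord(ch):04X}"

print(u_notation("a"))           # U+0061
print(u_notation("\u03b1"))      # U+03B1
print(u_notation("\U0001F602"))  # U+1F602
```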
How to Find a Character's Codepoint
- Paste the character into Unicode Search
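If a search page is not at hand, the same lookup can be sketched in Python using the built-in `ord` and the standard `unicodedata` module. The helper name `lookup` and the `"<no name>"` placeholder are assumptions for illustration. The names returned depend on the Unicode version bundled with the Python build.

```python
import unicodedata

def lookup(ch: str):
    """Return (codepoint, official name) for a single character."""
    # Control characters such as tab have no name in the database,
    # so a placeholder is passed as the default.
    return ord(ch), unicodedata.name(ch, "<no name>")

print(lookup("a"))       # (97, 'LATIN SMALL LETTER A')
print(lookup("\u03b1"))  # (945, 'GREEK SMALL LETTER ALPHA')
print(lookup("\t"))      # (9, '<no name>')
```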
How to Find a Character, Given Its Codepoint
- Paste the character's codepoint into Unicode Search
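The reverse direction can also be done in code. Below is a small Python sketch built on the built-in `chr`. The function `char_from_notation` is a made-up helper that assumes its input is a hexadecimal codepoint, with or without a leading “U+”.

```python
def char_from_notation(text: str) -> str:
    """Turn a string such as 'U+03B1' or '3b1' into the character it names."""
    s = text.strip()
    if s[:2].upper() == "U+":
        s = s[2:]           # drop the "U+" prefix if present
    return chr(int(s, 16))  # parse as hexadecimal, then convert

# ascii() shows the result as an escape, so it prints safely on any terminal
print(ascii(char_from_notation("U+03B1")))  # '\u03b1'
print(char_from_notation("61"))             # a
print(chr(97))                              # a (decimal codepoint)
```

`chr` raises `ValueError` for numbers outside the range 0 to 0x10FFFF, so invalid input fails loudly rather than returning a wrong character.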