Python: Get Unicode Name, Codepoint

By Xah Lee.

Get Codepoint from Char

Get a character's Unicode codepoint.

# get codepoint of Unicode char, in decimal.
# ord is a built-in function, so no import is needed.
print(ord("→") == 8594)
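Codepoints are usually written in hex, such as U+2192. Here's a small sketch of that, assuming Python 3. The function name codepoint_label is made up for this example. The built-in chr goes the other way, from codepoint to char.

```python
# codepoint_label is a hypothetical helper name, not part of any library
def codepoint_label(char):
    """Return the codepoint of a single char in U+XXXX notation."""
    return "U+%04X" % ord(char)

print(codepoint_label("→"))  # U+2192

# chr is the inverse of ord. Decimal and hex give the same char.
print(chr(8594) == "→")
print(chr(0x2192) == "→")
```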

Get Name from Char

Get a character's Unicode name.

from unicodedata import name

print(name("→") == "RIGHTWARDS ARROW")
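Some chars have no name, such as control chars. For these, name raises ValueError, unless you give a second argument to return instead. Here's a sketch, assuming Python 3. The function name safe_name and its default string are made up for this example.

```python
from unicodedata import name

# safe_name is a hypothetical helper name
def safe_name(char, default="(no name)"):
    """Return the Unicode name of char, or default if it has none."""
    return name(char, default)

print(safe_name("→"))
print(safe_name("\n"))

# without a default, a nameless char raises ValueError
try:
    name("\n")
except ValueError:
    print("newline has no name")
```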

Get Char from Name

Get the Unicode char with a given name.

from unicodedata import lookup

x1 = lookup("GREEK SMALL LETTER ALPHA")
print(x1 == "α")

x2 = lookup("RIGHTWARDS ARROW")
print(x2 == "→")

x3 = lookup("CJK UNIFIED IDEOGRAPH-5929")
print(x3 == "天")
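If there is no char with the given name, lookup raises KeyError. Here's a sketch that returns None in that case, assuming Python 3. The function name find_char is made up for this example. When the name is known ahead of time, the string escape \N{...} gives the same char without calling lookup.

```python
from unicodedata import lookup

# find_char is a hypothetical helper name
def find_char(char_name):
    """Return the char with this Unicode name, or None if there is none."""
    try:
        return lookup(char_name)
    except KeyError:
        return None

print(find_char("RIGHTWARDS ARROW") == "→")
print(find_char("NO SUCH NAME") is None)

# the \N{name} escape in a string literal
print("\N{GREEK SMALL LETTER ALPHA}" == "α")
```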

Here's the same in Python 2:

# -*- coding: utf-8 -*-
# python 2

from unicodedata import lookup

char1 = lookup("GREEK SMALL LETTER ALPHA")
print(char1.encode('utf-8'))
# α

char2 = lookup("RIGHTWARDS ARROW")
print(char2.encode('utf-8'))
# →

char3 = lookup("CJK UNIFIED IDEOGRAPH-5929")
print(char3.encode('utf-8'))
# 天

This page lets you search Unicode: Unicode Search 😄

Print a Range of Unicode Chars

Here's an example that prints a range of Unicode chars, each with its codepoint in hex and its name.

Chars without a name are skipped. (Some of these are unassigned codepoints.)

from unicodedata import name

# chars from codepoint 945 (α) up to, but not including, 969 (ω)
xx = [chr(i) for i in range(945, 969)]

for x in xx:
    # name(x, "-") returns "-" when the char has no name
    if name(x, "-") != "-":
        print(x, "|", "%04x" % (ord(x)), "|", name(x, "-"))

# output
# α | 03b1 | GREEK SMALL LETTER ALPHA
# β | 03b2 | GREEK SMALL LETTER BETA
# γ | 03b3 | GREEK SMALL LETTER GAMMA
# δ | 03b4 | GREEK SMALL LETTER DELTA
# ε | 03b5 | GREEK SMALL LETTER EPSILON
# ζ | 03b6 | GREEK SMALL LETTER ZETA
# η | 03b7 | GREEK SMALL LETTER ETA
# θ | 03b8 | GREEK SMALL LETTER THETA
# ι | 03b9 | GREEK SMALL LETTER IOTA
# κ | 03ba | GREEK SMALL LETTER KAPPA
# λ | 03bb | GREEK SMALL LETTER LAMDA
# μ | 03bc | GREEK SMALL LETTER MU
# ν | 03bd | GREEK SMALL LETTER NU
# ξ | 03be | GREEK SMALL LETTER XI
# ο | 03bf | GREEK SMALL LETTER OMICRON
# π | 03c0 | GREEK SMALL LETTER PI
# ρ | 03c1 | GREEK SMALL LETTER RHO
# ς | 03c2 | GREEK SMALL LETTER FINAL SIGMA
# σ | 03c3 | GREEK SMALL LETTER SIGMA
# τ | 03c4 | GREEK SMALL LETTER TAU
# υ | 03c5 | GREEK SMALL LETTER UPSILON
# φ | 03c6 | GREEK SMALL LETTER PHI
# χ | 03c7 | GREEK SMALL LETTER CHI
# ψ | 03c8 | GREEK SMALL LETTER PSI
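The loop above can be wrapped in a function, so that it works for any range. Here's a sketch, assuming Python 3. The function name named_chars is made up for this example. Like range, the end codepoint is not included.

```python
from unicodedata import name

# named_chars is a hypothetical helper name
def named_chars(start, end):
    """Yield (char, hex codepoint, name) for each named char in range(start, end)."""
    for i in range(start, end):
        c = chr(i)
        n = name(c, "")  # empty string when the char has no name
        if n:
            yield (c, "%04x" % i, n)

# the 4 basic arrows, U+2190 to U+2193
for row in named_chars(0x2190, 0x2194):
    print(" | ".join(row))
```

Unassigned codepoints in the range are skipped. For example, U+03A2 between capital rho and capital sigma has no char, so named_chars(0x3a1, 0x3a4) gives 2 rows, not 3.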

Python Unicode