Semantics and Symbols: Examples of Unicode Symbols Usage

By Xah Lee. Date: . Last updated: .

This page is a misc collection of short expositions on Unicode symbols semantics and usage.

Unicode Symbol for See, See Also

Decided to adopt a new Unicode symbol usage. On my site, i have hundreds links of the form: (See: link title.) . So, a article may have lots โ€œ(See: โ€ฆ)โ€, sometimes a few in a paragraph. It gets annoying. I decided to replace the โ€œ(See: โ€ฆ)โ€ with just โ€œ(โžฒ โ€ฆ)โ€, and get rid of the ending period.

Note that the Unicode char

It displays fine by default in all major browsers. This is a important point to consider when you want to adopt some Unicode char. [see Unicode: Arrows โ†’ โžต โž› โžฒ โžค]

Originally, i thought of using a eye icon. The one that i found are:

but these characters are new in Unicode 6, and they do not show in browsers except Firefox on my machine. Also, they don't really fit. The eyeglasses is OK but in normal sized font it's illegible. Then, i thought of using

โ˜ž U+261E: WHITE RIGHT POINTING INDEX

This char shows up but again not legible without larger font. This is a important point, because it ruled out many other iconographic symbols. I don't want to use HTML markup to enlarge the font, because that adds bulk and complexity and is distracting.

Then i thought of using a Egyptian eye icon.

It is quickly ruled out because none of them shows up.

Then i thought of just using a arrow, like this: (โ†’ Syntax Design: Use of Unicode Matching Brackets as Specialized Delimiters), but that left paren and the arrow combo made it look like a emoticon. This is another interesting discovery. This means, certain sequence of symbols may create out-of-context side-effects that you want to avoid. (similar to the rise of ligatures)

So, in the end, i thought some other arrow might work. Thus i ends up with โžฒ. This is not the final choice, i might change it to something else in the future.

Why do i change the more clear โ€œ(See: โ€ฆ)โ€ to a icon? It's not a critical change, nor a necessarily one. The change does not matter much to readers (in fact, probably reduce the meaning of the text a tiny bit. (because when you use English word โ€œseeโ€, the meaning is clear and the usage idiomatic. But with a symbolic icon, it became rather cryptic.)). But the advantage with the change is that it makes the text more systematic, and amenable for parsing. (kinda like a micro markup)

2013-08-15 addendum. I decided to use this

2014-08-14 addendum. I decided to use this

change'd my site's โ€œsee alsoโ€ unicode marker from [โ˜› link] to [โžค link]

the impetus of my change is that, on the โ€œMotorola Xoom tabletโ€ Buy at amazon , the pointing finger โ˜› isn't showing. Btw, that is traditional symbol for index, and also carries the semantic of See Also. but fell out of use as a standard printer's punctuation. (you can read about it on wikipedia)

since it's not showing, i changed to the arrow thing โžค instead. The advantage is that it more intuitively indicates to modern readers as a โ€œsee alsoโ€ mark. And shows in all devices, as far as i know. The shape is simpler than the pointing index finger, so shows more clearly in devices. The minor disadvantage is that it doesn't have the classic semantic of โ€œSee Alsoโ€.

Use of Unicode Subscript Digit Characters

Found a new use of Unicode subscript characters. These: {โ‚€ โ‚ โ‚‚ โ‚ƒ โ‚„ โ‚… โ‚† โ‚‡ โ‚ˆ โ‚‰}. In many of my web articles, they are divided into several pages. For example, titled: โ€œAlgorithmic Mathematical Art (page 1)โ€, and the second page would be titled โ€œAlgorithmic Mathematical Art (page 2)โ€, etc. I find those โ€œ(page 1)โ€ etc too verbose and distracting. Now i just use subscripts. Like this:

Am not completely satisfied. I think i should add a subscript โ€œpโ€ there too, like these: {โ‚šโ‚ โ‚šโ‚‚ โ‚šโ‚ƒ}, so it makes the semantics more unique. With just subscript digits, it's still too syntactically ambiguous, because subscript digits could appear in lots other places for different purposes. But for now, i'll let it be.

as of today, i've reverted. Using the subscript does not work in some tablet or mobile phone, and is too much in-your-face, have problem with search engines.

Unicode use: The Wave Dash ใ€œ

This is the nth episode of Xah's rectification of typographical convention!

[see The Writing Style of Xah Lee]

[see The Moronicities of Typography: Hyphen, Dash, Quotation Marks, Apostrophe]

Today, i decided to use the Unicode WAVE DASH ใ€œ for date range.

Traditionally, it's done by a EN DASH โ€“. However, it has ambiguity problems with hyphen and minus sign. This is especially important in scientific contexts. Quote from Wikipedia Dash:

The Guide for the Use of the International System of Units (SI) recommends that when a number range might be misconstrued as subtraction, the word โ€œtoโ€ should be used instead of an en dash. For example, โ€œa voltage of 50 V to 100 Vโ€ is preferable to using โ€œa voltage of 50โ€“100 Vโ€. It is also considered inappropriate to use the en dash in place of the words to or and in phrases that follow the forms from โ€ฆ to โ€ฆ and between โ€ฆ and โ€ฆ.[9][10]

The sources [9] [10] are:

A couple years ago ~2009, i tested in browsers about displaying the wave dash. At the time, some browser doesn't show the char. But today, all major browsers do. At the time, i decided to use the FIGURE DASH โ€’ for date range. Now, i replaced all of them on my site to the wave dash. About 480 occurrences.

For examples where date range happens a lot, see:

Incorrect Glyph in Font for Wave Dash

Note that the wave dash ใ€œ should go up first then down like a long TILDE ~. However, there was some mixup in the character history among encodings, so that some font design got the character inverted. For example, here's Arial Unicode MS ใ€œ. (you can see it only if you have font โ€œArial Unicode MSโ€ installed.)

Use of ALMOST EQUAL TO โ‰ˆ

Also, i often use the TILDE ~

in front of a year to indicate approximate date or quantity, for example, {โ€œI use Dvorak layout since ~1993โ€, โ€œPlace a ~5 ใŽ thick book in front of the keyboardโ€}. I've been trying to find a proper symbol for that. The closest is the

But am not sure about that because that symbol should between 2 quantities, as a relation, in some strict sense. Anyhow, today i decided to adopt this symbol in front of a date/number to indicate approximate/in-exact date/quantity. Not completely satisfied, but it's still better than the promiscuous tilde. For files with several use of โ‰ˆ, see:

Unicode Approx Equal โ‰ˆ vs Tilde ~

xah lee site unicode approx equal replacement 2017 04 13
Replace all Unicode approx equal char back to tilde of my articles. e.g. โ‰ˆ1986 โ†’ ~1986. About 1900 replacements in 766 files.
xah lee site unicode approx equal replacement 2017 04 13 95442
emacs xah-find.el in action. [see Emacs: xah-find.el, Find Replace in Pure Elisp]

Unicode Characters for Centimeter

Discovered special Unicode chars for Centimeter.

Note: cm is used a lot in Taiwan, but not in USA. In USA, if metric length is used at all, it's more common to see millimeter (mm).

Unicode, Encoding, Escape Sequence, Issues