Xah Lee, 2007-10, 2010-08-13
This article discusses some issues in typography, especially those related to the dash and quotation marks.
I've had some interest in typography since the early 1990s, the Mac's desktop publishing era. Basically, i avidly read books about fontography in libraries and Mac magazines such as Mac User or Mac World, played with fonts and math typesetting in software such as Microsoft Word and Mathematica, read Knuth's book on typography and used his TeX system, and read about font technology such as TrueType. So, i am generally acquainted with the concepts and issues of typography, though i never worked in any professional area related to it.
I'll have to say, the entire typographical effort and establishment is largely a waste of time, in the same sense that some “artistic” circles chalk up photography as high art, or that grammarians and pedants have voluminous and vociferous writing style guides and guilds.
Some of the most fartful things the typography-sensitive crowd discuss or distinguish are: hyphen, en-dash, em-dash, ligature, kerning, font “design”.
In general, the function of typography is mainly about issues in printing with respect to the facilitation of reading. So, the major issues involved are: line length, line spacing, serif and sans serif fonts, margin, font sizes, and these pretty much are about it. But since how things are rendered on paper does create differences in the sense of esthetics, sometimes rather pronounced difference, thus typography does indeed have some esthetical elements. However, this is blown out of proportion to stupendous profundity.
Look at these guilded morons go:
Traditionally an em dash—like so—or spaced em dash — like so — has been used for a dash in running text. The Elements of Typographic Style recommends the more concise spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography". The spaced en dash is also the house style for certain major publishers (Penguin, Cambridge University Press, and Routledge among them). However, some longstanding typographical guides such as The Chicago Manual of Style still recommend unspaced em dashes for this purpose. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that …
The above is from Wikipedia Dash.
Here's my own rule regarding the use of dash: There are 2 kinds: the short dash and the long dash. For the short one, press the “-” key on your keyboard. For the long one — as a punctuation mark for embedded thought — press it twice. That's it. Simple and functional. And, always include a space on each side of the long one. (Personally, in my writings published on my site, i replace the double dash by an em-dash “—” only because it is prettier, but i don't consider it important.)
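The double-hyphen rule is simple enough to mechanize. The following is a minimal Python sketch, assuming the spaced form described above; the function name `prettify_dashes` is made up for illustration and is not from any library.

```python
import re

EM_DASH = "\u2014"  # the long dash

def prettify_dashes(text):
    """Replace a spaced double hyphen ( -- ) with a spaced em-dash."""
    # Only a double hyphen with whitespace on both sides is treated as the
    # long dash, so command-line flags such as --help are left alone.
    return re.sub(r"(?<=\s)--(?=\s)", EM_DASH, text)

print(prettify_dashes("For the long one -- as punctuation -- press it twice."))
```

The single “-” is never touched, which matches the rule: one key, pressed once or twice.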
The character “-” you type on your keyboard is ASCII 45. The character is named “hyphen” in the ASCII standard, but is “hyphen-minus” in Unicode. (This is because Unicode now has proper symbols for hyphen, figure-dash, en-dash, em-dash, (math) minus, and quite a few others.)
As to the typographer's senses and sensibilities about how figure-dash should be used for numbers and en-dash is used for ranges and em-dash is for punctuation and hyphen is for word-breaking … etc, i regard them pretty much all as trifles produced by morons whose brains are inadequate to sense or tackle the depth of logic and mathematics of languages and structures but fell into a niche of diddling and went on to procure their efforts to heighten themselves among human animals.
For hyphen, as in “breaking a word for words near the margin”, my general advice is to abolish such practice. But what to do in a narrow column of text? My general advice is to abolish the practice of layout using very narrow columns. A related concept here is typographical Justification. My general advice here is to abolish the practice of justification entirely. (Leave it jagged at one end; it is actually esthetically superior, and factually functionally superior with regard to reading-facilitation.)
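The code points and official names can be checked with Python's standard `unicodedata` module. This is a small sketch; the particular selection of dash characters in `DASHES` is my own pick from the ones mentioned above.

```python
import unicodedata

# The keyboard character first, then the dedicated Unicode dashes.
DASHES = ["\u002D", "\u2010", "\u2012", "\u2013", "\u2014", "\u2212"]

def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in DASHES:
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+2010 HYPHEN
# U+2012 FIGURE DASH
# U+2013 EN DASH
# U+2014 EM DASH
# U+2212 MINUS SIGN
```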
The typographic conventions of ligatures (as in adjoining certain letter combinations such as “fi” into a single glyph “ﬁ” (U+FB01)) should also be abolished.
Related here is the quotation mark. If you read Wikipedia Quotation mark, non-English usage, you'll see that there are huge variations. Here's some sample characters used for quotations and their Unicode names.
Here's a list of conventions of using the double curly quotes:
Ain't it bizarre?
For some languages, such as Chinese, it is rational how it developed into using symbols that are different from European languages' curly quotation marks (e.g. 『』「」《》〈〉【】〖〗〔〕). However, among european langs, there is extreme diversity in using the curly quotation marks. Even the Americans and the English reverse the purpose of the single and double quotes. Some langs reverse the semantics of the left/right pair, some position the mark at the bottom instead of the top, some place them in opposite corners (as opposed to both on top), and some use the same symbol for both the opening and closing marker.
One thing interesting about the curly double quotation mark pair is that the two symbols are not bilaterally symmetric but rotationally symmetric. That is, if you rotate the left one 180 degrees, you get the right one. Most other matching pairs of chars are bilaterally symmetric ()[]{}«»‹›〖〗《》〈〉 (i.e. each is the mirror image of its partner, reflected in a vertical line between them). The fact that the curly quotes have only rotational symmetry must have contributed significantly to the weird diversity in their role as the choice of opening/closing mark and in whether to position them level on a line or at opposite corners. (Note that the Chinese brackets 「」『』 also lack bilateral symmetry; however, their box-corner shape intuitively and uniquely defines their placement.)
This glyph “ (unicode 8220) points upper-right. This glyph can be mirrored in a vertical line or horizontal line to create the matching variation, a total of 4 possibilities (think of p q b d).
Here are the different pointing curly quotes from Unicode: “ ” ‟.
In Unicode, i couldn't find one that is pointing to upper-left. This is somewhat curious. (If you look at the Wikipedia article on quotation conventions, you see that actually none uses such a char.) I created one as an image just for illustration: [image: a double curly quote pointing upper-left].
The quotation mark can be placed on the upper baseline of the text (as in English convention) or lower baseline (as in the beginning quotation mark in German convention), a total of 2 possibilities.
So, 4 choices of glyph orientation, 2 possible positions, that's 8 possibilities for the opening quote. Same for the closing quote. So, the total number of styles to use the quotation punctuation with double curly quote is 8×8=64.
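The count can be verified by brute-force enumeration. A minimal sketch follows; the labels in `ORIENTATIONS` and `POSITIONS` are just illustrative names for the four mirror variants and the two baselines.

```python
from itertools import product

# Illustrative labels: 4 mirror variants of the glyph, 2 vertical placements.
ORIENTATIONS = ["upper-right", "upper-left", "lower-right", "lower-left"]
POSITIONS = ["top", "bottom"]

def quote_styles():
    """Enumerate every (opening mark, closing mark) combination."""
    marks = list(product(ORIENTATIONS, POSITIONS))  # 4 x 2 = 8 possible marks
    return list(product(marks, marks))              # 8 x 8 = 64 styles

print(len(quote_styles()))  # 64
```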
It is a good thing that this hasn't been exploited.
The function of quotation marks is to demarcate text, and as such delimiters, they should be a matching pair such as () [] {}, and the pair should have no more than a bilateral symmetry, to reflect the natural one-dimensional flow (left to/from right) of written text (or, up/down in Asian langs).
If we could rewrite convention or restart history, i'd say we all just use simple left/right pairs such as ()[]{}. Since these already have a purpose, we could use ‹›«»〈〉《》【】〖〗. The French quotation marks «» ‹› are actually the most sensible here among western langs. (Though, other countries using French quotation marks also reverse the direction or use the same glyph for both opening and ending. This is idiocy gone berserk.)
But since we cannot restart history, nor do we want to break convention radically because we'd create confusion, what i do today personally in writings published on my website is to use the most ubiquitous convention, the American convention “like this”. (I experimented with the French convention of «double angle quotes», but that turned out to be too in-your-face for English readers.)
It is unfortunate, thru the historical development of the typewriter and the computer keyboard and ASCII, that our keyboard doesn't have the proper matching curly quotes, but instead has the straight quotes. Here are the symbols and their Unicode names: " (QUOTATION MARK) and ' (APOSTROPHE).
This creates a problem because it forces us to use the same symbol for a purpose that naturally calls for a matching pair. Using a single symbol is harder to read. Further, it causes global damage when one is missing (e.g. caused by typo or transmission error).
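The “global damage” point can be illustrated with a toy parser. The sketch below is an assumption-laden illustration, not how any particular program parses quotes: `spans_straight` pairs straight quotes purely by position, while `spans_curly` requires a real opening mark followed by a real closing mark.

```python
import re

def spans_straight(text):
    """Pair up straight quotes by position: odd-numbered pieces count as quoted."""
    return text.split('"')[1::2]

def spans_curly(text):
    """Only an opening mark followed by a closing mark delimits a quotation."""
    return re.findall("\u201C([^\u201C\u201D]*)\u201D", text)

intact_straight = 'He said "yes" and she said "no" then left.'
damaged_straight = intact_straight.replace('"', "", 1)  # first quote lost

intact_curly = "He said \u201Cyes\u201D and she said \u201Cno\u201D then left."
damaged_curly = intact_curly.replace("\u201C", "", 1)   # first opening quote lost

print(spans_straight(intact_straight))   # ['yes', 'no']
print(spans_straight(damaged_straight))  # [' and she said ', ' then left.']
print(spans_curly(intact_curly))         # ['yes', 'no']
print(spans_curly(damaged_curly))        # ['no']
```

With the single symbol, one lost character flips the inside/outside reading of everything after it. With a matching pair, the damage stays local: only the broken quotation is lost.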
It would've been better, if the typewriter was designed with a matching single curly quote, like this: ‘’. This way, we get the matching property at syntax level, and we can also emulate the double curly quote by repeating the single one.
To get around the syntactical ambiguity problem of using the same char to demarcate opening and closing text in a computing context, much tech writing in software follows a convention of using the backtick (`) for the opening and the straight quote (') for the closing mark. (`like this' and ``double'') I think this convention was started or popularized by the TeX typesetting system, because that's the markup used there to typeset curly quotes.
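Because the backtick convention is unambiguous, turning it into real curly quotes is a plain substitution. A minimal sketch, assuming plain text with no code or math in it; the function name `tex_quotes_to_curly` is hypothetical.

```python
def tex_quotes_to_curly(text):
    """Convert `single' and ``double'' ASCII quoting to curly quotes."""
    # Doubled marks must be replaced before single ones,
    # otherwise `` would turn into two single opening quotes.
    return (text.replace("``", "\u201C")   # left double quotation mark
                .replace("''", "\u201D")   # right double quotation mark
                .replace("`", "\u2018")    # left single quotation mark
                .replace("'", "\u2019"))   # right single quotation mark

print(tex_quotes_to_curly("`like this' and ``double''"))
```

Note that any remaining straight apostrophe, as in «don't», also becomes the right single curly quote under this substitution.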
In particular, `this style' is adopted by the Free Software Foundation in their GNU Project.
Although this workaround solves a syntactical ambiguity problem, i think it is rather unnecessary and ugly. For a workaround with the constraint of ASCII for a matching quote, i would have adopted something more symmetric such as ('this') maybe or {'this'} or -'this'-. But the problem with GNU is that even today, in 2007, when curly quotes have been widely available in word processors for over a decade (and Unicode has been practical and widely available for at least 5 years), they are still using plain ASCII hacks. (In general, GNU and the Open Source morons have like a 5 to 10 year lag in adopting technology, for reasons that are inadvertently intentional and/or simple incapability.)
There is a very stupid convention used in novel printing. In novels, often a long paragraph is entirely the dialog of one person. So, logically, the whole paragraph would be enclosed in matching quotes, and if there is a series of such paragraphs, each and every one should be enclosed in matching quotes. However, this is not done because it is considered repetitious. The typographic convention is to omit the ending quote when the quoted speech continues into the next paragraph. So, we'd have a series of paragraphs that each start with an opening curly quote that is never closed.
This is another moronicity of the typographers. Such irregular tampering starts to show its problems in the computing era. Generally speaking, it makes it difficult to process the text and creates ambiguities, both for human and for machine.
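A simple way to see the trouble for machines: a per-paragraph balance check, which would be a perfectly reasonable sanity test for delimiters, flags this convention as an error. The sketch below is illustrative only; the sample paragraphs and the function name `unbalanced_paragraphs` are made up.

```python
OPEN, CLOSE = "\u201C", "\u201D"  # curly double quotes

def unbalanced_paragraphs(paragraphs):
    """Return indexes of paragraphs whose opening and closing quote counts differ."""
    return [i for i, p in enumerate(paragraphs)
            if p.count(OPEN) != p.count(CLOSE)]

speech = [
    "\u201CFirst paragraph of a long speech.",
    "\u201CSecond paragraph, same speaker.",
    "\u201CLast paragraph.\u201D He sat down.",
]
print(unbalanced_paragraphs(speech))  # [0, 1]
```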
Another moronity in our subject, is about the choice of glyph for apostrophe as a punctuation in English writing. For example «I'd», «he's», «James'». This is a rather big subject to tackle, dragging in the bag of grammarians and stylists and their guilds and guides and rules and exceptions, but i'll just focus on the typographical aspect of whether to use the straight quote or the curly one «I’d», «he’s», «James’».
Normally, people use the straight version because the curly one isn't available on the keyboard. However, word processors and publications use the curly version. In my opinion, we should not use the curly version for the apostrophe, because the single curly quote already has a logical and conventional semantics: it is used as a matching pair for quotes. (In the US, the single version is used for the first nested quote; in the UK, the double version is used for the first nested quote.) Using the same character for both apostrophe and closing quote confounds the meaning and increases the cost of computation on texts. (e.g.: «“i said: ‘he’s’.”») Also, the semantics of apostrophe as a punctuation symbol in no way calls for a slanted glyph.
The reason curly became the convention in print is that we actually wanted a slanted apostrophe. However, the slanted version of the apostrophe, the Unicode char named “Prime”, is not conveniently available, while most word processors today have the curly quote. We wanted a slanted one because that's how we write it by hand. We write it by hand slanted because that's easier, because most people are right-handed, and because a vertically straight one is too easily confused with I or 1. This is why, in print or on-screen, the curly one became the convention for apostrophe.
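The example can be made concrete. The following sketch uses a deliberately naive matcher (an illustration, not a recommended parser) that takes the first closing single quote after an opening one, which is all a program can do without understanding English.

```python
import re

def naive_single_quoted(text):
    """Take everything between a left single quote and the next right single quote."""
    return re.findall("\u2018(.*?)\u2019", text)

# Curly apostrophe: the same code point as the closing quote.
curly = "\u201Ci said: \u2018he\u2019s\u2019.\u201D"
# Straight apostrophe: only real quote marks are curly.
straight = "\u201Ci said: \u2018he's\u2019.\u201D"

print(naive_single_quoted(curly))     # ['he']
print(naive_single_quoted(straight))  # ["he's"]
```

With the curly apostrophe the quotation is cut short at «he». With the straight one the pair matches as intended.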
The gist of this is that if we want to demarcate a text, the symbol used should be a matching pair, and if the semantics does not require a matching pair, we should not be using matching pair. Further, preferably, each symbol should not be used for multiple purposes.