Re: Getting Emacs to play nice with Hunspell and apostrophes

From: Yuri Khan
Subject: Re: Getting Emacs to play nice with Hunspell and apostrophes
Date: Thu, 12 Jun 2014 12:43:24 +0700

On Wed, Jun 11, 2014 at 10:20 PM, Emanuel Berg <address@hidden> wrote:

> You still haven't said one word why anyone would
> benefit from using those chars instead of the standard
> " and ' (and ...) that works everywhere and that
> everyone is familiar with (having trained their eyes
> for them year-in, year-out).

The fact that everybody uses " and ' and ` is a historical artifact, a
workaround of sorts, due to the limitations of the mechanical
typewriter. We need not be affected by it any more.

There was no possibility of including all the required typographical
characters or accented letters into the printing ball, so both quotes
(“ and ”) and the diaeresis got conflated into a straight quote ",
both single quotes (‘ and ’) into a straight single quote/apostrophe
', and the backtick ` and tilde ~ were there to facilitate typing
accented letters.

This limitation then crept into computers, because this way the
character set could be encoded in 7 bits. The computer keyboard was
just modeled after the typewriter keyboard, with a few extensions.

Then the inevitable struck: computers expanded from the US and UK into
Germany, Sweden, Finland, France, Canada, and then countries with
non-Latin scripts (Greek, Cyrillic, and CJK). And all of them wanted
to have dedicated code points for their characters, e.g. type a single
ä instead of [a, backspace-no-delete, "].

For a good while, we lived in a nightmare of ten thousand code pages.
In Russia, you could receive an email and see a jumble of utterly
meaningless words because the message could be re-encoded (or the
Content-Type charset= stripped or re-labeled) on any of the
intermediate servers; there existed programs which were able to
heuristically detect the chain of re-encodings applied on the way and
decode your message for you. You could order a book in an Internet
shop, have them completely b0rk up the encoding of the shipping
address, and then somebody at the postal system might decode the
characters and the package would still be delivered at the intended
address.
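
To make the re-encoding problem concrete, here is a minimal sketch in Python. It assumes a single wrong step (text written as KOI8-R but read as Windows-1251) and uses a deliberately crude score: the count of frequent lowercase Russian letters. The names `garble`, `score`, and `guess_original` are hypothetical, and this is not how the actual decoder programs worked; those had to handle longer chains and more code pages.

```python
# Two Cyrillic code pages that were commonly confused (assumption: only
# these two are considered, and only one wrong step was applied).
CODECS = ("koi8_r", "cp1251")

# A few frequent lowercase Russian letters; a crude stand-in for real
# letter-frequency statistics.
COMMON = set("оеаинтсрвл")


def garble(text, written_as, read_as):
    """Simulate a server that mislabels the charset of a message."""
    return text.encode(written_as).decode(read_as)


def score(s):
    """Count characters that are frequent in ordinary Russian text."""
    return sum(ch in COMMON for ch in s)


def guess_original(garbled):
    """Try undoing one wrong step for every codec pair; keep the best-scoring text."""
    candidates = [garbled]
    for read_as in CODECS:
        for written_as in CODECS:
            if read_as == written_as:
                continue
            try:
                candidates.append(garbled.encode(read_as).decode(written_as))
            except UnicodeError:
                # This pair cannot have produced the garbled text.
                pass
    return max(candidates, key=score)


original = "привет, мир"
garbled = garble(original, written_as="koi8_r", read_as="cp1251")
restored = guess_original(garbled)
```

Because KOI8-R and Windows-1251 store upper and lower case in swapped byte ranges, lowercase text garbled this way comes out as all capitals, which is why even this simple score can tell the candidates apart.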

Now that every widely used operating system supports Unicode, we don’t
have an excuse for clinging to those workarounds of the past century.
We are not limited by the 7-bit ASCII encoding and can store texts in
their true form. We also are not constrained by the typewriter
keyboard, having input methods based on Compose or Level3 allowing us
to conveniently enter all the necessary diverse characters. On
X11/GNU/Linux in particular, these input methods come bundled with the
system; on Windows, one has to install a third-party package.

Much of the software has already evolved to support Unicode. That
which hasn’t, has to catch up. From a spell checker, in particular, I
expect that it should (perhaps with an optional switch) be able to
flag as an error any spelling of “isn’t” where the character between n
and t is not the preferred apostrophe character U+2019.
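
As an illustration of the kind of check meant here, a minimal sketch in Python follows. It is not how Hunspell or Emacs work internally; `flag_apostrophes` is a hypothetical helper, and the list of look-alike characters is an assumption that is certainly not exhaustive.

```python
import re

PREFERRED = "\u2019"  # RIGHT SINGLE QUOTATION MARK, the preferred apostrophe

# Characters often typed where an apostrophe is meant (illustrative list):
# straight apostrophe, backtick, acute accent, left single quote,
# modifier letter apostrophe.
LOOKALIKES = "'`\u00b4\u2018\u02bc"

# A look-alike only counts when it sits between two word characters,
# as in "isn't"; quotes around a word are left alone.
_PATTERN = re.compile(r"(?<=\w)[" + LOOKALIKES + r"](?=\w)")


def flag_apostrophes(text):
    """Return (offset, character) for each non-preferred in-word apostrophe."""
    return [(m.start(), m.group()) for m in _PATTERN.finditer(text)]


def fix_apostrophes(text):
    """Replace every flagged character with the preferred apostrophe."""
    return _PATTERN.sub(PREFERRED, text)
```

For example, `flag_apostrophes("isn't")` reports the straight apostrophe at offset 3, while the same word spelled with U+2019 produces no flags.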
