bug-gnu-emacs

bug#20499: [PROPOSED PATCH] C-x 8 shorthands for curved quotes, Euro, et


From: Paul Eggert
Subject: bug#20499: [PROPOSED PATCH] C-x 8 shorthands for curved quotes, Euro, etc.
Date: Thu, 07 May 2015 00:53:32 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

        I believe that both C-x 8 . and C-x 8 u are too convenient to be
        dropped without more discussion.  For one thing, · seems more
        “common” a character than İ.

In Turkish and Azerbaijani the reverse is true. And since RMS requested dotted I and dotless i my assumption was that Turkish is of some importance. Dotted sequences are the natural ways to type these characters as well as other dotted letters ĊċĖėĠġĿŀŻż in the proposal (used variously in Lithuanian, Maltese, and Polish), so there is a pretty strong case to usurp "C-x 8 .".

The case for usurping "C-x 8 u" is even stronger: its current binding is equivalent to the equally short "C-x 8 m", some easily typed symbol is needed to denote breve, and "u" looks more like a breve than any other ASCII character does.

      Other than that, C-x 8 . . feels
        easier to type than C-x 8 SPC.

Good point, and I've done this in the attached patch.

 > -;;; iso-transl.el --- keyboard input definitions for ISO 8859-1  -*- coding: utf-8 -*-
 > +;;; iso-transl.el --- keyboard input for ISO characters -*- coding: utf-8 -*-

        I guess we may safely state “ISO 10646” here.

Thanks, done in the attached patch.

 > +;; This package supports all characters defined by ISO 8859-1,
 > +;; along with many other Latin characters and a few other characters
 > +;; commonly used in English and basic math.

        … And may also mention it here.

Thanks, also done.

 >      ("-"    . [?­])
 > -    ("*."   . [?·])

        The removal above doesn’t seem to be strictly necessary.  The
        same for the *= and *u ones.

Thanks, fixed in the attached patch.

        … Also, did you consider generating this list automatically,
        based on the codepoint properties already known to Emacs?
        Something along the lines of the function MIMEd, which readily
        produces a list of entries for the following 133 characters.
        (Three spaces added for symmetry purposes.)

    À Á Â Ã Ä È É Ê Ë Ì Í Î Ï Ñ Ò Ó Ô Õ Ö Ù Ú Û Ü Ý
    à á â ã ä è é ê ë ì í î ï ñ ò ó ô õ ö ù ú û ü ý
    ÿ   Ā ā Ć ć Ĉ ĉ Č č Ď ď Ē ē Ě ě Ĝ ĝ Ĥ ĥ Ĩ ĩ Ī ī Ĵ ĵ Ĺ ĺ
    Ľ ľ Ń ń Ň ň Ō ō Ŕ ŕ Ř ř Ś ś Ŝ ŝ Š š Ť ť Ũ ũ Ū ū Ŵ ŵ Ŷ ŷ
    Ÿ   Ź ź Ž ž Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǧ ǧ Ǩ ǩ   ǰ Ǵ ǵ Ǹ ǹ Ș ș Ț ț
    Ȟ ȟ Ȳ ȳ

Sorry, I don't really follow the code that you attached. Although I suppose it comes from a decomposition table, I don't know what the table was designed for, and it's not clear to me how it's relevant. Anyway, most of those letters are either in iso-transl.el now, or are in the previously proposed patch. Here are the exceptional (i.e., missing even in the previously proposed patch) letters, along with some comments about these exceptions:

> Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǹ ǹ

These are for toned Pinyin but this list is incomplete. If we wanted to cover toned Pinyin, we'd also need Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ. Coming up with two-character abbreviations for all these might be tricky. Most Pinyin usage omits the tones.

> Ǧ ǧ Ǩ ǩ

These are Skolt Sami but this list is also incomplete; we'd also need Ʒ Ǥ ǥ Ǯ ǯ ʒ at least.

> ǰ

What language uses this?  I couldn't find one.

> Ǵ ǵ

Good catch. These are used for transliteration from Serbian and Macedonian. We should also include Ḱ ḱ as they are also needed. Included in the attached patch.

> Ȟ ȟ

Used in Finnish Kalo, which is quite obscure.

> Ȳ ȳ

Used in Livonian, but for that we'd also need a whole bunch of other letters, including Ǟ ǟ Ḑ ḑ Ȫ ȫ Ȭ ȭ Ȯ ȯ Ȱ and I've probably omitted some. Plus, modern Livonian doesn't seem to be using Ȳ ȳ any more....

Anyway, part of what's going on here is that the proposed list doesn't cover every Latin character in the ISO 10646 repertoire (that'd be a large set), but instead is limited to what appear to be reasonably commonly used letters. Admittedly this is not universal, but one must cut things off somewhere, and it would be odd to add only partial coverage for toned Pinyin, Livonian, etc.
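For what it's worth, here is a rough sketch of what deriving entries from canonical decompositions might look like. It is written in Python rather than Emacs Lisp purely for illustration, it is not the code that was attached, and the ACCENT_KEYS table and the shorthand_for helper are hypothetical names whose key assignments are my assumptions rather than the contents of the patch.

```python
import unicodedata

# Hypothetical mapping from combining marks to ASCII prefix keys.
# These pairings are illustrative assumptions, not the patch's table.
ACCENT_KEYS = {
    "\u0300": "`",   # combining grave accent
    "\u0301": "'",   # combining acute accent
    "\u0302": "^",   # combining circumflex accent
    "\u0303": "~",   # combining tilde
    "\u0306": "u",   # combining breve
    "\u0307": ".",   # combining dot above
    "\u0308": '"',   # combining diaeresis
}


def shorthand_for(char):
    """Return (key_sequence, char) when char canonically decomposes into
    one ASCII letter plus one known combining mark; otherwise None."""
    decomp = unicodedata.decomposition(char)
    # Skip characters with no decomposition or a compatibility one (<...>).
    if not decomp or decomp.startswith("<"):
        return None
    parts = [chr(int(cp, 16)) for cp in decomp.split()]
    if len(parts) != 2:
        return None
    base, mark = parts
    if not (base.isascii() and base.isalpha()) or mark not in ACCENT_KEYS:
        return None
    return ACCENT_KEYS[mark] + base, char
```

Under these assumptions, shorthand_for("İ") yields (".I", "İ") and shorthand_for("ğ") yields ("ug", "ğ"), whereas "ǖ" yields None because its base is "ü" rather than an ASCII letter. That last case is the sort of character for which a two-character abbreviation is hard to come up with.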

>  > --------------090904020002020306060104
>  > Content-Type: text/x-patch;
>  >  name="0001-C-x-8-shorthands-for-curved-quotes-Euro-etc.patch"
>
>    This MIME part sure wants ‘; charset=UTF-8’.  Otherwise, Gnus
>    does no decoding, and Emacs shows the contents with the likes of
>    \304\260.

Hmm, it works for me. I use Thunderbird to read the top level message, and it spins off an Emacs to display the attachment with no problem. The web-site archive at <http://bugs.gnu.org/20499#60> also works for me with Firefox.

It's common for people to send the output of "git send-email" as attachments; if this doesn't work with Gnus I suppose a Gnus user (i.e. not me :-) should file a bug report. I looked around the net and found other Gnus users with similar problems and some code that worked for them; please see <http://bewatermyfriend.org/p/2011/00a/> and/or <http://blog.printf.net/articles/tag/emacs/>. But that material appears to be several years old, which leads me to hope that recent-enough Gnus versions already do the right thing.
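As an aside on the \304\260 display quoted above: those two octal escapes are the UTF-8 encoding of İ (U+0130), so the bytes in the attachment are intact and only the decoding step is missing. A small Python sketch shows this; the Latin-1 line is just an assumed example of a wrong charset, not a claim about what Gnus actually does.

```python
# The two octets that were displayed as \304\260 (octal escapes).
raw = bytes([0o304, 0o260])

# Decoded as UTF-8, they form a single character: U+0130, dotted capital I.
as_utf8 = raw.decode("utf-8")

# Decoded as Latin-1 (an assumed wrong charset), each octet becomes
# its own character, which is the typical mojibake pattern.
as_latin1 = raw.decode("latin-1")

print(as_utf8)    # İ
print(as_latin1)  # Ä°
```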

Attachment: 0001-C-x-8-shorthands-for-curved-quotes-Euro-etc.patch
Description: Text Data

