bug#24603: [PATCHv5 08/11] Implement rules for title-casing Dutch ij ‘letter’ (bug#24603)
Fri, 17 Mar 2017 15:43:27 +0200
> From: Michal Nazarewicz <address@hidden>
> Cc: address@hidden
> Date: Thu, 16 Mar 2017 22:30:52 +0100
> > If this is not mandated by Unicode 9.0 (and not by the latest draft of
> > 10.0, AFAICS), shouldn't we have a user option for this, by default
> > off?
> I don’t really see why.
> If the goal is to implement Unicode then ‘ij’ handling should not be
> implemented at all and Unicode-mandated behaviour should not be
> configurable, but implementing Unicode is a means, not a goal in itself.
That's true, but my thinking was that where Unicode says something
shall be done we have more "moral authority" to implement what they
say. Whereas where there's no Unicode-mandated behavior, we should
consider the possibility that the behavior will be more controversial,
and let users decide.
> > I'm not sure I get this right: does this mean that writing in English
> > (or any other non-Dutch language) in a Dutch locale will automatically
> > capitalize "ij" to "IJ", just because the default value of
> > buffer-language is "nl_NL" or somesuch, and no specific language was
> > set for the buffer? Wouldn't that surprise users?
> Yes it does. And yes it would.
> This is currently the biggest blocker/concern for all the patches past
> 07/11 and I’m still wondering what would be the best solution.
> I thought about having a ‘language’ string property so that programming
> major modes would mark everything outside of comments as a ‘nil’
> language. This would require support from multiple major modes and
> likely complicate them.¹
> Or perhaps have an off-by-default ‘special-casing-mode’ which enables
> language-dependent casing rules. A similar effect could be accomplished
> by replacing ‘buffer-language’ with a nil-by-default ‘casing-locale’
> variable applicable only to casing, but I would miss ‘buffer-language’
> since I believe it might get used for other things.
I think buffer-language is a broader issue, so if we want to let
users control whether casing follows language rules, that should be a
separate setting, independent of the language, and it shouldn't be a
reason for not introducing buffer-language or language properties. So
with this in mind, I think your second proposal is better.
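To illustrate the separation discussed above, here is a minimal sketch in
Python (not Emacs code) of a title-casing helper where the language and the
"follow language rules" switch are independent, and the switch is off by
default. The names title_case_word, language and special_casing are
hypothetical and exist only for this example; the Dutch rule shown is the
simplified "leading ij becomes IJ" case from the patch title.

```python
def title_case_word(word, language=None, special_casing=False):
    """Title-case one word.

    The Dutch 'ij' rule is applied only when the caller opted in via
    special_casing (hypothetical flag, off by default) AND the language
    is Dutch. Knowing the language alone never changes the result.
    """
    if not word:
        return word
    if special_casing and language == "nl" and word[:2].lower() == "ij":
        # Treat the digraph as a single letter: both characters go upper.
        return "IJ" + word[2:].lower()
    # Language-independent default: upper-case only the first character.
    return word[0].upper() + word[1:].lower()


print(title_case_word("ijsselmeer", "nl"))                       # flag off
print(title_case_word("ijsselmeer", "nl", special_casing=True))  # flag on
```

With this shape, a user in a Dutch locale writing English text keeps the
default behaviour unless the casing switch is turned on explicitly.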
Btw, if we do introduce such properties, I think their values should
be symbols, not strings, like we already do, for example, with the
'charset' property we put on text decoded by some coding-systems. And
such a property will indeed allow more fine-grained language-dependent
behavior, provided that we find good ways of computing this property
according to user expectations.
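
As a rough sketch of what such fine-grained behavior could look like, the
Python below (again, not Emacs code) models a buffer as runs of text, each
tagged with a language value, where None stands in for a 'nil' language such
as program code outside comments. The function capitalize_spans and the
(text, language) representation are assumptions made for illustration only;
how the tags would actually be computed is the open question.

```python
import re


def capitalize_spans(spans):
    """Title-case every word, using the language tag of the run it is in.

    spans is a list of (text, language) pairs; language is None for text
    with no language, so only language-independent casing applies there.
    """
    out = []
    for text, language in spans:
        def cap(match, language=language):
            word = match.group(0)
            if language == "nl" and word[:2].lower() == "ij":
                # Dutch digraph rule, applied only inside Dutch-tagged runs.
                return "IJ" + word[2:].lower()
            return word[0].upper() + word[1:].lower()

        # [^\W\d_]+ matches runs of letters only (no digits or underscore).
        out.append(re.sub(r"[^\W\d_]+", cap, text))
    return "".join(out)


# Code outside the comment carries no language; the comment is Dutch.
print(capitalize_spans([("ijs = 1  ", None), ("# ijs is koud", "nl")]))
```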