bug-gnu-emacs

bug#20623: XML and HTML files with encoding/charset="utf-8" declaration


From: Vincent Lefevre
Subject: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration loose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Wed, 8 Aug 2018 11:47:48 +0200
User-agent: Mutt/1.10.1+58 (10c1ac4b) vl-108074 (2018-07-29)

On 2017-12-04 12:38:57 -0500, Stefan Monnier wrote:
> > Now reported with "fix this or get removed from the distribution"
> > severity at <https://bugs.debian.org/883434>.
> 
> I'm curious to see if the OP's "grave" severity settings will stick.
> "Grave" is defined in https://www.debian.org/Bugs/Developer#severities as:
> 
>     makes the package in question unusable or mostly so, or causes data
>     loss, or introduces a security hole allowing access to the accounts
>     of users who use the package.
> 
> The only part that could arguably apply is "causes data loss", but even
> that is stretching the meaning of those words, I think.

Actually, the problem is not only that the coding system (in the Emacs
sense) is changed, but also that this change is invisible to the
user (mainly because the BOM is usually not displayed), which makes
the issue even worse. Basically, this is invisible data corruption.
Even though only three bytes are removed (the UTF-8 BOM is EF BB BF),
this introduces breakage in other applications, and it can take the
user a long time to find the cause.
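
To illustrate how such a change can be detected outside Emacs, here is a
minimal Python sketch. It is not taken from the bug report: the helper
name has_utf8_bom and the sample XML content are made up for the example,
and the decode/encode round trip only imitates a save that switches from
utf-8-with-signature to plain utf-8. It relies on the standard codecs
module, where the UTF-8 BOM is the byte sequence EF BB BF.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical file content: an XML document saved with a BOM.
original = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="utf-8"?>\n<a/>\n'

# Imitate the reported behaviour: read with a signature-aware codec,
# then write back with plain utf-8, which does not emit a BOM.
resaved = original.decode("utf-8-sig").encode("utf-8")

print(has_utf8_bom(original))  # BOM present before the save
print(has_utf8_bom(resaved))   # BOM absent after the save
print(len(original) - len(resaved))  # number of bytes silently dropped
```

Comparing the first bytes of a file before and after saving in this way
shows the difference even though the text looks identical in an editor.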

Emacs should not change the coding system when this is not needed, and
when it does need to, it must first get confirmation from the user.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)




