[Gnu-arch-users] Re: Encoding handling proposal

From:	Stefan Monnier
Subject:	[Gnu-arch-users] Re: Encoding handling proposal
Date:	30 Aug 2004 13:34:34 -0400
User-agent:	Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50

> B) "Content-Type" should be a mandatory metadata string attribute.

In keeping with the "enforce naming convention" policy of Arch, I guess that
we could just use a mime.types file to map extensions to content types.
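A minimal sketch of that lookup, assuming the usual mime.types layout (a content type followed by its extensions). The sample table, the fallback type, and the function names are made up for illustration and are not part of tla:

```python
import os

# Hypothetical excerpt in the usual mime.types layout:
# a content type followed by the extensions that map to it.
SAMPLE_MIME_TYPES = """\
# comment lines start with '#'
text/plain      txt text
text/html       html htm
image/png       png
"""

DEFAULT_TYPE = "application/octet-stream"  # assumed fallback for unknown extensions


def parse_mime_types(text):
    """Build an extension -> content-type table from mime.types-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        content_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = content_type
    return table


def content_type_for(filename, table):
    """Look up a file's content type purely from its extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext, DEFAULT_TYPE)


table = parse_mime_types(SAMPLE_MIME_TYPES)
print(content_type_for("README.txt", table))  # text/plain
print(content_type_for("logo.PNG", table))    # image/png
print(content_type_for("Makefile", table))    # application/octet-stream
```

Nothing is stored per file here: the name alone decides the type, which is what makes this fit the naming-convention approach.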

> C) "Auto-Filter" should be a mandatory metadata boolean attribute.

As mentioned, I think this is unnecessary: Arch should keep handling files
as "sequences of bytes", just like most/all other tools do.  Meta-data has
been a recurrent theme in Unix and still hasn't appeared, so I wouldn't hold
my breath.  The current way encoding problems are solved is via tags in the
data (i.e. the data is self-describing), which have the advantage of
blending better within the Unix world.
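As one illustration of such in-band tags, here is a rough sketch that reads an encoding declaration out of the data itself. The tag convention ("coding: NAME" within the first two lines, as in Emacs-style file cookies) is an assumption picked for the example, and declared_encoding is a hypothetical helper:

```python
import re

# Assumed tag convention: a "coding: NAME" or "coding=NAME" marker
# somewhere in the first two lines of the file.
CODING_TAG = re.compile(rb"coding[:=][ \t]*([-\w.]+)")


def declared_encoding(data, default=None):
    """Return the encoding named by an in-band tag, or `default` if none."""
    for line in data.splitlines()[:2]:
        match = CODING_TAG.search(line)
        if match:
            return match.group(1).decode("ascii")
    return default


print(declared_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))  # latin-1
print(declared_encoding(b"just some bytes\n"))                   # None
```

The file stays a plain sequence of bytes for every tool that does not care about the tag.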

The various type-specific diff algorithms are only ways to optimize
changeset size and help merging, but they should all work correctly on
arbitrary binary files.
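To make that concrete, here is a small sketch of a line-oriented delta that never interprets the bytes it handles. It is not tla's changeset format; make_delta and apply_delta are hypothetical names, and Python's difflib stands in for a real diff engine:

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    delta = []
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("data", b"".join(new_lines[j1:j2])))
        # "delete" needs no entry: those old lines are simply not copied
    return delta


def apply_delta(old, delta):
    """Rebuild the new content from `old` and a delta from make_delta."""
    old_lines = old.splitlines(keepends=True)
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.append(op[1])
    return b"".join(out)


# Bytes that are not valid text in any common encoding still round-trip.
old = b"\xff\xfe\x00binary\nline two\n\x80\x81"
new = b"\xff\xfe\x00binary\nline 2\n\x80\x81\nextra"
print(apply_delta(old, make_delta(old, new)) == new)  # True
```

Splitting on line breaks only affects how small the delta gets; on a file with no line breaks at all it degrades to storing the whole new content, but the result is still correct.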

> D) There should be a filter/plugin architecture to enable a transcoding of
> files on input and output based on their content-types and user settings
> and user-provided parameters.

How is a utf-8 file going to be transcoded into latin-1 without loss?
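A quick way to see the problem, sketched with Python's built-in codecs (transcode and is_lossless are hypothetical helpers, not a proposed filter API):

```python
def transcode(data, source, target):
    """Decode `data` from `source` and re-encode it as `target`, strictly."""
    return data.decode(source).encode(target)


def is_lossless(data, source, target):
    """True if `data` survives a source -> target -> source round trip."""
    try:
        return transcode(transcode(data, source, target), target, source) == data
    except UnicodeError:
        return False


plain = "caf\u00e9".encode("utf-8")        # e-acute exists in latin-1
euro = "price: 5 \u20ac".encode("utf-8")   # the euro sign does not

print(is_lossless(plain, "utf-8", "latin-1"))  # True
print(is_lossless(euro, "utf-8", "latin-1"))   # False
```

Any character outside latin-1's 256 code points either makes the conversion fail or has to be replaced by something else, so the filter cannot be transparent in general.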

> E) Utilities such as "diff", "merge" and "annotate" (aka "blame") should be 
> provided by plugins mapped to content-types.

As mentioned by someone else, such type-specific algorithms (at least when
used for in-archive-changesets) should be "standard" within the
user community.

But I think it also makes sense to allow any wacko user-specific algorithm,
as long as it stays "for the user's eyes only", i.e. part of tla but not
part of Arch.  This gets us back to the "diff options" thread.

> G) Filenames and paths should use UTF-8 in the repository, and be transcoded
> to the proper encoding when a client accesses the local file system.

IIRC, that's basically what is planned.  For now, filenames are limited to
a subset of ASCII so the problem is currently moot.


        Stefan
