bug-texinfo

Re: Non-ASCII characters in @include search path


From: Patrice Dumas
Subject: Re: Non-ASCII characters in @include search path
Date: Sat, 26 Feb 2022 22:49:07 +0100

On Sat, Feb 26, 2022 at 09:29:15PM +0000, Gavin Smith wrote:
> On Sat, Feb 26, 2022 at 9:11 PM Patrice Dumas <pertusus@free.fr> wrote:
> > The whole output file is encoded, the problem is that you encoded
> > $image_file, it should not be, it is assumed to be decoded from the
> > document. image_path could be encoded, but then the encoding should be
> > passed such that it can be re-decoded, for error messages, for instance.
> 
> It would probably be easier to do it the way you said and decode all
> the file names and encode them just before use. It's too confusing
> otherwise, even if doing it that way would give a little more
> flexibility for non-UTF-8 input files and locales (assuming we
> actually did it properly, and didn't ever break it by mistake).
> 
> I looked at HTML.pm and found it hard to understand where variables or
> functions had the word "filename" in them, what exactly this referred
> to, if it was supposed to be the encoded or unencoded filename
> (encoded for creating and finding files, unencoded for linking to
> them). I imagine this would be confusing on an ongoing basis if it
> meant both in different places.

To me, the easiest way to avoid confusion is to keep everything as
character strings and to encode only when a file name is needed for an
operation on the file system (stat, -e, open, readdir...).  That way
it works in every case.
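
To make the idea concrete, here is a minimal sketch of "character
strings everywhere, encode only at the file system boundary".  It is
written in Python purely as an illustration, not in the Perl used by
texi2any, and the helper names (encode_for_fs, file_exists,
read_text_file, list_directory) are hypothetical, not taken from the
Texinfo code.

```python
import os

def encode_for_fs(name: str, encoding: str = "utf-8") -> bytes:
    """Encode a character-string file name just before touching the file system."""
    return name.encode(encoding)

def file_exists(name: str, encoding: str = "utf-8") -> bool:
    # Existence test (the analogue of Perl's -e) on the encoded name.
    return os.path.exists(encode_for_fs(name, encoding))

def read_text_file(name: str, encoding: str = "utf-8",
                   content_encoding: str = "utf-8") -> str:
    # The file *name* encoding and the file *content* encoding are independent.
    with open(encode_for_fs(name, encoding), "rb") as handle:
        return handle.read().decode(content_encoding)

def list_directory(directory: str, encoding: str = "utf-8") -> list:
    # With a bytes path, os.listdir returns bytes entries; decode them
    # immediately so that callers only ever see character strings.
    raw_entries = os.listdir(encode_for_fs(directory, encoding))
    return sorted(entry.decode(encoding) for entry in raw_entries)
```

Callers pass and receive only character strings, so error messages and
links can use the names directly, and the encoding step lives in one
place.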

As a side note, the use of File::Spec is normally a marker of code
that works on file paths.  But I am not sure that it is really used
systematically and correctly.

> I expect non-ASCII, non-UTF-8 filenames would be fairly rare, but if
> there is some use case where they don't work as intended in whatever
> we implement, there could be customization variables to control
> encoding and decoding of filenames to support these cases.

This should be easy to do by using Texinfo::Common::encode_file_name
systematically when encoding strings as file names, and by using
customization variables in that function.
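
A rough sketch of what such a single encoding function could look
like, again in Python for illustration only.  The function below is a
hypothetical stand-in and does not reproduce the actual signature or
behaviour of Texinfo::Common::encode_file_name; the option name
"file_name_encoding" is likewise an assumption.  It returns the
encoding together with the encoded name, so that the name can be
decoded again for error messages.

```python
import locale

DEFAULT_ENCODING = "utf-8"

def encode_file_name(name, options=None):
    """Encode a character-string file name, honouring a customization option.

    'options' is a plain dict standing in for customization variables;
    the key 'file_name_encoding' is hypothetical.
    """
    options = options or {}
    encoding = (options.get("file_name_encoding")
                or locale.getpreferredencoding(False)
                or DEFAULT_ENCODING)
    # Return the encoding as well, so the caller can decode the name later.
    return name.encode(encoding), encoding

def decode_file_name(raw, encoding):
    """Decode an encoded file name again, for instance for an error message."""
    return raw.decode(encoding, errors="replace")
```

With everything routed through one function, supporting a non-UTF-8
file system would only mean setting the option, not auditing every
call site.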

-- 
Pat


