Re: [Bug-wget] bad filenames (again)

From:	Andries E. Brouwer
Subject:	Re: [Bug-wget] bad filenames (again)
Date:	Wed, 12 Aug 2015 14:38:15 +0200
User-agent:	Mutt/1.5.21 (2010-09-15)

Hi Tim,

> Just a few questions.
> 
> 1.
> Why don't you use 'opt.locale' to check if the local encoding is UTF-8 ?

I thought that was usable only if ENABLE_IRI was defined.

> 2. 
> I don't understand how you distinguish between illegal and legal UTF-8 
> sequences. I guess only legal sequences should be unescaped. 
> Or to make it easy: if the string is valid UTF-8, do not escape.
> If it is not valid UTF-8, escape it.
> You could:
> Add unistr/u8-check to bootstrap.conf (./bootstrap thereafter),
> include #include "unistr.h" and use
> if (u8_check (s, strlen(s)) == 0) to test for validity.

Yes, I expected you to say something like this.

My reason: I consider this escaping a very doubtful activity.
In my eyes the correct code is not: always escape except when UTF-8,
but rather: never escape except perhaps when someone asks for it.
So the precise check for UTF-8 is in my eyes just bloat.
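
For reference, the validity test under discussion amounts to roughly the
following. This is only a sketch: is_valid_utf8() is a hand-written,
hypothetical stand-in for u8_check() so that the example needs no extra
library, and name_needs_escaping() is an invented name for the rule
"valid UTF-8: do not escape, otherwise: escape". Neither is code from wget.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for u8_check(): true if the n bytes at s are
   well-formed UTF-8 (no overlong forms, no surrogates, at most U+10FFFF). */
bool is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char c = s[i];
        unsigned char lo = 0x80, hi = 0xBF;   /* allowed range of 2nd byte */
        size_t len;

        if (c < 0x80) {                       /* plain ASCII */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            len = 2;
        } else if (c >= 0xE0 && c <= 0xEF) {
            len = 3;
            if (c == 0xE0) lo = 0xA0;         /* reject overlong forms */
            if (c == 0xED) hi = 0x9F;         /* reject surrogates */
        } else if (c >= 0xF0 && c <= 0xF4) {
            len = 4;
            if (c == 0xF0) lo = 0x90;         /* reject overlong forms */
            if (c == 0xF4) hi = 0x8F;         /* reject > U+10FFFF */
        } else {
            return false;                     /* stray continuation or bad lead */
        }

        if (n - i < len)
            return false;                     /* truncated sequence */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return false;
        for (size_t k = 2; k < len; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        i += len;
    }
    return true;
}

/* The proposed rule: leave valid UTF-8 alone, escape everything else. */
bool name_needs_escaping(const char *name)
{
    return !is_valid_utf8((const unsigned char *) name, strlen(name));
}
```

Under this rule a name in Latin-1 such as the bytes E9 74 E9 would be
escaped, while the same word encoded as C3 A9 74 C3 A9 would not.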

Moreover: what to do if the name is not valid UTF-8?
The current escaping produces something that is not valid UTF-8.
So doing the current escaping is certainly a mistake, not better
than using the name as-is. Invent a new type of escaping?
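
To make the problem concrete, here is a small sketch. It assumes, purely
for illustration, an escaper that treats the bytes 0x80..0x9F as control
characters and writes them as %XX while copying everything else unchanged.
The function name escape_c1_bytes() and this exact rule are assumptions
made for the example, not a quotation of the wget source.

```c
#include <stddef.h>

/* Hypothetical escaper (assumed rule): bytes 0x80..0x9F become %XX,
   all other bytes are copied as-is.
   out must have room for 3 * strlen(in) + 1 bytes. */
void escape_c1_bytes(const char *in, char *out)
{
    static const char hex[] = "0123456789ABCDEF";

    for (; *in; in++) {
        unsigned char c = (unsigned char) *in;

        if (c >= 0x80 && c <= 0x9F) {
            *out++ = '%';
            *out++ = hex[c >> 4];
            *out++ = hex[c & 0x0F];
        } else {
            *out++ = (char) c;
        }
    }
    *out = '\0';
}
```

With this assumed rule the two bytes C3 89 (a valid UTF-8 sequence) turn
into C3 followed by "%89": a lead byte with no continuation byte, so the
result is no longer valid UTF-8. A Latin-1 name such as E9 74 E9 passes
through unchanged and is exactly as invalid after escaping as before.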

So, for the time being, my previous patch avoided the old mistake,
without introducing new mistakes :-).

Andries
