From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] [PATCH] Add a few useful aliases
Date: Tue, 8 Apr 2008 11:26:16 +0200
User-agent: KMail/1.5.4

Daniel Richard G. wrote:
> Do you honestly want your version of iconv(1) to be the odd man out that
> requires such a workaround?
This is exactly the argument used in chain letters.
> If I were writing a program that identifies and returns an encoding, then
> yes, the output should be in the canonical form. But here, you are saying
> that you care more about making users change their way of working to the
> tool, rather than the tool to the user. That the principle of "be lenient
> in what you accept, and strict in what you produce" holds no water with
> you.
Yes. Usually - when there are few implementations that need to deal with the
topic - I agree with "be lenient in what you accept". But here, there are
so many programs to deal with. If I said yes for libiconv, then you or
someone else would make the same request for Mozilla, then another one for
mutt, then another one for Samba, and so on. 20 years from now, programmers
would still be asked to add new aliases!
> If the key words
> "multiple other implementations" aren't a good enough reason for GNU
> libiconv-iconv(1) to recognize these aliases
They are not. When there are multiple implementations of a thing, and a
standard, then the standard should matter.
> Oh, if only that iconv(1) didn't buffer its entire input in memory, thereby
> rendering it useless for large files....)
Eeek, you are right. This is weird. glibc iconv reads all input into memory
before processing it. The source code has a comment

  #ifdef _POSIX_MAPPED_FILES
        /* We have possibilities for reading the input file.  First try
           to mmap() it since this will provide the fastest solution.  */

but it is not even the fastest:
With iconv from libiconv, which reads the file piecemeal:

  $ time iconv -f ISO-8859-1 -t UCS-2 < some-100mb-file > /dev/null
  real    1m5.545s
  user    0m52.919s
  sys     0m6.642s

With glibc iconv:

  $ time /usr/bin/iconv -f ISO-8859-1 -t UCS-2 < some-100mb-file > /dev/null
  real    1m11.679s
  user    0m8.046s
  sys     0m2.472s

And look at 'top' while it's processing...
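
To illustrate what "reading the file piecemeal" can look like, here is a minimal sketch built on the POSIX iconv(3) interface. It is not taken from the libiconv or glibc sources. The function name convert_stream and the buffer sizes are hypothetical, chosen only for this example. The idea is that memory use stays bounded by the two fixed buffers, regardless of the input size.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libiconv or glibc): convert the stream
   `in` from encoding `from` to encoding `to`, writing the result to `out`,
   using fixed-size buffers.  Returns 0 on success, -1 on any error. */
int convert_stream(FILE *in, FILE *out, const char *from, const char *to)
{
    char inbuf[4096];        /* illustrative sizes, not tuned */
    char outbuf[16384];
    size_t pending = 0;      /* bytes of an incomplete character carried over */
    int rc = -1;

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    for (;;) {
        size_t n = fread(inbuf + pending, 1, sizeof inbuf - pending, in);
        if (n == 0) {
            /* Read error, or input ended in the middle of a character. */
            if (ferror(in) || pending > 0)
                goto done;
            break;
        }

        char *src = inbuf;
        size_t srcleft = pending + n;

        while (srcleft > 0) {
            char *dst = outbuf;
            size_t dstleft = sizeof outbuf;
            size_t r = iconv(cd, &src, &srcleft, &dst, &dstleft);
            size_t produced = sizeof outbuf - dstleft;

            if (fwrite(outbuf, 1, produced, out) != produced)
                goto done;
            if (r == (size_t)-1) {
                if (errno == E2BIG)
                    continue;   /* output buffer was full: it is flushed, retry */
                if (errno == EINVAL)
                    break;      /* incomplete character at end of chunk */
                goto done;      /* EILSEQ or another failure */
            }
        }

        /* Keep the unconverted tail for the next round. */
        memmove(inbuf, src, srcleft);
        pending = srcleft;
    }

    /* Emit any final shift sequence for stateful encodings. */
    {
        char *dst = outbuf;
        size_t dstleft = sizeof outbuf;
        if (iconv(cd, NULL, NULL, &dst, &dstleft) == (size_t)-1)
            goto done;
        size_t produced = sizeof outbuf - dstleft;
        if (fwrite(outbuf, 1, produced, out) != produced)
            goto done;
    }
    rc = 0;

done:
    iconv_close(cd);
    return rc;
}
```

A caller could invoke it as convert_stream(stdin, stdout, "ISO-8859-1", "UCS-2"), mirroring the command lines above. On systems where iconv is not in libc, linking with -liconv may be needed.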
Can you please report this in the glibc bug tracker at
http://sourceware.org/bugzilla/ ?
Bruno