Re: [bug-gawk] gawk 4.x series mmap attempts to alocates 32GB of memory
From: Aharon Robbins
Subject: Re: [bug-gawk] gawk 4.x series mmap attempts to alocates 32GB of memory and fails when using printf("%c") supplied with large floating point value.
Date: Thu, 10 Jul 2014 22:42:02 -0700
User-agent: Heirloom mailx 12.5 6/20/10
Hi.
> Date: Fri, 11 Jul 2014 09:47:52 +0900
> Subject: Re: [bug-gawk] gawk 4.x series mmap attempts to alocates 32GB of
> memory and fails when using printf("%c") supplied with large floating point
> value.
> From: green fox <address@hidden>
Do you have a real name? Just wondering.
> To: Aharon Robbins <address@hidden>
> Cc: address@hidden
>
> Just a thought, _if_ I was to write code, which patch would you prefer
> to accept?
>
> A) Routines to address the issue of handling UTF-8 strings when -b is in
> effect.
>
> B) Provide length(),substr(),index(),print() with extended capability to
> handle raw single byte data. (even when one is on a utf-8 system)
>
> The reason for asking is that when one is reading from a (disk / server)
> that does not match the local character set, the current gawk setup
> fails really badly.
I'm aware of this. I don't have a good solution to this very thorny problem.

I would actually prefer that instead of a patch, you write a loadable
extension using the API defined for that purpose in the 4.1 release.
You could then contribute it to the gawkextlib project.

The manual fully documents how to write extensions. I believe that
the API gives you everything you need to write the extended
versions of the functions you desire, without having to have them
built into the core gawk interpreter. (If not, then that should
be discussed separately, in terms of enhancing the API.)

I think such an extension would be a valuable thing to have.
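
As a rough illustration of the byte-oriented core such an extension might
wrap, here is a minimal sketch in plain C. The names byte_index and
byte_substr are hypothetical, the edge-case behavior (empty needle, start
below 1) is a simplifying assumption rather than gawk's exact semantics, and
the glue that registers functions through the extension API is deliberately
left out, since the manual documents it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helpers, for illustration only. They are not part of
 * gawk or of its extension API. Strings are passed as (pointer, length)
 * pairs so that embedded NUL bytes are handled, and every position is
 * counted in bytes regardless of the locale's character set.
 */

/* Return the 1-based byte offset of needle in hay, or 0 if not found.
   Assumption: an empty needle is reported as "not found". */
static size_t byte_index(const char *hay, size_t hlen,
                         const char *needle, size_t nlen)
{
    size_t i;

    if (nlen == 0 || nlen > hlen)
        return 0;
    for (i = 0; i + nlen <= hlen; i++) {
        if (memcmp(hay + i, needle, nlen) == 0)
            return i + 1;
    }
    return 0;
}

/* Return a pointer to the byte at 1-based position start and store in
   *outlen how many bytes (at most count) are available from there.
   Assumption: a start below 1 is simply treated as 1. */
static const char *byte_substr(const char *s, size_t slen,
                               size_t start, size_t count, size_t *outlen)
{
    size_t avail;

    if (start < 1)
        start = 1;
    if (start > slen) {
        *outlen = 0;
        return s + slen;
    }
    avail = slen - (start - 1);
    *outlen = (count < avail) ? count : avail;
    return s + (start - 1);
}
```

With the six-byte string "caf\xc3\xa9!" (the "é" encoded as two UTF-8
bytes), byte_index reports "!" at position 6, where a character-aware
index() in a UTF-8 locale would count it as the fifth character.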
HTH,
Thanks,
Arnold