Re: string-ports issue on Windows
From: Christopher Lam
Subject: Re: string-ports issue on Windows
Date: Sun, 26 May 2019 18:52:16 +0800
Addendum - I'd like to confirm whether this is a Guile bug (guile-2.2 on Windows):
- set the locale to a non-English one so that (setlocale LC_ALL) returns
"French_France.1252"
- call (strftime "%B %Y" (localtime 4000000)) - that's 4x10^6 seconds after
the epoch - which should return "février 1970"
but the following error arises instead:
Throw to key `decoding-error' with args `("scm_from_utf8_stringn" "input locale conversion error" 0 #vu8(102 233 118 114 105 101 114 32 49 57 55 48))'.
Is this a bug?
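For what it's worth, the bytes in the error arguments look like single-byte Latin text rather than UTF-8. Below is a minimal sketch that decodes them both ways. It assumes the (ice-9 iconv) module is available, and it uses "ISO-8859-1" as a stand-in for code page 1252 because the two agree on every byte in this sample. The names raw-bytes and decodes-as-utf8? are mine, for illustration only.

```scheme
(use-modules (rnrs bytevectors)
             (ice-9 iconv))

;; Bytes copied verbatim from the decoding-error arguments.
(define raw-bytes
  #vu8(102 233 118 114 105 101 114 32 49 57 55 48))

;; Decode as a single-byte Latin encoding (assumption: ISO-8859-1 stands in
;; for Windows-1252; byte 233 is e-acute in both).
(define as-latin1
  (bytevector->string raw-bytes "ISO-8859-1"))

;; Hypothetical helper: #t if the bytes decode cleanly as UTF-8, #f if
;; decoding throws.
(define (decodes-as-utf8? bv)
  (catch #t
    (lambda () (bytevector->string bv "UTF-8") #t)
    (lambda args #f)))

(display as-latin1)
(newline)
(display (decodes-as-utf8? raw-bytes))
(newline)
```

If the first decode gives the expected month and year while the UTF-8 check fails, that would suggest the locale-encoded output of the C strftime is being handed to a UTF-8 decoder, but I have not confirmed this in the Guile sources.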
On Tue, 14 May 2019 at 12:42, Christopher Lam <address@hidden>
wrote:
> Hi Mark
> Final update - first, we've reused your efficient substring-replace
> function in
> https://github.com/Gnucash/gnucash/commit/7d15e6e4e727c87fb4a501e924c4ae02276e508d
> from a few years ago.
> Second, the email thread
> https://lists.gnu.org/archive/html/guile-devel/2014-03/msg00060.html
> confirmed a lot of issues in guile-2.0 could be solved in Windows by
> upgrading to guile-2.2. So, GnuCash has now upgraded to guile-2.2 on
> Windows and the string-ports are now behaving.
> Thank you (twice)
> :)
>
> On Fri, 19 Apr 2019 at 10:26, Christopher Lam <address@hidden>
> wrote:
>
>> Hi,
>> The patch *does* work and handles Unicode properly :) There are
>> unintended consequences, however: other (probably C-based) string code
>> on Windows now reads the lira symbol as unexpected characters (e.g.
>> lira symbol -> "â‚º", i.e. #xe2 #x201a #xba), but that is
>> outside the scope of this post.
>> Thank you again!
>>
>> On Thu, 18 Apr 2019 at 21:20, Mark H Weaver <address@hidden> wrote:
>>
>>> Hi again,
>>>
>>> Earlier, I wrote:
>>>
>>> > Christopher Lam <address@hidden> writes:
>>> >
>>> >> Hi Mark
>>> >> Thank you so much for looking into this.
>>> >> I'm reviewing the GnuCash for Windows package (v3.5, released
>>> >> April 2019),
>>> >> which contains the following libraries:
>>> >> - guile 2.0.14
>>> >
>>> > Ah, for some reason I thought you were using Guile 2.2. That explains
>>> > the problem.
>>> >
>>> > In Guile 2.0, string ports internally used the locale encoding by
>>> > default, which meant that any characters not supported by the locale
>>> > encoding would be munged.
>>> >
>>> > Guile 2.2 changed the behavior of string ports to always use UTF-8
>>> > internally, which ensures that all valid Guile strings can pass through
>>> > unmunged.
>>> >
>>> > So, this problem would almost certainly be fixed by updating to
>>> > Guile 2.2.
>>>
>>> It's probably a good idea to update to Guile 2.2 anyway, but I'd like to
>>> also offer the following workaround, which monkey patches the string
>>> port procedures in Guile 2.0 to behave more like Guile 2.2.
>>>
>>> Note that it only patches the Scheme APIs for string ports, and not the
>>> underlying C functions. It might be that some code, possibly within
>>> Guile itself, creates a string port using the C functions, and such
>>> string ports may still munge characters.
>>>
>>> Anyway, if you want to try it, arrange for GnuCash to evaluate the code
>>> below, after initializing Guile.
>>>
>>> Mark
>>>
>>>
>>> (when (string=? (effective-version) "2.0")
>>>   ;; When using Guile 2.0.x, use monkey patching to change the
>>>   ;; behavior of string ports to use UTF-8 as the internal encoding.
>>>   ;; Note that this is the default behavior in Guile 2.2 or later.
>>>   (let* ((mod                     (resolve-module '(guile)))
>>>          (orig-open-input-string  (module-ref mod 'open-input-string))
>>>          (orig-open-output-string (module-ref mod 'open-output-string))
>>>          (orig-object->string     (module-ref mod 'object->string))
>>>          (orig-simple-format      (module-ref mod 'simple-format)))
>>>
>>>     (define (open-input-string str)
>>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>>         (orig-open-input-string str)))
>>>
>>>     (define (open-output-string)
>>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>>         (orig-open-output-string)))
>>>
>>>     (define (object->string . args)
>>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>>         (apply orig-object->string args)))
>>>
>>>     (define (simple-format . args)
>>>       (with-fluids ((%default-port-encoding "UTF-8"))
>>>         (apply orig-simple-format args)))
>>>
>>>     (define (call-with-input-string str proc)
>>>       (proc (open-input-string str)))
>>>
>>>     (define (call-with-output-string proc)
>>>       (let ((port (open-output-string)))
>>>         (proc port)
>>>         (get-output-string port)))
>>>
>>>     (module-set! mod 'open-input-string       open-input-string)
>>>     (module-set! mod 'open-output-string      open-output-string)
>>>     (module-set! mod 'object->string          object->string)
>>>     (module-set! mod 'simple-format           simple-format)
>>>     (module-set! mod 'call-with-input-string  call-with-input-string)
>>>     (module-set! mod 'call-with-output-string call-with-output-string)
>>>
>>>     (when (eqv? (module-ref mod 'format) orig-simple-format)
>>>       (module-set! mod 'format simple-format))))
>>>
>>
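
For anyone wanting to check their own build, here is a rough sketch of a round-trip test for string ports, based on the behaviour described in the thread. The names string-port-round-trip and sample are hypothetical, and the lira sign (U+20BA) is used only because it is the character discussed above.

```scheme
;; Write a string through an output string port and return what comes out.
(define (string-port-round-trip str)
  (call-with-output-string
    (lambda (port)
      (display str port))))

;; U+20BA (lira sign), built from its code point so that this file's own
;; encoding does not matter.
(define sample (string (integer->char #x20ba)))

;; #t means the character survived the string port unchanged.
(define round-trip-ok?
  (string=? sample (string-port-round-trip sample)))

(display round-trip-ok?)
(newline)
```

Going by the explanation quoted above, this should report success on Guile 2.2, and on Guile 2.0 only once the workaround has been evaluated (or when the locale encoding can represent the character).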