bug-libunistring


From: Ludovic Courtès
Subject: [bug-libunistring] ‘mem_cd_iconveh_internal’ failure on FreeBSD and Darwin
Date: Thu, 31 May 2012 12:51:53 +0200
User-agent: Gnus/5.110018 (No Gnus v0.18) Emacs/24.0.93 (gnu/linux)

Hi,

The code below passes invalid UTF-32 input to ‘u32_conv_from_encoding’:

#include <uniconv.h>

/* (string->utf16 "hello, world") */
static const char u16[] =
  {
    0, 104, 0, 101, 0, 108, 0, 108, 0, 111, 0, 44, 0, 32,
    0, 119, 0, 111, 0, 114, 0, 108, 0, 100
  };

int
main ()
{
  size_t len = 0;
  uint32_t *u32;  /* u32_conv_from_encoding returns uint32_t * */

  u32 = u32_conv_from_encoding ("UTF-32", iconveh_question_mark,
                                u16, sizeof u16,  /* invalid UTF-32 */
                                NULL, NULL, &len);
  return 0;
}

/*
   Local Variables:
   compile-command: "gcc -o unistring-bug unistring-bug.c -lunistring"
   End:
 */

With libunistring 0.9.3 on GNU/Linux, it works as expected.  However, it
leads to an ‘abort’ on Darwin and FreeBSD.

The backtrace on FreeBSD 8.2 is:

--8<---------------cut here---------------start------------->8---
(gdb) bt full
#0  0x0000000800997fcc in kill () from /lib/libc.so.7
No symbol table info available.
#1  0x0000000800996dcb in abort () from /lib/libc.so.7
No symbol table info available.
#2  0x0000000800658764 in mem_cd_iconveh_internal (src=0x400780 "", srclen=24, cd=0x800e040c0,
    cd1=0x800e04180, cd2=0xffffffffffffffff, handler=iconveh_question_mark, extra_alloc=0,
    offsets=0x0, resultp=0x7fffffffe238, lengthp=0x7fffffffe230) at striconveh.c:485
        outptr = 0x7fffffffd074 ""
        outsize = 4092
        incremented = false
        res = 18446744073709551615
        grow = false
        inptr = 0x400798 "UTF-32"
        insize = 0
        tmp = {align = 1061109567, 
  buf = "????", '\0' <repeats 1636 times>, 
"�\213�\000\b\000\000\000\222z�\000\b\000\000\000\000&S\000\b\000\000\000�\212�\000\b\000\000\000�\212�\000\b\000\000\000�\020S\000\b\000\000\000�MP\000\b\000\000\000�\213�\000\b\000\000\000\000&S\000\b\000\000\000\a\000\000\000\000\000\000\000�rP\000\b",
 '\0' <repeats 203 times>, "|ZP\000\b", '\0' <repeats 11 times>, 
"\001\000\000\000\000\000\000\0000`S\000\b", '\0' <repeats 17 times>, "\002\000 
\021S\000\b\000\000\0000���\177\000\000\001\000\000\000\000\000\000\0000`S\000\b",
 '\0' <repeats 11 times>, "�\222�\a\000\000\000\000�\\P\000"...}
        initial_result = 0x7fffffffd070 "????"
        result = 0x7fffffffd070 "????"
        allocated = 4096
        length = 4
        last_length = 18446744073709551615
        hex = "0123456789ABCDEF"
#3  0x00000008006599fb in libunistring_mem_cd_iconveh (src=0x400780 "", srclen=24,
    cd=0x7fffffffe240, handler=iconveh_question_mark, offsets=0x0, resultp=0x7fffffffe238,
    lengthp=0x7fffffffe230) at striconveh.c:1011
No locals.
#4  0x0000000800659c41 in libunistring_mem_iconveh (src=0x400780 "", srclen=24,
    from_codeset=0x400798 "UTF-32", to_codeset=0x7fffffffe320 "UTF-8//TRANSLIT",
    handler=iconveh_question_mark, offsets=0x0, resultp=0x7fffffffe420, lengthp=0x7fffffffe418)
    at striconveh.c:1095
        cd = {cd = 0x800e040c0, cd1 = 0x800e04180, cd2 = 0xffffffffffffffff}
        result = 0x0
        length = 0
        retval = 0
#5  0x000000080065a01f in mem_iconveha_notranslit (src=0x400780 "", srclen=24,
    from_codeset=0x400798 "UTF-32", to_codeset=0x7fffffffe320 "UTF-8//TRANSLIT",
    handler=iconveh_question_mark, offsets=0x0, resultp=0x7fffffffe420, lengthp=0x7fffffffe418)
    at striconveha.c:158
        retval = 0
#6  0x000000080065a2cc in libunistring_mem_iconveha (src=0x400780 "", srclen=24,
    from_codeset=0x400798 "UTF-32", to_codeset=0x8006e36d8 "UTF-8", transliterate=true,
    handler=iconveh_question_mark, offsets=0x0, resultp=0x7fffffffe420, lengthp=0x7fffffffe418)
    at striconveha.c:238
        retval = 0
        len = 5
        to_codeset_suffixed = 0x7fffffffe320 "UTF-8//TRANSLIT"
#7  0x0000000800663645 in u8_conv_from_encoding (fromcode=0x400798 "UTF-32",
    handler=iconveh_question_mark, src=0x400780 "", srclen=24, offsets=0x0, resultbuf=0x0,
    lengthp=0x7fffffffe490) at uniconv/u8-conv-from-enc.c:89
        result = 0x0
        length = 0
#8  0x0000000800662d93 in u32_conv_from_encoding (fromcode=0x400798 "UTF-32",
    handler=iconveh_question_mark, src=0x400780 "", srclen=24, offsets=0x0, resultbuf=0x0,
    lengthp=0x7fffffffe500) at u-conv-from-enc.h:50
        utf8_string = (unistring_uint8_t *) 0x0
        utf8_length = 0
        result = (unistring_uint32_t *) 0x400798
#9  0x00000000004006bd in main ()
(gdb) list
480                         /* The input is invalid in FROM_CODESET.  Eat up one byte
481                            and emit a question mark.  */
482                         if (!incremented)
483                           {
484                             if (insize == 0)
485                               abort ();
486                             inptr++;
487                             insize--;
488                           }
489                         result[length] = '?';
--8<---------------cut here---------------end--------------->8---

The compiler is “gcc (GCC) 4.2.1 20070719  [FreeBSD]”.

Any ideas?

Thanks,
Ludo’.