bug-bash

Re: Segfault: Lone surrogate followed by locale change


From: Eduardo A . Bustamante López
Subject: Re: Segfault: Lone surrogate followed by locale change
Date: Fri, 10 Nov 2017 07:19:43 -0600
User-agent: NeoMutt/20170609 (1.8.3)

On Fri, Nov 10, 2017 at 01:59:46PM +0100, Egmont Koblinger wrote:
[...]
> On Ubuntu Artful (glibc-2.26), this tiny snippet reproducibly crashes bash:
> 
> LC_ALL=en_US.UTF-8     # or any other UTF-8 locale
> echo -e '\ud800'       # or any other lone high or low surrogate
> LC_ALL=en_US.UTF-8     # or any available locale

I'm able to reproduce it in the `devel' branch:

(gdb) r
Starting program: /home/dualbus/src/gnu/build-bash-devel/bash 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
dualbus@ubuntu:~/src/gnu/build-bash-devel$ LC_ALL=en_US.UTF-8
dualbus@ubuntu:~/src/gnu/build-bash-devel$ echo -e '\ud800' 
���
dualbus@ubuntu:~/src/gnu/build-bash-devel$ LC_ALL=en_US.UTF-8 

Program received signal SIGSEGV, Segmentation fault.
__gconv_close (cd=0x0) at gconv_close.c:35
35      gconv_close.c: No such file or directory.
(gdb) bt
#0  __gconv_close (cd=0x0) at gconv_close.c:35
#1  0x00007ffff662eb7f in iconv_close (cd=<optimized out>) at iconv_close.c:35
#2  0x000055555576dcb8 in u32reset () at ../../../bash/lib/sh/unicode.c:102
#3  0x00005555556e9f7a in set_locale_var (var=0x603000171a00 "LC_ALL", value=0x602000207430 "en_US.UTF-8") at ../bash/locale.c:215
#4  0x00005555556432e1 in sv_locale (name=0x603000171a00 "LC_ALL") at ../bash/variables.c:5671
#5  0x0000555555641c8c in stupidly_hack_special_variables (name=0x603000171a00 "LC_ALL") at ../bash/variables.c:5280
#6  0x00005555556759a8 in do_assignment_internal (word=0x602000204770, expand=1) at ../bash/subst.c:3225
#7  0x0000555555675d08 in do_word_assignment (word=0x602000204770, flags=0) at ../bash/subst.c:3263
#8  0x00005555556a335e in expand_word_list_internal (list=0x602000205d70, eflags=31) at ../bash/subst.c:11080
#9  0x00005555556a0b25 in expand_words (list=0x602000205d70) at ../bash/subst.c:10635
#10 0x0000555555628701 in execute_simple_command (simple_command=0x603000171940, pipe_in=-1, pipe_out=-1, async=0, fds_to_close=0x6020002073f0) at ../bash/execute_cmd.c:4230
#11 0x00005555556167b4 in execute_command_internal (command=0x603000171910, asynchronous=0, pipe_in=-1, pipe_out=-1, fds_to_close=0x6020002073f0) at ../bash/execute_cmd.c:821
#12 0x0000555555614edb in execute_command (command=0x603000171910) at ../bash/execute_cmd.c:393
#13 0x00005555555e164f in reader_loop () at ../bash/eval.c:172
#14 0x00005555555dc882 in main (argc=1, argv=0x7fffffffe138, env=0x7fffffffe148) at ../bash/shell.c:804

(gdb) frame 2
#2  0x000055555576dcb8 in u32reset () at ../../../bash/lib/sh/unicode.c:102
102           iconv_close (localconv);
(gdb) p localconv
$1 = (iconv_t) 0x0


The problem is that Bash treats UTF-8 as a special case: in a UTF-8 locale,
`u32cconv' never assigns `localconv', so it keeps its initial value of 0x0
(see the gdb output above). On the next locale switch, `u32reset' calls
`iconv_close' on that never-opened descriptor, and glibc crashes in
`__gconv_close (cd=0x0)'.

I think the fix looks something like this:


diff --git a/lib/sh/unicode.c b/lib/sh/unicode.c
index a6e3058f..2f64315e 100644
--- a/lib/sh/unicode.c
+++ b/lib/sh/unicode.c
@@ -272,6 +272,7 @@ u32cconv (c, s)
   if (u32init == 0)
     {
       utf8locale = locale_utf8locale;
+      localconv = (iconv_t)-1; /* initialize */
       if (utf8locale == 0)
        {
 #if HAVE_LOCALE_CHARSET
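
For illustration, here is a minimal standalone sketch of the sentinel pattern
the patch relies on. It is not Bash's code: `conv', `conv_ready',
`conv_setup' and `conv_reset' are hypothetical names, and the
UTF-16LE/UTF-8 codeset pair is an arbitrary choice. The idea is that the
setup path always stores (iconv_t)-1 (the value `iconv_open' returns on
failure) before deciding whether to open a descriptor, so the reset path
can tell "nothing to close" apart from a real descriptor.

```c
#include <iconv.h>

/* Hypothetical stand-ins for the file-scope state in lib/sh/unicode.c. */
static iconv_t conv = (iconv_t)-1;  /* (iconv_t)-1 means "no descriptor open" */
static int conv_ready = 0;

/* Set up conversion state.  In the UTF-8 case no descriptor is opened,
   so conv must hold the sentinel rather than whatever it held before. */
void
conv_setup (int is_utf8)
{
  if (conv_ready)
    return;
  conv = (iconv_t)-1;
  if (!is_utf8)
    conv = iconv_open ("UTF-16LE", "UTF-8");  /* (iconv_t)-1 on failure */
  conv_ready = 1;
}

/* Tear down the state.  Returns 1 if a descriptor was closed, 0 otherwise. */
int
conv_reset (void)
{
  int closed = 0;

  if (conv_ready && conv != (iconv_t)-1)
    {
      iconv_close (conv);
      closed = 1;
    }
  conv = (iconv_t)-1;
  conv_ready = 0;
  return closed;
}
```

With this shape, calling `conv_setup (1)' followed by `conv_reset ()' never
reaches `iconv_close', which is the path that crashed in the backtrace above.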


