bug-gnu-emacs

bug#12242: Emacs 24.2 RC1 build fails on OpenBSD


From: Eli Zaretskii
Subject: bug#12242: Emacs 24.2 RC1 build fails on OpenBSD
Date: Fri, 24 Aug 2012 11:46:05 +0300

> From: Chong Yidong <cyd@gnu.org>
> Cc: mituharu@math.s.chiba-u.ac.jp,  monnier@iro.umontreal.ca,  
> handa@m17n.org,  jca@wxcvbn.org,  12242@debbugs.gnu.org
> Date: Fri, 24 Aug 2012 11:25:26 +0800
> 
> > Can I have a 1-day grace to try debugging the OpenBSD crash?  Jérémie
> > generously gave me a login on the machine where it happened, and I'd
> > like a chance to try debugging it.
> 
> OK.  Good luck, and thanks for all your efforts with this bug.

I committed a fix for this as emacs-24 branch revision 108120.  It is
somewhat of a phenomenological nature, because I could not actually
catch the entire sequence of calls and actions leading to the crash,
which might have allowed me to fix the root cause where it happens, if
possible.  (It turns out GDB on OpenBSD doesn't support hardware
watchpoints, which makes catching references to variables painfully
slow, certainly when a deadline is looming.)

I did see that the crash happens because a 'heap' structure recorded
in a memory control block maintained by ralloc.c refers to addresses
that aren't managed as part of the linked list of 'heap' structures.
It is therefore wrong to dereference such bogus 'heap' structures to
update them.  The crash happens because 'heap' pointed to memory that
was beyond our break point; however, I found that we dereference such
bogus 'heap's in more places, and survive that only because, by sheer
luck, they are still within our address space.  (ralloc.c does not
relinquish memory to the system until it has enough excess memory to
justify that.)

The solution I checked in is not to dereference 'heap' pointers that
are not in the linked list of heaps we maintain.
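
To illustrate the idea, here is a minimal sketch of such a guard.  It
is not the committed patch: the names 'heap_sketch', 'heap_in_list'
and 'update_heap_end' are hypothetical, and the structure is a
simplified stand-in for the 'heap' descriptor in ralloc.c.  The point
is only that membership is checked by comparing pointers, so a
suspect pointer is never dereferenced before it is known to be good.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a ralloc.c heap descriptor.  */
struct heap_sketch
{
  struct heap_sketch *next;
  struct heap_sketch *prev;
  char *start;
  char *end;
};

/* Return nonzero if H is one of the nodes in the list that begins at
   FIRST.  Only pointer values are compared; H itself is never
   dereferenced, so a bogus H is harmless here.  */
static int
heap_in_list (const struct heap_sketch *first, const struct heap_sketch *h)
{
  const struct heap_sketch *p;

  for (p = first; p != NULL; p = p->next)
    if (p == h)
      return 1;
  return 0;
}

/* Set H->end to NEW_END, but only if H is a heap we actually manage.
   Return 1 if the update was done, 0 if H was not in the list and
   was therefore left alone.  */
static int
update_heap_end (struct heap_sketch *first, struct heap_sketch *h,
                 char *new_end)
{
  if (!heap_in_list (first, h))
    return 0;
  h->end = new_end;
  return 1;
}
```

The list walk costs time linear in the number of heaps, which should
be acceptable as long as that list stays short; that is an assumption
of this sketch, not a measurement.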

The fixed version survived the command that crashed and in addition 3
different bootstraps, one on OpenBSD where the crash happened and 2
more (optimized and unoptimized) on MS-Windows.  I also checked that
the MS-DOS build, which also uses ralloc.c, still works OK with the
offending commands after the patch.  Those are the only systems I have
access to which use ralloc.c.

I do suggest another pretest, to make sure this fix is solid.

I will also work on the trunk on removing calls to xmalloc inside the
functions called by maybe_unify_char, which should probably eliminate
the original problem (although they also call make-char-table,
which still allocates memory, albeit in smaller chunks).

Thanks.





