Re: MPS: Please check if scratch/igc builds with native compilation
From: Andrea Corallo
Subject: Re: MPS: Please check if scratch/igc builds with native compilation
Date: Tue, 21 May 2024 14:17:19 -0400
User-agent: Gnus/5.13 (Gnus v5.13)
Gerd Möllmann <gerd.moellmann@gmail.com> writes:
> Andrea Corallo <acorallo@gnu.org> writes:
>
>> At least here the error seems reproducible. Bootstrapping with -j1
>> makes native compiling leim/ja-dic/ja-dic.el always fail.
>>
>> And if I run it under gdb I see we get a SIGSEGV in
>> 'maybe_resize_hash_table' at fns.c:4987
>>
>> memcpy (key, h->key, old_size * sizeof *key);
>
> That's a new one for me. Maybe you are hitting a read/write barrier?
Ah right maybe, interesting!
> I think Eli & Helmut can help here with what to do for the signals in
> GDB. (On macOS, MPS is using Mach exceptions, not signals.)
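To make the barrier idea concrete, here is a minimal sketch of how a protection-based barrier can surface as a SIGSEGV inside a plain memcpy. It is not MPS code: it assumes GNU/Linux, and the names `barrier_demo`, `on_segv`, `page` and `faults` are made up for the illustration. The page is protected, the first read faults, the handler lifts the protection, and the faulting instruction is retried.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *page;                    /* the page behind the "barrier" */
static size_t page_size;
static volatile sig_atomic_t faults;  /* how often the barrier was hit */

/* Hypothetical handler: lift the protection if the fault is on our page.
   Calling mprotect here is an assumption that holds on Linux in practice;
   POSIX does not list it as async-signal-safe. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
  (void) sig;
  (void) ctx;
  char *addr = (char *) info->si_addr;
  if (addr >= page && addr < page + page_size)
    {
      faults++;
      mprotect(page, page_size, PROT_READ | PROT_WRITE);
      return;                         /* the faulting access is retried */
    }
  signal(SIGSEGV, SIG_DFL);           /* not ours: let the retry crash */
}

/* Returns the number of barrier hits seen during one memcpy, or -1. */
int barrier_demo(void)
{
  struct sigaction sa;
  char copy[32];

  faults = 0;
  page_size = (size_t) sysconf(_SC_PAGESIZE);
  page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED)
    return -1;
  strcpy(page, "old hash table keys");

  memset(&sa, 0, sizeof sa);
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGSEGV, &sa, NULL) != 0)
    return -1;

  mprotect(page, page_size, PROT_NONE);  /* raise the barrier */
  memcpy(copy, page, sizeof copy);       /* faults, then completes */
  copy[sizeof copy - 1] = '\0';
  fprintf(stderr, "copied \"%s\" after %d fault(s)\n", copy, (int) faults);

  munmap(page, page_size);
  return (int) faults;
}
```

Calling `barrier_demo ()` from a small main and running it under GDB should stop at the memcpy with SIGSEGV even though the program recovers on its own. `handle SIGSEGV nostop noprint pass` tells GDB to pass such signals straight to the program.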
>
>> with the following bt
>>
>> (gdb) bt
>> #0 maybe_resize_hash_table (h=0x7fffe7dabd48) at fns.c:4987
>> #1 hash_put (h=0x7fffe7dabd48, key=XIL(0x7fffe4fc297b), value=XIL(0x30),
>> hash=1644298) at fns.c:5162
>> #2 0x0000555555817fc0 in Fputhash (key=XIL(0x7fffe4fc297b),
>> value=XIL(0x30), table=<optimized out>) at fns.c:5993
>> #3 0x00007ffff14f6313 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #4 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc010) at
>> eval.c:3032
>> #5 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #6 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc0d0) at
>> eval.c:3032
>> #7 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #8 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc190) at
>> eval.c:3032
>> #9 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #10 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc250) at
>> eval.c:3032
>> #11 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #12 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc310) at
>> eval.c:3032
>> #13 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #14 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc3d0) at
>> eval.c:3032
>> #15 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #16 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc490) at
>> eval.c:3032
>> #17 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #18 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc550) at
>> eval.c:3032
>> #19 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #20 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc610) at
>> eval.c:3032
>> #21 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #22 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc6d0) at
>> eval.c:3032
>> #23 0x00007ffff14f6476 in
>> F627974652d72756e2d2d73747269702d6c697374_byte_run__strip_list_0 () at
>> /home/andcor03/emacs4/src/../native-lisp/30.0.50-00c2e4a4/preloaded/byte-run-79ff048e-d52588ab.eln
>> #24 0x00005555557fdbac in Ffuncall (nargs=2, args=0x7fffffffc760) at
>> eval.c:3032
>> #25 0x00007ffff14f692c in
>> F627974652d72756e2d73747269702d73796d626f6c2d706f736974696f6e73_byte_run_strip_symbol_positions_0
>> ()
>> [...]
>>
>> Which is admittedly different from what I saw on the command line.
>>
>>> To debug this, I changed the check in igc.c to not assert, but print
>>> the PID, and enter an endless loop sleeping. This makes it possible to
>>> attach to the process with LLDB.
>>>
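The "print the PID and sleep" trick described above can be sketched roughly as follows. This is not the actual change made to igc.c; `wait_for_debugger` and `debugger_attached` are hypothetical names, and the flag is only there so the loop can be left from the debugger.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical flag: set it to 1 from the debugger to resume execution. */
volatile int debugger_attached = 0;

/* Instead of aborting, report the PID and poll until someone attaches.
   Returns the number of one-second polls that were needed. */
int wait_for_debugger(const char *what)
{
  int polls = 0;

  fprintf(stderr, "%s: pid %ld is waiting for a debugger\n",
          what, (long) getpid());
  while (!debugger_attached)
    {
      sleep(1);
      polls++;
    }
  return polls;
}
```

One would call `wait_for_debugger ("forwarded object")` where the assertion used to be, attach with `gdb -p PID` or `lldb -p PID`, inspect the state, and then set `debugger_attached` to 1 to let the process continue.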
>>> In all cases I investigated in this way, I'm seeing a pattern: What is
>>> happening is that a function in the Emacs core is called from a
>>> native-compiled function. Things look like, simplified,
>>>
>>>   /* In some .eln */
>>>   Lisp_Object d_reloc[100];
>>>
>>>   Lisp_Object some_native_compiled_lisp_function ()
>>>   {
>>>     Lisp_Object frame[2];
>>>     frame[0] = d_reloc[17]; // some symbol
>>>     frame[1] = ...
>>>     f_reloc->funcall (2, frame);
>>>   }
>>>
>>> where f_reloc is a large struct with function pointer members for the
>>> functions called from the .eln; the details don't matter here. We then
>>> land in Ffuncall in the Emacs core, and the first element of its args
>>> vector, a symbol, is found to be forwarded, which leads to the assertion.
>>>
>>> d_reloc in the .eln is scanned in igc.c, and it being on the control
>>> stack, in frame[], or in a register, should pin it, one would assume.
>>> So how come Ffuncall in Emacs receives an invalid symbol?
>>>
>>> I've checked that d_reloc is indeed scanned by fix_comp_unit. The
>>> check gives me reasonable confidence that this "should work". But as
>>> an alternative, I also made all the things like d_reloc in the .elns
>>> ambiguous roots, so that they cannot possibly be moved, if all works as
>>> expected.
>>>
>>> - No change, it still asserts in the same way.
>>>
>>> - Changing optimization levels - no change.
>>> - Changing from arm64 to x86_64 - no change.
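To spell out the invariant being relied on here, a toy model in plain C follows. It is not MPS and not igc.c: every name in it is made up, and it only shows how a reference seen by an ambiguous scan pins its object, while a copy of the reference that no scan sees ends up pointing at a forwarded object.

```c
#include <stddef.h>

enum tag { TAG_SYMBOL, TAG_FORWARDED };

struct object
{
  enum tag tag;
  struct object *forward;  /* new location once tag == TAG_FORWARDED */
  int pinned;              /* set when an ambiguous reference was seen */
};

static struct object to_space[4];
static size_t to_used;

/* Ambiguous scan: the word might be a reference, so pin, never move. */
void scan_ambiguous(struct object *maybe_ref)
{
  if (maybe_ref)
    maybe_ref->pinned = 1;
}

/* Exact scan: move the object unless it is pinned, then fix the slot. */
void scan_exact(struct object **slot)
{
  struct object *o = *slot;

  if (o->tag == TAG_FORWARDED)
    {
      *slot = o->forward;
      return;
    }
  if (o->pinned)
    return;
  struct object *copy = &to_space[to_used++];
  *copy = *o;
  o->tag = TAG_FORWARDED;
  o->forward = copy;
  *slot = copy;
}

/* Returns 1 if the copy in frame[0] sees a forwarded object afterwards. */
int stale_after_collection(int stack_is_scanned)
{
  struct object symbol = { TAG_SYMBOL, NULL, 0 };
  struct object *d_reloc_slot = &symbol;  /* stands in for d_reloc[17] */
  struct object *frame0 = d_reloc_slot;   /* stands in for frame[0] */

  to_used = 0;
  if (stack_is_scanned)
    scan_ambiguous(frame0);               /* control stack is ambiguous */
  scan_exact(&d_reloc_slot);
  return frame0->tag == TAG_FORWARDED;
}
```

In this model `stale_after_collection (1)` yields 0 and `stale_after_collection (0)` yields 1, so a forwarded symbol reaching Ffuncall would correspond to a reference that was alive somewhere no scan looked. That is only the shape of the question, not a diagnosis.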
>>
>> That's very bizarre, I have a hard time believing we are hitting such a
>> bug :/ I hope we are missing something.
>
> Yes, bizarre is a good description. I'm out of ideas.
Do you think it is very difficult to debug MPS to understand why a certain
object is being moved (while it should not be)? On GNU/Linux we can record
an rr trace (so that everything is reproducible) and step backward and
forward to try to shed some light on this, maybe?
Andrea