--- Begin Message ---
Subject: Intermittent segfaults when parsing (?) custom package from repo.
Date: Sun, 16 Jan 2022 15:12:08 +0900
User-agent: mblaze/1.1
Hey Guix,
While working on a package recently, I have been encountering intermittent
segfaults during builds.
The segfault seems to only occur when I have some error in my code that causes
a crash, and the segfaults tend to cluster, appearing unexpectedly for a few
build attempts, and then disappearing right as I think they are reproducible
and try to grab an strace or something. Unfortunately, that's about the extent
of the information I have been able to glean.
The latest segfault happened with the attached package definition. Note, the
offending code is at line 77, where I forgot to remove a docstring from a
variable that used to be a procedure.
Given that non-segfault runs seem to error out so early, is this better thought
to be an issue with Guile? FWIW, I see nothing interesting under
/var/log/guix-daemon.log. However, lines like the following show up in the
kernel message ring buffer:
[318026.268095] guix[7419]: segfault at 18 ip 00007f56ef6a01a3 sp
00007fff15588980 error 4 in libgc.so.1.4.3[7f56ef693000+1b000]
[318026.268116] Code: 8d 2d 71 93 01 00 90 4a 8d 04 e5 00 00 00 00 48 89 04 24
49 8b 45 00 4e 8b 3c e0 4d 85 ff 74 2a 31 ed 0f 1f 44 00 00 4d 89 fe <4d> 8b 7f
08 49 8b 7e 10 48 f7 d7 e8 6d 35 ff ff 85 c0 0f 84 3d 01
[318029.715621] guix[7761]: segfault at 10 ip 00007f9e80b919b9 sp
00007fffd1b2ad20 error 4 in libgc.so.1.4.3[7f9e80b7b000+1b000]
[318029.715638] Code: f7 d2 48 21 d0 48 8b 13 4c 8d 3c c5 00 00 00 00 48 8b 04
c2 48 85 c0 74 78 48 89 ea 48 f7 d2 eb 09 48 8b 40 08 48 85 c0 74 67 <48> 39 10
75 f2 44 8b 05 03 1c 04 00 49 f7 d4 4c 89 60 10 41 bc 01
[318041.537171] guix[8660]: segfault at 10 ip 00007f0d2603c9b9 sp
00007ffc72e998d0 error 4 in libgc.so.1.4.3[7f0d26026000+1b000]
[318041.537185] Code: f7 d2 48 21 d0 48 8b 13 4c 8d 3c c5 00 00 00 00 48 8b 04
c2 48 85 c0 74 78 48 89 ea 48 f7 d2 eb 09 48 8b 40 08 48 85 c0 74 67 <48> 39 10
75 f2 44 8b 05 03 1c 04 00 49 f7 d4 4c 89 60 10 41 bc 01
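To make these lines easier to compare, here is a minimal Python sketch that parses the three "segfault at" entries above (with the mail's line wrapping undone) and computes where in libgc.so each crash occurred, i.e. the instruction pointer minus the mapping base printed in brackets. The helper name `parse_segfault` and the regular expression are mine, not part of any tool. The error-code decoding assumes the usual x86 page-fault bits (bit 0: page present, bit 1: write, bit 2: user mode), so "error 4" would read as a user-mode read of an unmapped page.

```python
import re

# The three kernel log entries quoted above, re-joined onto single lines.
LINES = [
    "[318026.268095] guix[7419]: segfault at 18 ip 00007f56ef6a01a3 sp "
    "00007fff15588980 error 4 in libgc.so.1.4.3[7f56ef693000+1b000]",
    "[318029.715621] guix[7761]: segfault at 10 ip 00007f9e80b919b9 sp "
    "00007fffd1b2ad20 error 4 in libgc.so.1.4.3[7f9e80b7b000+1b000]",
    "[318041.537171] guix[8660]: segfault at 10 ip 00007f0d2603c9b9 sp "
    "00007ffc72e998d0 error 4 in libgc.so.1.4.3[7f0d26026000+1b000]",
]

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing one 'segfault at' line, or None if no match."""
    # Collapse whitespace so wrapped log lines still match.
    m = SEGFAULT_RE.search(" ".join(line.split()))
    if m is None:
        return None
    err = int(m["err"], 16)
    return {
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction inside the mapped object.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        # Assumed x86 page-fault error-code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    for line in LINES:
        info = parse_segfault(line)
        print(info["pid"], info["object"], hex(info["offset"]),
              "fault at", hex(info["fault_addr"]))
```

For the entries above, this gives offsets 0xd1a3, 0x169b9 and 0x169b9, so the last two crashes hit the same instruction in libgc. The faulting addresses (0x18 and 0x10) are small enough to look like field accesses through a null pointer, though that is only a guess from the numbers.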
On the off chance it's helpful, below are some random machine details. Please
let me know if there is anything more pointed or specific I can provide.
$ guix system describe
Generation 4 Jan 12 2022 18:59:48 (current)
file name: /var/guix/profiles/system-4-link
canonical file name: /gnu/store/sb01mnd31a9x2a0bznzlb2lsy91qwgk6-system
label: GNU with Linux 5.15.13
bootloader: grub-efi
root device: label: "root"
kernel: /gnu/store/bdf2yw10jr02mhyiwm05yp2qibywqz47-linux-5.15.13/bzImage
channels:
guix-bmw:
repository URL: git://git@git.wilsonb.com/guix-bmw.git
branch: master
commit: 9fb59483371bb5d59fbd27e47baac88263410ac5
nonguix:
repository URL: https://gitlab.com/nonguix/nonguix
branch: master
commit: 023508df4804dbd9f39cb197525f166bc259f995
guix:
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 9a2cf2c9232e229f7bb1ab065df2cf0740f65996
configuration file:
/gnu/store/pkf4vzlck0g32hkyvijmlcnp15vh8njv-configuration.scm
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 43 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
CPU family: 23
Model: 24
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
Frequency boost: enabled
CPU max MHz: 2100.0000
CPU min MHz: 1400.0000
BogoMIPS: 4191.75
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep
mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid
extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2
movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic
cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext
perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb
vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt
xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save
tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic
v_vmsave_vmload vgif overflow_recov succor smca sme sev sev_es
Virtualization: AMD-V
L1d cache: 128 KiB (4 instances)
L1i cache: 256 KiB (4 instances)
L2 cache: 2 MiB (4 instances)
L3 cache: 4 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled
via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and
__user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full AMD retpoline, IBPB
conditional, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
$ lsmem
RANGE SIZE STATE REMOVABLE BLOCK
0x0000000000000000-0x00000000bfffffff 3G online yes 0-23
0x0000000100000000-0x00000005bfffffff 19G online yes 32-183
Memory block size: 128M
Total online memory: 22G
Total offline memory: 0B
$ lsirq
IRQ TOTAL NAME
RES 535178549 Rescheduling interrupts
LOC 262669447 Local timer interrupts
82 125390312 PCI-MSI 2621440-edge amdgpu
CAL 66644945 Function call interrupts
TLB 28850164 TLB shootdowns
12 7316777 IO-APIC 12-edge i8042
77 2140601 PCI-MSI 2097156-edge iwlwifi:queue_4
9 1653508 IO-APIC 9-fasteoi acpi
73 1583984 PCI-MSI 2097152-edge iwlwifi:default_queue
IWI 1102407 IRQ work interrupts
65 759824 PCI-MSI 3145728-edge ahci[0000:06:00.0]
1 532374 IO-APIC 1-edge i8042
76 253951 PCI-MSI 2097155-edge iwlwifi:queue_3
75 186427 PCI-MSI 2097154-edge iwlwifi:queue_2
74 168179 PCI-MSI 2097153-edge iwlwifi:queue_1
7 100000 IO-APIC 7-fasteoi pinctrl_amd
52 41159 PCI-MSI 524289-edge nvme0q1
58 38523 PCI-MSI 524295-edge nvme0q7
56 38206 PCI-MSI 524293-edge nvme0q5
53 36960 PCI-MSI 524290-edge nvme0q2
55 30286 PCI-MSI 524292-edge nvme0q4
54 28877 PCI-MSI 524291-edge nvme0q3
57 27421 PCI-MSI 524294-edge nvme0q6
59 27142 PCI-MSI 524296-edge nvme0q8
MCP 8512 Machine check polls
35 2317 PCI-MSI 2627584-edge xhci_hcd
81 1290 PCI-MSI 2633728-edge snd_hda_intel:card1
44 1215 PCI-MSI 2629632-edge xhci_hcd
67 338 PCI-MSI 1572864-edge rtsx_pci
80 316 PCI-MSI 2623488-edge snd_hda_intel:card0
0 34 IO-APIC 2-edge timer
78 33 PCI-MSI 2097157-edge iwlwifi:exception
34 28 PCI-MSI 524288-edge nvme0q0
8 1 IO-APIC 8-edge rtc0
25 0 PCI-MSI 18432-edge PCIe PME, aerdrv
26 0 PCI-MSI 20480-edge PCIe PME, aerdrv
27 0 PCI-MSI 22528-edge PCIe PME, aerdrv
28 0 PCI-MSI 28672-edge PCIe PME, aerdrv, pciehp
29 0 PCI-MSI 133120-edge PCIe PME
30 0 PCI-MSI 135168-edge PCIe PME
36 0 PCI-MSI 2627585-edge xhci_hcd
37 0 PCI-MSI 2627586-edge xhci_hcd
38 0 PCI-MSI 2627587-edge xhci_hcd
39 0 PCI-MSI 2627588-edge xhci_hcd
40 0 PCI-MSI 2627589-edge xhci_hcd
41 0 PCI-MSI 2627590-edge xhci_hcd
42 0 PCI-MSI 2627591-edge xhci_hcd
45 0 PCI-MSI 2629633-edge xhci_hcd
46 0 PCI-MSI 2629634-edge xhci_hcd
47 0 PCI-MSI 2629635-edge xhci_hcd
48 0 PCI-MSI 2629636-edge xhci_hcd
49 0 PCI-MSI 2629637-edge xhci_hcd
50 0 PCI-MSI 2629638-edge xhci_hcd
51 0 PCI-MSI 2629639-edge xhci_hcd
60 0 PCI-MSI 524297-edge nvme0q9
61 0 PCI-MSI 524298-edge nvme0q10
62 0 PCI-MSI 524299-edge nvme0q11
63 0 PCI-MSI 524300-edge nvme0q12
71 0 PCI-MSI 2625537-edge ccp-1
79 0 PCI-MSI 1048576-edge enp2s0
NMI 0 Non-maskable interrupts
SPU 0 Spurious interrupts
PMI 0 Performance monitoring interrupts
RTR 0 APIC ICR read retries
TRM 0 Thermal event interrupts
THR 0 Threshold APIC interrupts
DFR 0 Deferred Error APIC interrupts
MCE 0 Machine check exceptions
ERR 0
MIS 0
PIN 0 Posted-interrupt notification event
NPI 0 Nested posted-interrupt event
PIW 0 Posted-interrupt wakeup event
jsoftware.scm
Description: Text Data
--- End Message ---
--- Begin Message ---
Subject: Re: bug#53296: Intermittent segfaults when parsing (?) custom package from repo.
Date: Tue, 14 Feb 2023 13:04:52 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Hi,
>> If there’s no reliable way to reproduce it, I’ll close the bug soon.
>
> This was almost certainly a local hardware issue. The machine this happened on
> ended up showing progressively more mysterious behaviour that could not be
> reproduced on other machines.
>
> Feel free to close!
So closing!
Thanks,
simon
--- End Message ---