bug-binutils
[Bug gold/18979] New: gold creates unnecessary padding within output text sections


From: srk31 at srcf dot ucam.org
Subject: [Bug gold/18979] New: gold creates unnecessary padding within output text sections
Date: Thu, 17 Sep 2015 17:23:17 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=18979

            Bug ID: 18979
           Summary: gold creates unnecessary padding within output text
                    sections
           Product: binutils
           Version: 2.25
            Status: NEW
          Severity: normal
          Priority: P2
         Component: gold
          Assignee: ccoutant at gmail dot com
          Reporter: srk31 at srcf dot ucam.org
                CC: ian at airs dot com
  Target Milestone: ---

Created attachment 8617
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8617&action=edit
Test case

When comparing link maps generated by ld.bfd and ld.gold, I found that gold's
section-sorting behaviour leaves unnecessary padding (code fills) in the
output. The following Linux/x86-64 assembly program illustrates this (also in
the attached tarball). 

# Generate a start symbol with no alignment. We throw in some more nops
# to take the gap down a bit.
    .section        .text.main,"ax",@progbits
    .globl  _start
    .type   _start, @function
_start:
    movq $60, %rax          # exit 
    movq $0x0, %rdi         # zero 
    syscall 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    nop 
    .size   _start, .-_start
# Generate another symbol with 16-byte alignment, ensuring 
# that gold inserts some padding.
    .section        .text.n,"ax",@progbits
    .align 16
    .globl  n
    .type   n, @function
n:
    movl    $0, %eax
    ret
    .size   n, .-n
# Generate some bytes in .text.startup. This will initially get laid out 
# after .text.main, then gold will flip it. In the process, it will leave
# in place the (now-redundant) padding preceding n, and then insert some
# more (zeroes this time, not nops).
    .section    .text.startup, "ax", @progbits #,"aw",@progbits
    .align 16
    .globl  q
    .type   q, @function
    .size   q, 3
q:
    nop
    nop
    nop



The Makefile in the tarball builds both bfd-linked and gold-linked outputs. For
the bfd case, objdump -rd gives us


test-bfd:     file format elf64-x86-64


Disassembly of section .text:

00000000004000e0 <q>:
  4000e0:       90                      nop
  4000e1:       90                      nop
  4000e2:       90                      nop

00000000004000e3 <_start>:
  4000e3:       48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  4000ea:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  4000f1:       0f 05                   syscall 
  4000f3:       90                      nop
  4000f4:       90                      nop
  4000f5:       90                      nop
  4000f6:       90                      nop
  4000f7:       90                      nop
  4000f8:       90                      nop
  4000f9:       90                      nop
  4000fa:       90                      nop
  4000fb:       90                      nop
  4000fc:       90                      nop
  4000fd:       90                      nop
  4000fe:       90                      nop
  4000ff:       90                      nop

0000000000400100 <n>:
  400100:       b8 00 00 00 00          mov    $0x0,%eax
  400105:       c3                      retq   



... which is what I'd expect, whereas for the gold case, we get the following.
Note the 13 bytes of (elided) zeroes before <n>.


test-gold:     file format elf64-x86-64


Disassembly of section .text:

0000000000400110 <q>:
  400110:       90                      nop
  400111:       90                      nop
  400112:       90                      nop

0000000000400113 <_start>:
  400113:       48 c7 c0 3c 00 00 00    mov    $0x3c,%rax
  40011a:       48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  400121:       0f 05                   syscall 
  400123:       90                      nop
  400124:       90                      nop
  400125:       90                      nop
  400126:       90                      nop
  400127:       90                      nop
  400128:       90                      nop
  400129:       90                      nop
  40012a:       90                      nop
  40012b:       90                      nop
  40012c:       90                      nop
  40012d:       90                      nop
  40012e:       90                      nop
  40012f:       0f 1f 40 00             nopl   0x0(%rax)
        ...

0000000000400140 <n>:
  400140:       b8 00 00 00 00          mov    $0x0,%eax
  400145:       c3                      retq   
  400146:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40014d:       00 00 00 
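The addresses in both dumps can be reproduced with a little alignment arithmetic. The following is only a sketch of one reading of the gold output, not gold's actual algorithm: it assumes a first pass lays out .text.main before .text.n (producing the 4-byte nopl fill), and that a second pass moves .text.startup to the front, keeps that fill, and aligns n again. The helper names and the two-pass model are assumptions made for illustration; the sizes come from the disassembly.

```cpp
#include <cstdint>

// Round `offset` up to the next multiple of `align` (a power of two).
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Sizes read off the disassembly above.
constexpr std::uint64_t kQSize = 3;       // three nops in .text.startup
constexpr std::uint64_t kStartSize = 28;  // 7 + 7 + 2 bytes of code plus 12 nops
constexpr std::uint64_t kAlign = 16;      // .align 16 in front of n

// Assumed pass 1: _start sits at offset 0, so n needs a code fill to reach 32.
constexpr std::uint64_t initial_fill() {
    return align_up(kStartSize, kAlign) - kStartSize;  // 4 bytes (the nopl)
}

// Assumed pass 2: q is moved to the front, the stale fill is kept, and n is
// aligned a second time.
constexpr std::uint64_t extra_padding() {
    return align_up(kQSize + kStartSize + initial_fill(), kAlign)
           - (kQSize + kStartSize + initial_fill());   // 13 bytes (the zeroes)
}

// Padding needed if the stale fill were dropped, as in the bfd output.
constexpr std::uint64_t minimal_padding() {
    return align_up(kQSize + kStartSize, kAlign)
           - (kQSize + kStartSize);                    // 1 byte (one nop)
}

static_assert(initial_fill() == 4, "matches the 4-byte nopl at 40012f");
static_assert(extra_padding() == 13, "matches the 13 elided zero bytes");
static_assert(minimal_padding() == 1, "matches the single pad nop at 4000ff");
```

Under this model gold places n at 0x400110 + 3 + 28 + 4 + 13 = 0x400140 and bfd places it at 0x4000e0 + 3 + 28 + 1 = 0x400100, which agrees with both dumps. The padding costs 17 bytes where 1 would do.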





Of course, excess padding is arguably not a bug. (I'd say it is, but that's
moot.) But this behaviour seems inconsistent with the comments in the code.
Stepping through the gold code in output.cc, I notice we hit the code for the
case where the output section is not going to be sorted. However, if I
understand correctly, any .text section is "sorted" by default, to handle
.text.startup and friends. So the first if-test below should be taken (which
sets generate_code_fills_at_write_ and so ensures the second is not), but it
isn't. This is in output.cc around line 2450.


  // Determine if we want to delay code-fill generation until the output
  // section is written.  When the target is relaxing, we want to delay fill
  // generating to avoid adjusting them during relaxation.  Also, if we are
  // sorting input sections we must delay fill generation.
  if (!this->generate_code_fills_at_write_
      && !have_sections_script
      && (sh_flags & elfcpp::SHF_EXECINSTR) != 0
      && parameters->target().has_code_fill()
      && (parameters->target().may_relax()
          || layout->is_section_ordering_specified()))
    {
      gold_assert(this->fills_.empty());
      this->generate_code_fills_at_write_ = true;
    }

  if (aligned_offset_in_section > offset_in_section
      && !this->generate_code_fills_at_write_
      && !have_sections_script
      && (sh_flags & elfcpp::SHF_EXECINSTR) != 0
      && parameters->target().has_code_fill())
    {
      // We need to add some fill data.  Using fill_list_ when
      // possible is an optimization, since we will often have fill
      // sections without input sections.
      off_t fill_len = aligned_offset_in_section - offset_in_section;


I notice also that the sort-time fill (the 13 bytes before <n> in the gold
output) uses zeroes rather than nops, which seems suspicious.
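The interaction between the two if-tests quoted above can be modelled with plain booleans. This is a hedged sketch, not gold code: the struct and function names are hypothetical, each field stands in for the expression named in its comment, and the "observed" inputs are an assumption (neither may_relax() nor is_section_ordering_specified() returning true for this link), which has not been verified against a debugger session.

```cpp
// Hypothetical stand-ins for the values the quoted conditions read.
struct FillInputs {
    bool generate_code_fills_at_write;  // this->generate_code_fills_at_write_
    bool have_sections_script;          // have_sections_script
    bool is_exec;                       // (sh_flags & elfcpp::SHF_EXECINSTR) != 0
    bool has_code_fill;                 // parameters->target().has_code_fill()
    bool may_relax;                     // parameters->target().may_relax()
    bool section_ordering_specified;    // layout->is_section_ordering_specified()
    bool needs_alignment;               // aligned_offset_in_section > offset_in_section
};

// Models the first if-test: should fill generation be delayed until write time?
inline bool delays_fill(const FillInputs& in) {
    return !in.generate_code_fills_at_write
        && !in.have_sections_script
        && in.is_exec
        && in.has_code_fill
        && (in.may_relax || in.section_ordering_specified);
}

// Models the second if-test, evaluated after the first has had its effect:
// is a code fill emitted immediately?
inline bool emits_fill_now(FillInputs in) {
    if (delays_fill(in))
        in.generate_code_fills_at_write = true;
    return in.needs_alignment
        && !in.generate_code_fills_at_write
        && !in.have_sections_script
        && in.is_exec
        && in.has_code_fill;
}
```

With the assumed inputs `FillInputs{false, false, true, true, false, false, true}`, `delays_fill` is false and `emits_fill_now` is true, which matches the eager nopl fill seen in the gold output. Flipping `section_ordering_specified` to true reverses both results, which is the behaviour the comment in output.cc leads one to expect whenever input sections are sorted.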
