bug-gnu-utils

Re: strings: 8bit chars


From: Nick Clifton
Subject: Re: strings: 8bit chars
Date: 23 Dec 2002 11:38:09 +0000
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1

Hi Jari,

> Strings splits strings that have 8bit chars.
> Attached file contains 2 lines and one string on each line.
> 
> If this is locale based feature please do mention about it
> on man page.

Actually it is a "feature" of the strings program.  It uses the
ISPRINT() macro defined in safe-ctype.h, which rejects any byte with
the top bit set, so an 8-bit character ends the current string.
Usually this is the right thing to do, since the program is looking
through binaries for printable ASCII strings.
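
To illustrate the splitting, here is a minimal sketch in C.  It is not
the code from strings.c: is_ascii_printable() is a hypothetical stand-in
that only approximates ISPRINT() (assumed here to accept bytes 0x20
through 0x7e), and graphic_run_length() is an invented helper that
measures how far a scan gets before it hits a rejected byte.

```c
#include <stddef.h>

/* Hypothetical stand-in for ISPRINT(): printable 7-bit ASCII only.
   Any byte with the top bit set (>= 0x80) is rejected.  */
static int
is_ascii_printable (int c)
{
  return c >= 0x20 && c <= 0x7e;
}

/* Return the length of the leading run of "graphic" bytes in BUF,
   where graphic means printable ASCII or a tab.  The scan stops at
   the first byte that is neither, which is where a string would be
   cut off.  */
static size_t
graphic_run_length (const unsigned char *buf, size_t len)
{
  size_t i = 0;

  while (i < len && (is_ascii_printable (buf[i]) || buf[i] == '\t'))
    i++;
  return i;
}
```

Given the eight bytes 'c' 'a' 'f' 0xe9 ' ' 'b' 'a' 'r', the scan stops
after "caf", because 0xe9 has its top bit set.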

I have created the patch below to address this problem.  It adds a
new encoding to strings, selected with "-e S".  This encoding accepts
8-bit characters (bytes above 127) as well as the standard 7-bit ASCII
characters.
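
The acceptance rule can be sketched as a small standalone function.
This is only an illustration, with assumptions: string_isgraphic() is a
hypothetical name, the encoding is passed as a parameter rather than
read from the program's global, and the range check 0x20..0x7e stands in
for ISPRINT().

```c
/* Sketch of the acceptance rule for one character value C under the
   single-byte encodings 's' (7-bit) and 'S' (8-bit).  Returns nonzero
   if C may be part of a string.  */
static int
string_isgraphic (int c, char encoding)
{
  /* Values outside a single byte are never graphic.  */
  if (c < 0 || c > 255)
    return 0;

  /* Tab and printable ASCII are accepted under either encoding.  */
  if (c == '\t' || (c >= 0x20 && c <= 0x7e))
    return 1;

  /* Bytes with the top bit set are accepted only under 'S'.  */
  return encoding == 'S' && c > 127;
}
```

With this rule, 0xe9 is accepted under 'S' and rejected under 's',
while control characters such as 0x7f are rejected under both.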

Cheers
        Nick

2002-12-23  Nick Clifton  <address@hidden>

        * strings.c (isgraphic): Replace definition with STRING_ISGRAPHIC
        macro.  Handle 'S' encoding, accepting 8-bit characters.
        (main): Parse 'S' encoding.
        (get_char): Accept 'S' encoding.
        (print_strings): Use STRING_ISGRAPHIC.
        (usage): Document support of 'S' encoding.
        * doc/binutils.texi: Document support of 'S' encoding.
        * NEWS: Mention new feature.

Index: binutils/NEWS
===================================================================
RCS file: /cvs/src/src/binutils/NEWS,v
retrieving revision 1.30
diff -c -3 -p -w -r1.30 NEWS
*** binutils/NEWS       13 Dec 2002 13:19:44 -0000      1.30
--- binutils/NEWS       23 Dec 2002 10:32:20 -0000
***************
*** 1,5 ****
--- 1,7 ----
  -*- text -*-
  
+ * Added 'S' encoding to strings to allow the display of 8-bit characters.
+ 
  * Added --prefix-symbols=<text>, --prefix-sections=<text> and
    --prefix-alloc-sections=<text> to objcopy.
  
Index: binutils/strings.c
===================================================================
RCS file: /cvs/src/src/binutils/strings.c,v
retrieving revision 1.18
diff -c -3 -p -w -r1.18 strings.c
*** binutils/strings.c  30 Nov 2002 08:39:41 -0000      1.18
--- binutils/strings.c  23 Dec 2002 10:32:20 -0000
***************
*** 39,48 ****
     -o         Like -to.  (Some other implementations have -o like -to,
                others like -td.  We chose one arbitrarily.)
  
!    --encoding={s,b,l,B,L}
!    -e {s,b,l,B,L}
!               Select character encoding: single-byte, bigendian 16-bit,
!               littleendian 16-bit, bigendian 32-bit, littleendian 32-bit
  
     --target=BFDNAME
                Specify a non-default object file format.
--- 39,49 ----
     -o         Like -to.  (Some other implementations have -o like -to,
                others like -td.  We chose one arbitrarily.)
  
!    --encoding={s,S,b,l,B,L}
!    -e {s,S,b,l,B,L}
!               Select character encoding: 7-bit-character, 8-bit-character,
!               bigendian 16-bit, littleendian 16-bit, bigendian 32-bit,
!               littleendian 32-bit.
  
     --target=BFDNAME
                Specify a non-default object file format.
***************
*** 84,90 ****
  #endif
  #endif
  
! #define isgraphic(c) (ISPRINT (c) || (c) == '\t')
  
  #ifndef errno
  extern int errno;
--- 85,94 ----
  #endif
  #endif
  
! #define STRING_ISGRAPHIC(c) \
!       (   (c) >= 0 \
!        && (c) <= 255 \
!        && ((c) == '\t' || ISPRINT (c) || (encoding == 'S' && (c) > 127)))
  
  #ifndef errno
  extern int errno;
*************** main (argc, argv)
*** 267,272 ****
--- 268,274 ----
  
    switch (encoding)
      {
+     case 'S':
      case 's':
        encoding_bytes = 1;
        break;
*************** get_char (stream, address, magiccount, m
*** 460,465 ****
--- 461,467 ----
  
    switch (encoding)
      {
+     case 'S':
      case 's':
        r = buf[0];
        break;
*************** print_strings (filename, stream, address
*** 524,530 ****
          c = get_char (stream, &address, &magiccount, &magic);
          if (c == EOF)
            return;
!         if (c > 255 || c < 0 || !isgraphic (c))
            /* Found a non-graphic.  Try again starting with next char.  */
            goto tryline;
          buf[i] = c;
--- 526,532 ----
          c = get_char (stream, &address, &magiccount, &magic);
          if (c == EOF)
            return;
!         if (! STRING_ISGRAPHIC (c))
            /* Found a non-graphic.  Try again starting with next char.  */
            goto tryline;
          buf[i] = c;
*************** print_strings (filename, stream, address
*** 592,598 ****
          c = get_char (stream, &address, &magiccount, &magic);
          if (c == EOF)
            break;
!         if (c > 255 || c < 0 || !isgraphic (c))
            break;
          putchar (c);
        }
--- 594,600 ----
          c = get_char (stream, &address, &magiccount, &magic);
          if (c == EOF)
            break;
!         if (! STRING_ISGRAPHIC (c))
            break;
          putchar (c);
        }
*************** usage (stream, status)
*** 663,670 ****
    -t --radix={o,x,d}        Print the location of the string in base 8, 10 or 16\n\
    -o                        An alias for --radix=o\n\
    -T --target=<BFDNAME>     Specify the binary file format\n\
!   -e --encoding={s,b,l,B,L} Select character size and endianness:\n\
!                             s = 8-bit, {b,l} = 16-bit, {B,L} = 32-bit\n\
    -h --help                 Display this information\n\
    -v --version              Print the program's version number\n"));
    list_supported_targets (program_name, stream);
--- 664,671 ----
    -t --radix={o,x,d}        Print the location of the string in base 8, 10 or 16\n\
    -o                        An alias for --radix=o\n\
    -T --target=<BFDNAME>     Specify the binary file format\n\
!   -e --encoding={s,S,b,l,B,L} Select character size and endianness:\n\
!                             s = 7-bit, S = 8-bit, {b,l} = 16-bit, {B,L} = 32-bit\n\
    -h --help                 Display this information\n\
    -v --version              Print the program's version number\n"));
    list_supported_targets (program_name, stream);

Index: binutils/doc/binutils.texi
===================================================================
RCS file: /cvs/src/src/binutils/doc/binutils.texi,v
retrieving revision 1.25
diff -c -3 -p -w -r1.25 binutils.texi
*** binutils/doc/binutils.texi  19 Dec 2002 14:39:30 -0000      1.25
--- binutils/doc/binutils.texi  23 Dec 2002 10:32:22 -0000
*************** octal, @samp{x} for hexadecimal, or @sam
*** 1956,1966 ****
  @item -e @var{encoding}
  @itemx address@hidden
  Select the character encoding of the strings that are to be found.
! Possible values for @var{encoding} are: @samp{s} = single-byte
! characters (ASCII, ISO 8859, etc., default), @samp{b} = 16-bit
! Bigendian, @samp{l} = 16-bit Littleendian, @samp{B} = 32-bit Bigendian,
! @samp{L} = 32-bit Littleendian. Useful for finding wide character
! strings.
  
  @item address@hidden
  @cindex object code format
--- 1956,1966 ----
  @item -e @var{encoding}
  @itemx address@hidden
  Select the character encoding of the strings that are to be found.
! Possible values for @var{encoding} are: @samp{s} = single-7-bit-byte
! characters (ASCII, ISO 8859, etc., default), @samp{S} =
! single-8-bit-byte characters, @samp{b} = 16-bit bigendian, @samp{l} =
! 16-bit littleendian, @samp{B} = 32-bit bigendian, @samp{L} = 32-bit
! littleendian. Useful for finding wide character strings.
  
  @item address@hidden
  @cindex object code format
