Re: strings: 8bit chars
From: Nick Clifton
Subject: Re: strings: 8bit chars
Date: 23 Dec 2002 11:38:09 +0000
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1
Hi Jari,
> Strings splits strings that have 8bit chars.
> Attached file contains 2 lines and one string on each line.
>
> If this is locale based feature please do mention about it
> on man page.
Actually it is not locale based; it is a "feature" of the strings
program. It uses the ISPRINT() macro defined in safe-ctype.h, which is
locale-independent and rejects any byte with the top bit set. Usually
this is the right thing to do, since the program is looking through
binaries for printable ASCII strings.
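
To illustrate why a string containing an 8-bit character gets split, here
is a minimal sketch of a 7-bit-only scan. It is not the code from
strings.c: is_graphic_7bit() and count_strings_7bit() are hypothetical
names, and the range check 0x20..0x7e is assumed to stand in for ISPRINT().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pre-patch test: a tab or a printable
   7-bit ASCII character.  Any byte with the top bit set (0x80..0xff)
   fails this test.  */
static int
is_graphic_7bit (int c)
{
  return c == '\t' || (c >= 0x20 && c <= 0x7e);
}

/* Count the runs of at least MIN_LEN consecutive graphic bytes in BUF.
   A non-graphic byte ends the current run, which is how one string with
   an 8-bit character in the middle turns into two shorter ones.  */
static size_t
count_strings_7bit (const unsigned char *buf, size_t len, size_t min_len)
{
  size_t run = 0;
  size_t found = 0;
  size_t i;

  assert (buf != NULL);

  for (i = 0; i < len; i++)
    {
      if (is_graphic_7bit (buf[i]))
        run++;
      else
        {
          if (run >= min_len)
            found++;
          run = 0;
        }
    }

  /* Account for a run that reaches the end of the buffer.  */
  if (run >= min_len)
    found++;

  return found;
}
```

With the 12 bytes "caf\xe9 au lait" and a minimum length of 3, this sketch
reports two strings ("caf" and " au lait"); with a minimum length of 4 the
first fragment is too short and is lost entirely.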
I have created the patch below to address this problem. It adds a
new encoding to strings, the "S" encoding, selected with -e S or
--encoding=S. This encoding accepts 8-bit characters as well as the
standard 7-bit ASCII characters.
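
As a rough sketch of the acceptance test the patch introduces, the
function below follows the same logic as the STRING_ISGRAPHIC macro.
Two things are assumptions made for illustration: the encoding is passed
as a parameter (the patch reads a variable named encoding instead), and
the range 0x20..0x7e stands in for ISPRINT(). The names
string_is_graphic() and count_strings() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Accept C if it fits in one byte and is a tab, a printable 7-bit
   character, or (only for the 'S' encoding) a byte with the top bit
   set.  */
static int
string_is_graphic (int encoding, long c)
{
  if (c < 0 || c > 255)
    return 0;
  if (c == '\t' || (c >= 0x20 && c <= 0x7e))
    return 1;
  return encoding == 'S' && c > 127;
}

/* Count the runs of at least MIN_LEN consecutive accepted bytes in BUF
   under the given single-byte ENCODING ('s' or 'S').  */
static size_t
count_strings (const unsigned char *buf, size_t len, size_t min_len,
               int encoding)
{
  size_t run = 0;
  size_t found = 0;
  size_t i;

  assert (buf != NULL);

  for (i = 0; i < len; i++)
    {
      if (string_is_graphic (encoding, buf[i]))
        run++;
      else
        {
          if (run >= min_len)
            found++;
          run = 0;
        }
    }

  /* Account for a run that reaches the end of the buffer.  */
  if (run >= min_len)
    found++;

  return found;
}
```

For the 12 bytes "caf\xe9 au lait" with a minimum length of 3, the sketch
finds two strings under 's' and a single unbroken string under 'S'. Note
that 'S' accepts every byte from 128 to 255, including values that are
control characters in some 8-bit character sets.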
Cheers
Nick
2002-12-23 Nick Clifton <address@hidden>
* strings.c (isgraphic): Replace definition with STRING_ISGRAPHIC
macro. Handle 'S' encoding, accepting 8-bit characters.
(main): Parse 'S' encoding.
(get_char): Accept 'S' encoding.
(print_strings): Use STRING_ISGRAPHIC.
(usage): Document support of 'S' encoding.
* doc/binutils.texi: Document support of 'S' encoding.
* NEWS: Mention new feature.
Index: binutils/NEWS
===================================================================
RCS file: /cvs/src/src/binutils/NEWS,v
retrieving revision 1.30
diff -c -3 -p -w -r1.30 NEWS
*** binutils/NEWS 13 Dec 2002 13:19:44 -0000 1.30
--- binutils/NEWS 23 Dec 2002 10:32:20 -0000
***************
*** 1,5 ****
--- 1,7 ----
-*- text -*-
+ * Added 'S' encoding to strings to allow the display of 8-bit characters.
+
* Added --prefix-symbols=<text>, --prefix-sections=<text> and
--prefix-alloc-sections=<text> to objcopy.
Index: binutils/strings.c
===================================================================
RCS file: /cvs/src/src/binutils/strings.c,v
retrieving revision 1.18
diff -c -3 -p -w -r1.18 strings.c
*** binutils/strings.c 30 Nov 2002 08:39:41 -0000 1.18
--- binutils/strings.c 23 Dec 2002 10:32:20 -0000
***************
*** 39,48 ****
-o Like -to. (Some other implementations have -o like -to,
others like -td. We chose one arbitrarily.)
! --encoding={s,b,l,B,L}
! -e {s,b,l,B,L}
! Select character encoding: single-byte, bigendian 16-bit,
! littleendian 16-bit, bigendian 32-bit, littleendian 32-bit
--target=BFDNAME
Specify a non-default object file format.
--- 39,49 ----
-o Like -to. (Some other implementations have -o like -to,
others like -td. We chose one arbitrarily.)
! --encoding={s,S,b,l,B,L}
! -e {s,S,b,l,B,L}
! Select character encoding: 7-bit-character, 8-bit-character,
! bigendian 16-bit, littleendian 16-bit, bigendian 32-bit,
! littleendian 32-bit.
--target=BFDNAME
Specify a non-default object file format.
***************
*** 84,90 ****
#endif
#endif
! #define isgraphic(c) (ISPRINT (c) || (c) == '\t')
#ifndef errno
extern int errno;
--- 85,94 ----
#endif
#endif
! #define STRING_ISGRAPHIC(c) \
! ( (c) >= 0 \
! && (c) <= 255 \
! && ((c) == '\t' || ISPRINT (c) || (encoding == 'S' && (c) > 127)))
#ifndef errno
extern int errno;
*************** main (argc, argv)
*** 267,272 ****
--- 268,274 ----
switch (encoding)
{
+ case 'S':
case 's':
encoding_bytes = 1;
break;
*************** get_char (stream, address, magiccount, m
*** 460,465 ****
--- 461,467 ----
switch (encoding)
{
+ case 'S':
case 's':
r = buf[0];
break;
*************** print_strings (filename, stream, address
*** 524,530 ****
c = get_char (stream, &address, &magiccount, &magic);
if (c == EOF)
return;
! if (c > 255 || c < 0 || !isgraphic (c))
/* Found a non-graphic. Try again starting with next char. */
goto tryline;
buf[i] = c;
--- 526,532 ----
c = get_char (stream, &address, &magiccount, &magic);
if (c == EOF)
return;
! if (! STRING_ISGRAPHIC (c))
/* Found a non-graphic. Try again starting with next char. */
goto tryline;
buf[i] = c;
*************** print_strings (filename, stream, address
*** 592,598 ****
c = get_char (stream, &address, &magiccount, &magic);
if (c == EOF)
break;
! if (c > 255 || c < 0 || !isgraphic (c))
break;
putchar (c);
}
--- 594,600 ----
c = get_char (stream, &address, &magiccount, &magic);
if (c == EOF)
break;
! if (! STRING_ISGRAPHIC (c))
break;
putchar (c);
}
*************** usage (stream, status)
*** 663,670 ****
-t --radix={o,x,d} Print the location of the string in base 8, 10 or 16\n\
-o An alias for --radix=o\n\
-T --target=<BFDNAME> Specify the binary file format\n\
! -e --encoding={s,b,l,B,L} Select character size and endianness:\n\
! s = 8-bit, {b,l} = 16-bit, {B,L} = 32-bit\n\
-h --help Display this information\n\
-v --version Print the program's version number\n"));
list_supported_targets (program_name, stream);
--- 664,671 ----
-t --radix={o,x,d} Print the location of the string in base 8, 10 or 16\n\
-o An alias for --radix=o\n\
-T --target=<BFDNAME> Specify the binary file format\n\
! -e --encoding={s,S,b,l,B,L} Select character size and endianness:\n\
! s = 7-bit, S = 8-bit, {b,l} = 16-bit, {B,L} = 32-bit\n\
-h --help Display this information\n\
-v --version Print the program's version number\n"));
list_supported_targets (program_name, stream);
Index: binutils/doc/binutils.texi
===================================================================
RCS file: /cvs/src/src/binutils/doc/binutils.texi,v
retrieving revision 1.25
diff -c -3 -p -w -r1.25 binutils.texi
*** binutils/doc/binutils.texi 19 Dec 2002 14:39:30 -0000 1.25
--- binutils/doc/binutils.texi 23 Dec 2002 10:32:22 -0000
*************** octal, @samp{x} for hexadecimal, or @sam
*** 1956,1966 ****
@item -e @var{encoding}
@itemx address@hidden
Select the character encoding of the strings that are to be found.
! Possible values for @var{encoding} are: @samp{s} = single-byte
! characters (ASCII, ISO 8859, etc., default), @samp{b} = 16-bit
! Bigendian, @samp{l} = 16-bit Littleendian, @samp{B} = 32-bit Bigendian,
! @samp{L} = 32-bit Littleendian. Useful for finding wide character
! strings.
@item address@hidden
@cindex object code format
--- 1956,1966 ----
@item -e @var{encoding}
@itemx address@hidden
Select the character encoding of the strings that are to be found.
! Possible values for @var{encoding} are: @samp{s} = single-7-bit-byte
! characters (ASCII, ISO 8859, etc., default), @samp{S} =
! single-8-bit-byte characters, @samp{b} = 16-bit bigendian, @samp{l} =
! 16-bit littleendian, @samp{B} = 32-bit bigendian, @samp{L} = 32-bit
! littleendian. Useful for finding wide character strings.
@item address@hidden
@cindex object code format