[bug-diffutils] New diff options/features


From: Duncan Moore
Subject: [bug-diffutils] New diff options/features
Date: Sun, 26 Sep 2010 15:56:36 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4

I have some new options/changes for diff, which I'd like to incorporate into diffutils:

1) --quote-filenames
   Filename quoting.
2) --if-different
   File contents are assumed identical if the files have the same size and timestamp.
3) --function-width=NUM
   Allow a user specified 'function width' greater than 40 characters.
4) -r
   Also shows the command performed for two files at the top level.
5) Help reformatted.
6) --ignore-equivalent-strings=ERE
   Ignore equivalent strings, that is, strings at the same position in both files, matching an extended regular expression.
7) --compare=PROPERTY_LIST
   File properties other than contents can be considered as a difference: timestamps, mode, etc.

I've attached a combined set of patches to the src files for diffutils 3.0, in case anyone wants to play with them. Two gnulib modules, filemode and idcache, are also needed. I thought gnulib-tool was supposed to pull these in, but I couldn't get it to work, so unfortunately you'll have to include these modules yourselves. (I fudged it by putting them in the src directory, modifying src/Makefile.in, and appending "#define FLEXIBLE_ARRAY_MEMBER" to idcache.h.)

I welcome any feedback on the changes. I also need some advice about the best way to proceed with incorporating them into diffutils 3.0. I believe I need to sign an FSF document to transfer copyright, which I'll do. I would also need some help with using gnulib-tool, or at least a pointer to some decent documentation on it, which so far I've been unable to find.

Here's a more detailed description of the first five changes, which are relatively straightforward. The --ignore-equivalent-strings and --compare options are described in later messages, since they are more extensive.

1) --quote-filenames
This makes filenames unambiguous by quoting them (shell style), principally for 'patch'. For example:

  % diff --quote-filenames -s -u 1 2
  Files 1/aaaa and 2/aaaa are identical
  Files '1/bb bb' and '2/bb bb' are identical
  Files '1/cc'\''cc' and '2/cc'\''cc' are identical
  diff --quote-filenames -s -u '1/dd!dd' '2/dd!dd'
  --- '1/dd!dd'   2010-09-20 10:37:50.937500000 +0100
  +++ '2/dd!dd'   2010-09-20 10:37:50.968750000 +0100
  @@ -0,0 +1 @@
  +d
  Files '1/ee;ee' and '2/ee;ee' are identical
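
As an illustration only (this is not the code from the patch), the quoting seen in the example above can be sketched in Python. The function name quote_filename and the exact set of characters treated as safe are assumptions; the patch may draw the line differently.

```python
import re

# Assumption: names made only of these characters need no quoting.
_SAFE = re.compile(r"[A-Za-z0-9_@%+=:,./-]+")

def quote_filename(name: str) -> str:
    """Return name quoted shell-style, so that a shell (or a tool that
    parses shell-style quoting) reads back exactly the same string."""
    if name and _SAFE.fullmatch(name):
        return name
    # Wrap in single quotes; an embedded single quote is written as '\''
    # (close the quote, emit an escaped quote, reopen the quote).
    return "'" + name.replace("'", "'\\''") + "'"

if __name__ == "__main__":
    for n in ["1/aaaa", "1/bb bb", "1/cc'cc", "1/dd!dd", "1/ee;ee"]:
        print(quote_filename(n))
```

Run on the five names from the example, this prints them in the same form as the diff output shown above.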

2) --if-different
File contents are compared only if they are likely to be different. Specifically, if the file sizes and timestamps are identical then the file contents are assumed to be identical too. This can significantly speed up some recursive diffs, where one directory has been copied (say with 'cp -pR') and a relatively small number of files changed. The option applies to all output styles except side by side without --suppress-common-lines (which has to output the whole files, even if they are the same).
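
The test described above amounts to something like the following Python sketch. It is not the patch's C code; the function name assumed_identical is hypothetical, and taking "timestamp" to mean the modification time is an assumption.

```python
import os

def assumed_identical(path_a: str, path_b: str) -> bool:
    """Return True when the two files have the same size and the same
    modification time, in which case their contents are not read at all."""
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    return (st_a.st_size == st_b.st_size
            and st_a.st_mtime_ns == st_b.st_mtime_ns)
```

A caller would fall through to the normal content comparison whenever this returns False. Note the trade-off: two files that differ in content but happen to share size and timestamp would be reported as identical.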

3) --function-width=NUM
The maximum number of characters to show for the 'function' line of -p (--show-c-function) and -F (--show-function-line). The default is 40. Despite the names of these options, the line need not be a function definition (particularly with -F); it can be any line that divides a file into sections. For such lines the default of 40 is often too small. For differencing source code, 40 characters is probably about right, and anything longer might become inconvenient, which is why I added an option rather than just increasing the width to, say, 100.
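
To make the effect of the width concrete, here is a small Python sketch of the truncation. It is an illustration rather than the patch itself: the name function_label is hypothetical, and trimming leading and trailing white space around the truncated text is an assumption about how the line is printed.

```python
def function_label(line: str, width: int = 40) -> str:
    """Return the part of a matched 'function' line that would be shown
    in a hunk header, limited to at most width characters."""
    text = line.split("\n", 1)[0].lstrip()  # first line only, no indent
    return text[:width].rstrip()            # cut to width, drop trailing blanks

if __name__ == "__main__":
    heading = "== Section 12: Quarterly results for the northern sales region ==\n"
    print(function_label(heading))        # cut off at the default width
    print(function_label(heading, 100))   # shown in full
```

With a section heading like the one in the demo, the default width loses the end of the heading, while a larger NUM shows it in full.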

4) -r
It's sometimes useful to have the 'diff' command echoed when two files specified on the command line differ, as happens during recursive diffs. The -r option has been extended to do this. For example:

  % diff -r -w aaa bbb
  diff -r -w aaa bbb <- the new line
  1c1
  < 111
  ---
  > 222

5) --help
The help text has been reformatted, to make it easier to read (in fixed width):

Usage: diff [OPTION]... FILES
Compare files line by line.

What to ignore or compare:
  -i,     --ignore-case            Ignore case differences in file contents
          --ignore-file-name-case  Ignore case when comparing file names
          --no-ignore-file-name-case
                                   Consider case when comparing file names
  -E,     --ignore-tab-expansion   Ignore changes due to tab expansion
  -b,     --ignore-space-change    Ignore changes in the amount of white space
  -w,     --ignore-all-space       Ignore all white space
  -z ERE, --ignore-equivalent-strings=ERE
                                   Ignore equivalent strings (strings
                                   matching ERE in both files)
  -B,     --ignore-blank-lines     Ignore changes whose lines are all blank
  -I RE,  --ignore-matching-lines=RE
                                   Ignore changes whose lines all match RE
          --strip-trailing-cr      Strip trailing carriage return on input
          --if-different           Compare file contents only if the file
                                   sizes or timestamps are different
          --compare=PROPERTY[,PROPERTY...]
                                   Properties to compare; PROPERTY may be any
                                   of `content', `time', `mode', `size',
                                   `owner', `group', `all' (for all of these),
                                   or `objects'; properties are applied
                                   sequentially, a `~' prefix switching that
                                   property off.

Output format control:
  -c, -C NUM, --context[=NUM]   Output NUM (default 3) lines of copied context
  -u, -U NUM, --unified[=NUM]   Output NUM (default 3) lines of unified context
    -p,    --show-c-function        Show which C function each change is in
    -F RE, --show-function-line=RE  Show the most recent line matching RE
           --function-width=NUM     Max. columns of function line (default 40)
           --label LABEL            Use LABEL instead of file name
  -q,         --brief          Output only whether files differ
  -e,         --ed             Output an ed script
              --normal         Output a normal diff
  -n,         --rcs            Output an RCS format diff
  -y,         --side-by-side   Output in two columns
  -W NUM,     --width=NUM      Output at most NUM (default 130) columns
              --left-column    Output only the left side of common lines
            --suppress-common-lines  Do not output common lines
  -D NAME,    --ifdef=NAME     Output merged file to show `#ifdef NAME' diffs
              --GTYPE-group-format=GFMT
                               Similar, but format GTYPE input groups with GFMT
              --line-format=LFMT
                               Similar, but format all input lines with LFMT
              --LTYPE-line-format=LFMT
                               Similar, but format LTYPE input lines with LFMT
      LTYPE is `old', `new', or `unchanged'.  GTYPE is LTYPE or `changed'.
      GFMT may contain:
        %<  lines from FILE1
        %>  lines from FILE2
        %=  lines common to FILE1 and FILE2
        %[-][WIDTH][.[PREC]]{doxX}LETTER  printf-style spec for LETTER
          LETTERs are as follows for new group, lower case for old group:
            F  first line number
            L  last line number
            N  number of lines = L-F+1
            E  F-1
            M  L+1
      LFMT may contain:
        %L  contents of line
        %l  contents of line, excluding any trailing newline
        %[-][WIDTH][.[PREC]]{doxX}n  printf-style spec for input line number
      Either GFMT or LFMT may contain:
        %%  %
        %c'C'  the single character C
        %c'\OOO'  the character with octal code OOO

Output modification:
  -l, --paginate              Pass the output through `pr' to paginate it
  -t, --expand-tabs           Expand tabs to spaces in output
  -T, --initial-tab           Make tabs line up by prepending a tab
      --tabsize=NUM           Tab stops are every NUM (default 8) print columns
      --suppress-blank-empty  Suppress space or tab before empty output lines
      --quote-filenames       Quote filenames on output
  -s, --report-identical-files
                              Report when two files are the same

Recursive/directory control:
  -r,      --recursive            Recursively compare any subdirectories found
  -N,      --new-file             Treat absent files as empty
           --unidirectional-new-file
                                  Treat absent first files as empty
  -x PAT,  --exclude=PAT          Exclude files that match PAT
  -X FILE, --exclude-from=FILE    Exclude files that match any pattern in FILE
  -S FILE, --starting-file=FILE   Start with FILE when comparing directories
           --from-file=FILE1      Compare FILE1 to all operands;
                                  FILE1 can be a directory.
           --to-file=FILE2        Compare all operands to FILE2;
                                  FILE2 can be a directory

Algorithm control:
      --horizon-lines=NUM  Keep NUM lines of the common prefix and suffix
  -d, --minimal            Try hard to find a smaller set of changes
      --speed-large-files  Assume large files and many scattered small changes

Miscellaneous:
      --binary             Read and write data in binary mode
  -a, --text               Treat all files as text
  -v, --version            Output version info
      --help               Output this help

FILES are `FILE1 FILE2' or `DIR1 DIR2' or `DIR FILE...' or `FILE... DIR'.
If --from-file or --to-file is given, there are no restrictions on FILES.
If a FILE is `-', read standard input.
Exit status is 0 if inputs are the same, 1 if different, 2 if trouble.

Report bugs to: address@hidden
GNU diffutils home page: <http://www.gnu.org/software/diffutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>


Attachment: AAA.patch
Description: Text document
