[bug-diffutils] New diff options/features
From: Duncan Moore
Subject: [bug-diffutils] New diff options/features
Date: Sun, 26 Sep 2010 15:56:36 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4
I have some new options/changes for diff, which I'd like to
incorporate into diffutils:
1) --quote-filenames
Filename quoting.
2) --if-different
File contents are assumed identical if the files have the same size
and timestamp.
3) --function-width=NUM
Allow a user-specified 'function width' greater than 40 characters.
4) -r
Also shows the command performed for two files at the top level.
5) Help reformatted.
6) --ignore-equivalent-strings=ERE
Ignore equivalent strings, that is, strings at the same position in
both files, matching an extended regular expression.
7) --compare=PROPERTY_LIST
File properties other than contents can be considered as a
difference - timestamps, mode etc.
I've attached a combined set of patches to the src files for diffutils
3.0, in case anyone wants to play with them. Two gnulib modules,
filemode and idcache, are also needed. I thought gnulib-tool was
supposed to do this, but I couldn't get it to work, so unfortunately
you'll have to include these modules yourselves. (I fudged it by putting
them in the src directory, modifying src/Makefile.in, and appending
"#define FLEXIBLE_ARRAY_MEMBER" to idcache.h.)
I welcome any feedback on the changes. I also need some advice about the
best way to proceed with incorporating them into diffutils 3.0. I
believe I need to sign an FSF document to transfer copyright, which I'll
do. I would also need some help on using gnulib-tool, or at least a
pointer to some decent documentation on it, which so far I've been
unable to find.
Here's a more detailed description of the first 5 changes, which are
relatively straightforward. The --ignore-equivalent-strings and
--compare options are described in later messages, since they are more
extensive.
1) --quote-filenames
This makes filenames unambiguous by quoting them (shell style),
principally for 'patch'. For example:
% diff --quote-filenames -s -u 1 2
Files 1/aaaa and 2/aaaa are identical
Files '1/bb bb' and '2/bb bb' are identical
Files '1/cc'\''cc' and '2/cc'\''cc' are identical
diff --quote-filenames -s -u '1/dd!dd' '2/dd!dd'
--- '1/dd!dd' 2010-09-20 10:37:50.937500000 +0100
+++ '2/dd!dd' 2010-09-20 10:37:50.968750000 +0100
@@ -0,0 +1 @@
+d
Files '1/ee;ee' and '2/ee;ee' are identical
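The quoting shown in the example above can be sketched in Python. This is only an illustration of shell-style quoting that reproduces the output shown, not the code from the patch: the function name quote_filename and the set of characters treated as safe are assumptions.

```python
import re

# Characters assumed safe to leave unquoted (an assumption for this sketch;
# the patch may use a different set).
_SAFE = re.compile(r"[A-Za-z0-9_@%+=:,./-]+")

def quote_filename(name: str) -> str:
    """Return name unchanged if it needs no quoting, else single-quote it.

    An embedded single quote is written as '\\'' : close the quoted string,
    emit an escaped quote, and reopen the quoted string.
    """
    if _SAFE.fullmatch(name):
        return name
    return "'" + name.replace("'", "'\\''") + "'"

if __name__ == "__main__":
    for n in ["1/aaaa", "1/bb bb", "1/cc'cc", "1/dd!dd", "1/ee;ee"]:
        print(quote_filename(n))
```

Python's standard shlex.quote does a similar job, but it writes an embedded single quote as '"'"' rather than '\'' so it would not reproduce the output above character for character.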
2) --if-different
File contents are compared only if they are likely to be different.
Specifically, if the file sizes and timestamps are identical then the
file contents are assumed to be identical too. This can significantly
speed up some recursive diffs, where one directory has been copied (say
with 'cp -pR'), and a relatively small number of files changed. The
option applies to all output styles, except side by side without
--suppress-common-lines (which has to output the whole files, even if
they are the same).
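The size-and-timestamp test described above can be sketched as follows. This is a minimal illustration, not the patch itself: the function name assumed_identical is hypothetical, and comparing modification times at nanosecond resolution is an assumption about what "timestamp" means here.

```python
import os

def assumed_identical(path_a: str, path_b: str) -> bool:
    """Heuristic: treat two files as identical without reading them
    when both their sizes and their modification times match."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    return a.st_size == b.st_size and a.st_mtime_ns == b.st_mtime_ns
```

A caller would read and compare the contents only when this returns False. The trade-off is that two files with the same size and timestamp but different contents are reported as identical, so the shortcut only suits trees where timestamps are trustworthy, such as a copy made with 'cp -pR'.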
3) --function-width=NUM
The maximum number of characters to show for the 'function' line of -p
(--show-c-function) and -F (--show-function-line). The default is 40.
Despite the names of these options, the line need not be a function
definition (particularly with -F), but just some arbitrary line dividing
a file into sections. For these purposes the default of 40 is often too
small. For differencing source code, 40 characters is probably about
right, and anything longer might become inconvenient, which is why I
added an option rather than just increasing the width to 100, say.
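To make the effect of the width limit concrete, here is a rough Python sketch of how a 'function' line might be chosen and truncated. It is an approximation under stated assumptions, not diff's implementation: the name function_line is hypothetical, the default pattern is a Python rendering of the expression that -p is documented as using, and diff's exact whitespace handling is not modelled.

```python
import re

def function_line(lines, hunk_start, pattern=r"^[A-Za-z$_]", width=40):
    """Return the most recent line before index hunk_start that matches
    pattern, cut to at most width characters ("" if none matches)."""
    regex = re.compile(pattern)
    for i in range(hunk_start - 1, -1, -1):
        if regex.search(lines[i]):
            return lines[i].rstrip()[:width]
    return ""

if __name__ == "__main__":
    src = [
        "static int very_long_function_name_for_demo(int first, int second)",
        "{",
        "    return 0;",
        "}",
    ]
    print(function_line(src, 3))             # cut to 40 characters
    print(function_line(src, 3, width=100))  # whole line fits
```

With the default width the long heading loses its parameter list, which is the situation --function-width=NUM is meant to address.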
4) -r
It's sometimes useful to have the 'diff' command echoed when two files
specified on the command line differ, as happens during recursive diffs.
The -r option has been extended to do this. For example:
% diff -r -w aaa bbb
diff -r -w aaa bbb <- the new line
1c1
< 111
---
> 222
5) --help
The help text has been reformatted, to make it easier to read (in fixed
width):
Usage: diff [OPTION]... FILES
Compare files line by line.
What to ignore or compare:
-i, --ignore-case Ignore case differences in file contents
--ignore-file-name-case Ignore case when comparing file names
--no-ignore-file-name-case
Consider case when comparing file names
-E, --ignore-tab-expansion Ignore changes due to tab expansion
-b, --ignore-space-change Ignore changes in the amount of
white space
-w, --ignore-all-space Ignore all white space
-z ERE, --ignore-equivalent-strings=ERE
Ignore equivalent strings (strings
matching ERE in both files)
-B, --ignore-blank-lines Ignore changes whose lines are all blank
-I RE, --ignore-matching-lines=RE
Ignore changes whose lines all match RE
--strip-trailing-cr Strip trailing carriage return on input
--if-different Compare file contents only if the file
sizes or timestamps are different
--compare=PROPERTY[,PROPERTY...]
Properties to compare; PROPERTY may be any
of `content', `time', `mode', `size',
`owner', `group', `all' (for all of these),
or `objects'; properties are applied
sequentially, a `~' prefix switching that
property off.
Output format control:
-c, -C NUM, --context[=NUM] Output NUM (default 3) lines of copied
context
-u, -U NUM, --unified[=NUM] Output NUM (default 3) lines of unified
context
-p, --show-c-function Show which C function each change is in
-F RE, --show-function-line=RE Show the most recent line matching RE
--function-width=NUM Max. columns of function line
(default 40)
--label LABEL Use LABEL instead of file name
-q, --brief Output only whether files differ
-e, --ed Output an ed script
--normal Output a normal diff
-n, --rcs Output an RCS format diff
-y, --side-by-side Output in two columns
-W NUM, --width=NUM Output at most NUM (default 130)
columns
--left-column Output only the left side of
common lines
--suppress-common-lines Do not output common lines
-D NAME, --ifdef=NAME Output merged file to show `#ifdef NAME'
diffs
--GTYPE-group-format=GFMT
Similar, but format GTYPE input groups
with GFMT
--line-format=LFMT
Similar, but format all input lines with
LFMT
--LTYPE-line-format=LFMT
Similar, but format LTYPE input lines
with LFMT
LTYPE is `old', `new', or `unchanged'. GTYPE is LTYPE or `changed'.
GFMT may contain:
%< lines from FILE1
%> lines from FILE2
%= lines common to FILE1 and FILE2
%[-][WIDTH][.[PREC]]{doxX}LETTER printf-style spec for LETTER
LETTERs are as follows for new group, lower case for old group:
F first line number
L last line number
N number of lines = L-F+1
E F-1
M L+1
LFMT may contain:
%L contents of line
%l contents of line, excluding any trailing newline
%[-][WIDTH][.[PREC]]{doxX}n printf-style spec for input line
number
Either GFMT or LFMT may contain:
%% %
%c'C' the single character C
%c'\OOO' the character with octal code OOO
Output modification:
-l, --paginate Pass the output through `pr' to paginate it
-t, --expand-tabs Expand tabs to spaces in output
-T, --initial-tab Make tabs line up by prepending a tab
--tabsize=NUM Tab stops are every NUM (default 8) print
columns
--suppress-blank-empty Suppress space or tab before empty output
lines
--quote-filenames Quote filenames on output
-s, --report-identical-files
Report when two files are the same
Recursive/directory control:
-r, --recursive Recursively compare any
subdirectories found
-N, --new-file Treat absent files as empty
--unidirectional-new-file
Treat absent first files as empty
-x PAT, --exclude=PAT Exclude files that match PAT
-X FILE, --exclude-from=FILE Exclude files that match any pattern
in FILE
-S FILE, --starting-file=FILE Start with FILE when comparing
directories
--from-file=FILE1 Compare FILE1 to all operands;
FILE1 can be a directory.
--to-file=FILE2 Compare all operands to FILE2;
FILE2 can be a directory
Algorithm control:
--horizon-lines=NUM Keep NUM lines of the common prefix and suffix
-d, --minimal Try hard to find a smaller set of changes
--speed-large-files Assume large files and many scattered small
changes
Miscellaneous:
--binary Read and write data in binary mode
-a, --text Treat all files as text
-v, --version Output version info
--help Output this help
FILES are `FILE1 FILE2' or `DIR1 DIR2' or `DIR FILE...' or `FILE... DIR'.
If --from-file or --to-file is given, there are no restrictions on FILES.
If a FILE is `-', read standard input.
Exit status is 0 if inputs are the same, 1 if different, 2 if trouble.
Report bugs to: address@hidden
GNU diffutils home page: <http://www.gnu.org/software/diffutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>
AAA.patch
Description: Text document