On Feb 18, 2008, at 8:40 PM, yeti wrote:
On Feb 19, 4:38 am, Paul Sander <address@hidden> wrote:
For this particular metric, I usually run the two versions through a
beautifier with standard settings, then diff the output of that.
On Feb 18, 2008, at 10:17 AM, Rick Genter wrote:
From: address@hidden
[mailto:address@hidden
On Behalf Of Ted Stern
But that regexp handles only C++ comments. I don't know of a way to
recognize /* ... [text containing newlines] ... */. Possibly another
diff utility has that option (xxdiff, tkdiff?).
You could write an awk or perl script to filter the multiline comments
out, save the output to a file, then diff those files. I, however,
consider comments to be as important as (or even more important than)
non-comments in source code, and don't understand the use case.
Hi guys,
Thanks for all those answers. I thought, however, that this would be a
fairly common problem and that there might be a standard solution.
Keeping your suggestions in mind, I did:
cvs diff -wlcbBC20 -r rev1 -r rev2 my_file.c \
  | perl -0777 -pe 's{/\*.*?\*/}{}gs' \
  | diffstat >> FileToHoldInfo.txt
The idea is to get enough context lines, then eliminate the comments
from the diff output, and finally use diffstat to gather stats. Do you
think this is the correct way?
I think that this method will work only if the comments are
completely enclosed within the context displayed by the diff
program. It will fail (i.e., produce incorrect output), for example,
if a short sentence is added to the end of a 50-line comment. Or to
the beginning of one. Or to the middle of a 100-line comment. It
also fails if someone arbitrarily inserts or removes newlines in the
code itself.
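A small Python sketch of that failure mode. It uses difflib.context_diff as a stand-in for "cvs diff -C20" (an assumption made so the example is self-contained), and the file contents and the make_source helper are made up for illustration. One sentence is added to the end of a 50-line comment, so the opening /* falls outside the 20 lines of context and the regex has nothing to match:

```python
import difflib
import re

# Same pattern as the Perl one-liner: non-greedy /* ... */ across newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def make_source(comment_lines):
    """Build a tiny C file: one block comment followed by one statement."""
    body = ["/*"] + [" * " + text for text in comment_lines] + [" */", "int x = 1;"]
    return [line + "\n" for line in body]


notes = ["note %d" % i for i in range(50)]
old = make_source(notes)
new = make_source(notes + ["one more sentence"])

# 20 lines of context, like -C20; the comment opener is 50 lines back.
diff_text = "".join(difflib.context_diff(old, new, "rev1", "rev2", n=20))

# Filter the diff output the way the pipeline does.
filtered = COMMENT.sub("", diff_text)

# The comment-only change is still in the filtered diff, so it gets counted.
comment_change_survives = "one more sentence" in filtered
print(comment_change_survives)
```

Raising the context size only moves the threshold; any comment longer than the context window runs into the same problem.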
This is where beautifiers such as the "indent" program come in. Such a
tool normalizes the format of the source code based on the syntax of
the programming language and policies specified on its command line.
It leaves comments in place, so additional filtering (like your Perl
one-liner above) might be necessary.
After the two versions have been reduced to standard formats, you can
apply the diff program with minimal arguments. Its output can be
used to count the number of lines inserted, deleted, and changed.
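
A minimal Python sketch of this normalize-then-diff idea, operating on whole files rather than on diff output. It rests on two loudly stated assumptions: a naive regex is good enough to strip comments (it does not understand string literals), and per-line whitespace collapsing stands in for a real beautifier such as indent, so statements reflowed across lines will still show up as changes. The names normalize and count_changes are hypothetical.

```python
import difflib
import re

# Naive matcher for /* ... */ and // ... comments. Assumption: no comment
# markers appear inside string literals, which this pattern cannot detect.
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def normalize(source):
    """Strip comments, collapse runs of whitespace, and drop blank lines."""
    without_comments = COMMENT.sub("", source)
    lines = (" ".join(line.split()) for line in without_comments.splitlines())
    return [line for line in lines if line]


def count_changes(old_source, new_source):
    """Return (inserted, deleted) line counts between normalized versions."""
    matcher = difflib.SequenceMatcher(
        None, normalize(old_source), normalize(new_source), autojunk=False
    )
    inserted = deleted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            inserted += j2 - j1
    return inserted, deleted


old = "/* header\n * comment\n */\nint main(void)\n{\n    return 0;\n}\n"
new = (
    "/* header\n * comment\n * extra sentence\n */\n"
    "int main(void)\n{\n    int x = 1; // new\n    return x;\n}\n"
)
print(count_changes(old, new))  # (2, 1)
```

Because the comments are removed before diffing, the extra sentence in the header comment contributes nothing to the counts, whatever the length of the comment.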