bug-diffutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug-diffutils] Re: Bug#577832: diffutils: [REGRESSION] newline is added


From: Jim Meyering
Subject: [bug-diffutils] Re: Bug#577832: diffutils: [REGRESSION] newline is added to a line that has no newline if the line is in context (fwd)
Date: Thu, 15 Apr 2010 23:36:38 +0200

Santiago Vila wrote:
> Received this from the Debian bug system today.
> This bug is not present in diffutils 2.8.1.
>
> ---------- Forwarded message ----------
> From: Timo Juhani Lindfors <address@hidden>
> To: Debian Bug Tracking System <address@hidden>
> Date: Thu, 15 Apr 2010 02:37:35 +0300
> Subject: Bug#577832: diffutils: [REGRESSION] newline is added to a line that 
> has no newline if the line is in context
>
> Package: diffutils
> Version: 1:2.9-2
> Severity: important
>
> Steps to reproduce:
> 1) echo -en '\nline1\nline2\nline3' > a
> 2) echo -en '\nnew line\n\nline1\nline2\nline3' > b
> 3) diff -u a b > c

Thanks a lot for the report and forward.
Here's a patch that fixes it, but I don't claim that
it is optimal -- I don't know diffutils well enough for that.
I hope Paul Eggert (Cc'd) can comment.

I will write at least two tests for this and update NEWS before
pushing anything tomorrow.


>From 18afd1c9d41e05810103edc351d3ca2fb71653d6 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Thu, 15 Apr 2010 23:26:22 +0200
Subject: [PATCH] diff: fix a regression when at least one input lacks a 
trailing newline,

and the final hunk plus context-length aligns exactly with the end
of the newline-lacking file.  Diff would fail to output the required
"\ No newline at end of file" line, thus rendering the output invalid.
This bug appears to have been introduced by 2006-05-07
commit 58d0483b, "(find_identical_ends): Fix huge performance bug...",
at least to the extent that reverting that change fixes the bug.
Considering the stated effect of that change and lack of metrics,
reverting it is not an option, so here we take a more direct approach.

Given these inputs,

    printf '\n1'>a; printf '\n0\n\n1'>b

and running diff like this:

    ./diff -U1 a b

for input file "b", the pointer, files[1].linbuf[4][-1], to
the last byte on the final line was mistakenly pointing at the
sentinel newline at EOF, rather than at the preceding byte.

  (gdb) p files[1].linbuf[4][-1]
  $3 = 10 '\n'

Thus, this test in the final print_1_line call:

  if ((!line_flag || line_flag[0]) && limit[-1] != '\n')
    fprintf (out, "\n\\ %s\n", _("No newline at end of file"));

would fail, because limit[-1] (which is files[1].linbuf[4][-1])
was mistakenly '\n', rather than the desired '1'.

My first thought was simply to adjust the final linbuf[line] setting,
at the end of io.c's find_and_hash_each_line function function:

       if (p == bufend)
-       break;
+       {
+         if (current->missing_newline)
+           --linbuf[line];
+         break;
+       }

But that would make diff misbehave with this input
(same as above, but with a newline appended to "a"),

    printf '\n1\n'>a; printf '\n0\n\n1'>b
    ./diff -U1 a b

due to the block (100 lines above) that is triggered in that case
(but not in the both-files-missing-newline case):

      if (p == bufend
          && current->missing_newline
          && ROBUST_OUTPUT_STYLE (output_style))
        {
          /* This line is incomplete.  If this is significant,
             put the line into buckets[-1].  */
          if (ignore_white_space < IGNORE_SPACE_CHANGE)
            bucket = &buckets[-1];

          /* Omit the inserted newline when computing linbuf later.  */
          p--;
          bufend = suffix_begin = p;
        }

Note how "p" is decremented and "bufend" adjusted.
When that happens, we certainly don't want to decrement
"bufend" yet again.

Since there is no other way to determine at the end whether "bufend"
was already decremented, add a new variable to serve as witness.

Reported by Timo Juhani Lindfors in http://bugs.debian.org/577832.
Forwarded by Santiago Vila.
---
 src/io.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/src/io.c b/src/io.c
index aeeb51d..11f9ee3 100644
--- a/src/io.c
+++ b/src/io.c
@@ -220,6 +220,7 @@ find_and_hash_each_line (struct file_data *current)
   bool same_length_diff_contents_compare_anyway =
     diff_length_compare_anyway | ignore_case;

+  bool missing_newline_fixup = false;
   while (p < suffix_begin)
     {
       char const *ip = p;
@@ -376,6 +377,7 @@ find_and_hash_each_line (struct file_data *current)
          && current->missing_newline
          && ROBUST_OUTPUT_STYLE (output_style))
        {
+         missing_newline_fixup = true;
          /* This line is incomplete.  If this is significant,
             put the line into buckets[-1].  */
          if (ignore_white_space < IGNORE_SPACE_CHANGE)
@@ -471,7 +473,14 @@ find_and_hash_each_line (struct file_data *current)
       linbuf[line] = p;

       if (p == bufend)
-       break;
+       {
+         /* If we've added a newline sentinel and did not adjust "bufend"
+            above, then linbuf[line] is now pointing at the sentinel, yet
+            should instead be pointing to the preceding byte.  */
+         if (!missing_newline_fixup && current->missing_newline)
+           --linbuf[line];
+         break;
+       }

       if (context <= i && no_diff_means_no_output)
        break;
--
1.7.1.rc1.269.ga27c7




reply via email to

[Prev in Thread] Current Thread [Next in Thread]