bug-coreutils

bug#9780: sort -u throws out non-duplicates


From: Jim Meyering
Subject: bug#9780: sort -u throws out non-duplicates
Date: Fri, 17 Aug 2012 21:53:06 +0200

Paul Eggert wrote:

> On 08/17/2012 12:36 PM, Jim Meyering wrote:
>> The first time the safe_text buffer is allocated
>> it will have to be disjoint from the line.text buffer
>> and from the buffer into which we're about to fread.
>> Thereafter, regardless of reallocation, overlap should
>> always be false.
>
> I haven't thought it through entirely, but I was
> worried about the case where there is a saved line
> but no saved_text, the buffer is reallocated, and

That is precisely what happens when this "(unique && ..." condition
is true for the first time (presuming you mean s/saved_text/safe_text/):

          /* With --unique, when we're about to read into a buffer that
             overlaps the saved "preceding" line (saved_line), copy the line's
             .text member to a realloc'd-as-needed temporary buffer and adjust
             the line's key-defining members if they're set.  */
          if (unique && overlap (ptr, readsize, &saved_line))
            {
              /* Copy saved_line.text into a buffer where it won't be clobbered
                 and if KEY is non-NULL, adjust saved_line.key* to match.  */
              static char *safe_text;
              static size_t safe_text_n_alloc;
              if (safe_text_n_alloc < saved_line.length)
                {
                  safe_text_n_alloc = saved_line.length;
                  safe_text = x2nrealloc (safe_text, &safe_text_n_alloc, 1);
                }
              memcpy (safe_text, saved_line.text, saved_line.length);
              if (key)
                {
                  #define s saved_line
                  s.keybeg = safe_text + (s.keybeg - s.text);
                  s.keylim = safe_text + (s.keylim - s.text);
                  #undef s
                }
              saved_line.text = safe_text;
            }

safe_text is initially NULL and we enter that block
only when we're about to fread into a buffer that overlaps
the current saved_line.text buffer.

In that case, we allocate an initial safe_text buffer,
copy saved_line.text into it, and update saved_line.text
to point to the just-allocated/initialized buffer.
Any test of overlap that compares that just-allocated
(or realloc'd) buffer with the about-to-be-fread-into
buffer will return false.
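
The sequence described above (test for overlap, copy the line out, repoint
it at the copy) can be sketched in isolation. This is only an illustrative
sketch, not the code in sort.c: struct demo_line, demo_overlap and demo_save
are hypothetical names, plain realloc stands in for gnulib's x2nrealloc,
and the key bounds are assumed to always point inside the line's text
(the original adjusts them only when KEY is non-NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, pared-down stand-in for sort's struct line.  */
struct demo_line
{
  char *text;     /* start of the line */
  size_t length;  /* number of bytes in the line */
  char *keybeg;   /* start of the key, somewhere inside text */
  char *keylim;   /* one past the end of the key */
};

/* Return true if the LEN bytes at BUF share at least one byte with LINE.
   Addresses are compared as integers so the test stays well defined
   even when the two regions belong to different objects.  */
static bool
demo_overlap (char const *buf, size_t len, struct demo_line const *line)
{
  uintptr_t b = (uintptr_t) buf;
  uintptr_t t = (uintptr_t) line->text;
  return b < t + line->length && t < b + len;
}

/* Copy LINE's text into *SAFE, growing *SAFE when needed, and repoint
   LINE (including its key bounds) at the copy.  Return false if memory
   is exhausted, leaving LINE untouched.  LINE must not already point
   at *SAFE.  */
static bool
demo_save (struct demo_line *line, char **safe, size_t *safe_alloc)
{
  assert (line->text != NULL && line->text != *safe);

  /* Always keep at least one byte allocated so *SAFE is never NULL.  */
  size_t need = line->length ? line->length : 1;
  if (*safe_alloc < need)
    {
      char *p = realloc (*safe, need);
      if (!p)
        return false;
      *safe = p;
      *safe_alloc = need;
    }

  memcpy (*safe, line->text, line->length);

  /* Preserve each key bound's offset relative to the start of the text.  */
  line->keybeg = *safe + (line->keybeg - line->text);
  line->keylim = *safe + (line->keylim - line->text);
  line->text = *safe;
  return true;
}
```

A caller would guard the copy the same way the excerpt does: call
demo_save only when demo_overlap reports that the saved line lies in the
region about to be read into. Once the line points at the separately
allocated copy, the same test against the read buffer reports no overlap.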

> then we test for overlap.  If the reallocated buffer
> does not overlap the original buffer, the test for
> overlap will fail even though the saved line needs
> to be copied into a new saved_text buffer.
>
> I'll stare at the code some more....
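
The scenario raised in the quoted text can be modeled with two separate
buffers. This is a sketch of the concern only, and it makes no claim about
whether sort.c can actually reach that state: old_buf, new_buf and both
function names are hypothetical, and both arrays stay alive here, whereas
after a real reallocation the old block would have been freed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if [A, A+ALEN) and [B, B+BLEN) share a byte.
   Addresses are compared as integers because the two ranges may belong
   to different objects.  */
static bool
demo_ranges_overlap (char const *a, size_t alen, char const *b, size_t blen)
{
  uintptr_t x = (uintptr_t) a;
  uintptr_t y = (uintptr_t) b;
  return x < y + blen && y < x + alen;
}

/* OLD_BUF models the block that holds the saved line.  NEW_BUF models
   where the input buffer ended up after being reallocated.  */
static char old_buf[16] = "saved line\n";
static char new_buf[32];

/* Return true if an overlap test made against the relocated buffer
   would notice that the saved line still needs to be copied out.  */
static bool
demo_relocated_test_notices (void)
{
  char const *saved_text = old_buf;  /* the line still lives in the old block */
  size_t saved_len = 11;             /* length of "saved line\n" */
  return demo_ranges_overlap (new_buf, sizeof new_buf,
                              saved_text, saved_len);
}
```

Because the two blocks are disjoint, the test against the relocated buffer
reports no overlap in this model, which is the outcome the quoted text
worries about.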




