bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

new module 'memxfrm'


From: Bruno Haible
Subject: new module 'memxfrm'
Date: Sat, 7 Mar 2009 13:38:35 +0100
User-agent: KMail/1.9.9

Paul Eggert has written the module 'memcoll', which generalizes the 'strcoll'
function to work on strings with embedded NULs.

Here is the generalization of 'strxfrm' to strings with embedded NUL bytes.

Jim, I think it could provide some speedup if the coreutils file
  src/sort.c
was changed to use memxfrm() and memcmp2() instead of memcoll(). The idea is
that the complicated locale dependent processing is done only once for each
input line, not each time such an input line is compared.


2009-03-07  Bruno Haible  <address@hidden>

        New module 'memxfrm'.
        * lib/memxfrm.h: New file.
        * lib/memxfrm.c: New file.
        * modules/memxfrm: New file.

================================ lib/memxfrm.h ================================
/* Locale dependent memory area transformation for comparison.
   Copyright (C) 2009 Free Software Foundation, Inc.

   This program is free software: you can redistribute it and/or modify it
   under the terms of the GNU Lesser General Public License as published
   by the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#ifndef MEMXFRM_H
#define MEMXFRM_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif


/* Generalization of strxfrm() to strings with embedded NUL bytes.  */

/* Transform the memory area [S..S+N-1] to a memory area, in such a way that
   comparing (S1,N1) and (S2,N2) with memcoll() is equivalent to comparing
   memxfrm(S1,N1) and memxfrm(S2,N2) with memcmp2().
   The byte S[N] may be temporarily overwritten by this function, but will be
   restored before this function returns.
   The result of this function depends on the LC_COLLATE category of the
   current locale.
   If successful, return the freshly allocated transformed string and set
   *LENGTHP to its length,
   Upon failure, return NULL, with errno set.  */
extern char * memxfrm (char *s, size_t n, size_t *lengthp);


#ifdef __cplusplus
}
#endif

#endif /* MEMXFRM_H */
================================ lib/memxfrm.c ================================
/* Locale dependent memory area transformation for comparison.
   Copyright (C) 2009 Free Software Foundation, Inc.
   Written by Bruno Haible <address@hidden>, 2009.

   This program is free software: you can redistribute it and/or modify it
   under the terms of the GNU Lesser General Public License as published
   by the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#include <config.h>

/* Specification.  */
#include "memxfrm.h"

#include <errno.h>
#include <stdlib.h>
#include <string.h>

char *
memxfrm (char *s, size_t n, size_t *lengthp)
{
  /* Result accumulator.  */
  char *result;
  size_t length;
  size_t allocated;

  char orig_sentinel;

  /* Initial memory allocation.  */
  allocated = (n > 0 ? n : 1);
  result = (char *) malloc (allocated);
  if (result == NULL)
    goto out_of_memory_2;
  length = 0;

  /* Add sentinel.byte.  */
  orig_sentinel = s[n];
  s[n] = '\0';

  /* Iterate through S, transforming each NUL terminated segment.
     Accumulate the resulting transformed segments in result, separated by
     NULs.  */
  {
    const char *p_end = s + n + 1;
    const char *p;

    p = s;
    for (;;)
      {
        /* Search next NUL byte.  */
        const char *q = p + strlen (p);

        for (;;)
          {
            size_t k;

            errno = 0;
            k = strxfrm (result + length, p, allocated - length);
            if (errno != 0)
              goto fail;
            if (k >= allocated - length)
              {
                /* Grow the result buffer.  */
                char *new_result;

                allocated = 2 * allocated;
                new_result = (char *) realloc (result, allocated);
                if (new_result == NULL)
                  goto out_of_memory_1;
                result = new_result;
              }
            else
              {
                length += k;
                break;
              }
          }

        p = q + 1;
        if (p == p_end)
          break;
        result[length] = '\0';
        length++;
      }
  }

  /* Shrink the allocated memory if possible.  */
  if ((length > 0 ? length : 1) < allocated)
    {
      char *memory = (char *) realloc (result, length > 0 ? length : 1);
      if (memory != NULL)
        result = memory;
    }

  s[n] = orig_sentinel;
  *lengthp = length;
  return result;

 fail:
  {
    int saved_errno = errno;
    free (result);
    s[n] = orig_sentinel;
    errno = saved_errno;
    return NULL;
  }

 out_of_memory_1:
  free (result);
  s[n] = orig_sentinel;
 out_of_memory_2:
  errno = ENOMEM;
  return NULL;
}
=============================== modules/memxfrm ===============================
Description:
Locale dependent memory area transformation for comparison.

Files:
lib/memxfrm.h
lib/memxfrm.c

Depends-on:

configure.ac:

Makefile.am:
lib_SOURCES += memxfrm.c

Include:
"memxfrm.h"

License:
LGPL

Maintainer:
Bruno Haible




reply via email to

[Prev in Thread] Current Thread [Next in Thread]