bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] Adding a real HashTable implementation to gnulib


From: Bruno Haible
Subject: Re: [RFC] Adding a real HashTable implementation to gnulib
Date: Tue, 04 Dec 2018 03:32:28 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-138-generic; KDE/5.18.0; x86_64; ; )

Hi Tim,

> We have 'hashmap' [1] in GNU Wget2.

Things I like about the implementation:

  - Collision resolution through linked list (a robust algorithm).

  - Reasonably generic API.

  - The user-definable destructor functions.

Things that could be improved:

  - The ability to set a growth policy > 0. This leads to quadratic
    run time behaviour. I mean, you make such an effort to have O(1)
    access on average, and then the fixed-size increment of the size
    turns the insertion operation into an average-O(n) operation.

  - Some functions take keysize and valuesize arguments, some don't.
    I'm confused.

  - NULL values are special. The documentation says "If there are NULL
    values in the stringmap, you should use wget_stringmap_get_null() to
    distinguish between 'not found' and 'found NULL value'." I would
    prefer an API that does not require me to think about whether I have
    null values in the map or not.

  - To iterate through the table, you need to define an instance of the
    wget_hashmap_browse_t function. I love functional programming, but
    C is not the right language for that. If the inner part (the body
    of the 'browse' function) needs to access outer variables, the
    programmer needs to manually create a "context" struct and fill it -
    things that the compiler would be doing in other programming languages.
    Some people say that the solution to this problem are the nested
    functions supported by GCC, but
      1. This is not portable; only GCC supports this.
      2. On many CPUs (including x86_64), the use of nested functions
         requires to disable on the entire process a security feature
         (namely, stacks without execute permission).
    I therefore prefer the concept of "iterator" objects that allow
    the programmer to step through the hash table without contorting
    their code and without compromising on security.

Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]