[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 4/7] libihash: use fast binary scaling to determine the load
From: |
Samuel Thibault |
Subject: |
Re: [PATCH 4/7] libihash: use fast binary scaling to determine the load |
Date: |
Tue, 13 May 2014 23:13:11 +0200 |
User-agent: |
Mutt/1.5.21+34 (58baf7c9f32f) (2010-12-30) |
Justus Winter, le Tue 13 May 2014 21:02:53 +0200, a écrit :
> Expressing the maximum load in binary percent (where 128b% corresponds
> to 100%) allows us to use fast binary scaling to determine if the
> maximum load has been reached without losing precision.
>
> Furthermore, the previously used expression 'ht->nr_items * 100'
> overflows int at 2^25 (unsigned int at 2^26). When a hash table
> reached that limit, it would fail to resize and fill up the table.
> This change fixes that.
Ack.
> * libihash/ihash.c (hurd_ihash_set_max_load): Update the comment.
> (hurd_ihash_add): Use fast binary scaling to determine the current
> load.
> * libihash/ihash.h (struct hurd_ihash): Update comment for max_load.
> (hurd_ihash_set_max_load): Update the comment.
> ---
> libihash/ihash.c | 37 +++++++++++++++++++++++++++----------
> libihash/ihash.h | 24 +++++++++++++-----------
> 2 files changed, 40 insertions(+), 21 deletions(-)
>
> diff --git a/libihash/ihash.c b/libihash/ihash.c
> index d628d75..f529a17 100644
> --- a/libihash/ihash.c
> +++ b/libihash/ihash.c
> @@ -180,14 +180,15 @@ hurd_ihash_set_cleanup (hurd_ihash_t ht,
> hurd_ihash_cleanup_t cleanup,
> }
>
>
> -/* Set the maximum load factor in percent to MAX_LOAD, which should be
> - between 1 and 100. The default is HURD_IHASH_MAX_LOAD_DEFAULT.
> - New elements are only added to the hash table while the number of
> - hashed elements is that much percent of the total size of the hash
> - table. If more elements are added, the hash table is first
> - expanded and reorganized. A MAX_LOAD of 100 will always fill the
> - whole table before enlarging it, but note that this will increase
> - the cost of operations significantly when the table is almost full.
> +/* Set the maximum load factor in binary percent to MAX_LOAD, which
> + should be between 64 and 128. The default is
> + HURD_IHASH_MAX_LOAD_DEFAULT. New elements are only added to the
> + hash table while the number of hashed elements is that much binary
> + percent of the total size of the hash table. If more elements are
> + added, the hash table is first expanded and reorganized. A
> + MAX_LOAD of 128 will always fill the whole table before enlarging
> + it, but note that this will increase the cost of operations
> + significantly when the table is almost full.
>
> If the value is set to a smaller value than the current load
> factor, the next reorganization will happen when a new item is
> @@ -272,8 +273,24 @@ hurd_ihash_add (hurd_ihash_t ht, hurd_ihash_key_t key,
> hurd_ihash_value_t item)
>
> if (ht->size)
> {
> - /* Only fill the hash table up to its maximum load factor. */
> - if ((ht->nr_items * 100) >> __builtin_ctz (ht->size) <= ht->max_load)
> + /* Only fill the hash table up to its maximum load factor given
> + as "binary percent", where 128b% corresponds to 100%. As the
> + size is always a power of two, and 128 is also, the quotient
> + of both is also a power of two. Therefore, we can use bit
> + shifts to scale the number of items.
> +
> + load = nr_items * 128 / size
> + = nr_items * 2^{log2 (128) - log2 (size)}
> + = nr_items >> (log2 (size) - log2 (128))
> + -- if size >= 128
> + = nr_items << (log2 (128) - log2 (size))
> + -- otherwise
> + */
> + int d = __builtin_ctz (ht->size) - 7;
> + unsigned int load = d >= 0
> + ? ht->nr_items >> d
> + : ht->nr_items << -d;
> + if (load <= ht->max_load)
> if (add_one (ht, key, item))
> return 0;
> }
> diff --git a/libihash/ihash.h b/libihash/ihash.h
> index 8829e51..809166f 100644
> --- a/libihash/ihash.h
> +++ b/libihash/ihash.h
> @@ -79,7 +79,7 @@ struct hurd_ihash
> /* The offset of the location pointer from the hash value. */
> intptr_t locp_offset;
>
> - /* The maximum load factor in percent. */
> + /* The maximum load factor in binary percent. */
> unsigned int max_load;
>
> /* When freeing or overwriting an element, this function is called
> @@ -97,8 +97,9 @@ typedef struct hurd_ihash *hurd_ihash_t;
> be a power of two. */
> #define HURD_IHASH_MIN_SIZE 32
>
> -/* The default value for the maximum load factor in percent. */
> -#define HURD_IHASH_MAX_LOAD_DEFAULT 75
> +/* The default value for the maximum load factor in binary percent.
> + 96b% is equivalent to 75%, 128b% to 100%. */
> +#define HURD_IHASH_MAX_LOAD_DEFAULT 96
>
> /* The LOCP_OFFS to use if no location pointer is available. */
> #define HURD_IHASH_NO_LOCP INTPTR_MIN
> @@ -143,14 +144,15 @@ void hurd_ihash_free (hurd_ihash_t ht);
> void hurd_ihash_set_cleanup (hurd_ihash_t ht, hurd_ihash_cleanup_t cleanup,
> void *cleanup_data);
>
> -/* Set the maximum load factor in percent to MAX_LOAD, which should be
> - between 50 and 100. The default is HURD_IHASH_MAX_LOAD_DEFAULT.
> - New elements are only added to the hash table while the number of
> - hashed elements is that much percent of the total size of the hash
> - table. If more elements are added, the hash table is first
> - expanded and reorganized. A MAX_LOAD of 100 will always fill the
> - whole table before enlarging it, but note that this will increase
> - the cost of operations significantly when the table is almost full.
> +/* Set the maximum load factor in binary percent to MAX_LOAD, which
> + should be between 64 and 128. The default is
> + HURD_IHASH_MAX_LOAD_DEFAULT. New elements are only added to the
> + hash table while the number of hashed elements is that much binary
> + percent of the total size of the hash table. If more elements are
> + added, the hash table is first expanded and reorganized. A
> + MAX_LOAD of 128 will always fill the whole table before enlarging
> + it, but note that this will increase the cost of operations
> + significantly when the table is almost full.
>
> If the value is set to a smaller value than the current load
> factor, the next reorganization will happen when a new item is
> --
> 2.0.0.rc0
>
--
Samuel
#include <culture.h>
- [PATCH 1/7] libihash: fix type of max_load, Justus Winter, 2014/05/13
- [PATCH 3/7] libihash: use linear probing and fast modulo operation, Justus Winter, 2014/05/13
- [PATCH 2/7] libihash: use an integer hash function on the keys, Justus Winter, 2014/05/13
- [PATCH 4/7] libihash: use fast binary scaling to determine the load, Justus Winter, 2014/05/13
- [PATCH 5/7] include: add lock-less reference counting primitives, Justus Winter, 2014/05/13
- [PATCH 7/7] libtrivfs: lock-less reference counting for trivfs_peropen objects, Justus Winter, 2014/05/13
- [PATCH 6/7] libdiskfs: lock-less reference counting for peropen objects, Justus Winter, 2014/05/13