[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: extract hash for stand-alone use
From: |
Peng Yu |
Subject: |
Re: extract hash for stand-alone use |
Date: |
Thu, 20 May 2021 10:37:00 -0500 |
On 5/20/21, david kerns <david.t.kerns@gmail.com> wrote:
> this is a bit off topic for this mailing list... but here:
>
> $ cat ah.c
> #include <stdio.h>
> #include <string.h>
>
> static unsigned long
> awk_hash(const char *s, size_t len, unsigned long hsize, size_t *code)
> {
> unsigned long h = 0;
> unsigned long htmp;
>
> /*
> * Ozan Yigit's original sdbm hash, copied from Margo Seltzers
> * db package.
> *
> * This is INCREDIBLY ugly, but fast. We break the string up into
> * 8 byte units. On the first time through the loop we get the
> * "leftover bytes" (strlen % 8). On every other iteration, we
> * perform 8 HASHC's so we handle all 8 bytes. Essentially, this
> * saves us 7 cmp & branch instructions. If this routine is
> * heavily used enough, it's worth the ugly coding.
> */
>
> /*
> * Even more speed:
> * #define HASHC h = *s++ + 65599 * h
> * Because 65599 = pow(2, 6) + pow(2, 16) - 1 we multiply by shifts
> *
> * 4/2011: Force the results to 32 bits, to get the same
> * result on both 32- and 64-bit systems. This may be a
> * bad idea.
> */
> #define HASHC htmp = (h << 6); \
> h = *s++ + htmp + (htmp << 10) - h ; \
> htmp &= 0xFFFFFFFF; \
> h &= 0xFFFFFFFF
>
> h = 0;
>
> /* "Duff's Device" */
> if (len > 0) {
> size_t loop = (len + 8 - 1) >> 3;
>
> switch (len & (8 - 1)) {
> case 0:
> do { /* All fall throughs */
> HASHC;
> case 7: HASHC;
> case 6: HASHC;
> case 5: HASHC;
> case 4: HASHC;
> case 3: HASHC;
> case 2: HASHC;
> case 1: HASHC;
> } while (--loop);
> }
> }
>
> if (code != NULL)
> *code = h;
>
> if (h >= hsize)
> h %= hsize;
> return h;
> }
>
> int main(int argc, char **argv)
> {
> printf("%08x\n", awk_hash(argv[1], strlen(argv[1]), 0xffffffff, NULL));
> return 0;
> }
> $ make ah
> cc ah.c -o ah
> $ ./ah "this is a test"
> 62a9aba5
>
> On Thu, May 20, 2021 at 6:37 AM Peng Yu <pengyu.ut@gmail.com> wrote:
>
>> > assuming you mean the hash for associative arrays...
>>
>> Yes.
I misunderstand what you meant. I meant the hash table implementation
for associative arrays instead of the hash function for associative
arrays.
Do you have a standalone example for the hash table (with insert,
delete, get, length operations)? Sorry for the confusion.
>> > look in str_array.c for awk_hash
>>
>> I see these functions defined in str_array.c. What functions are
>> relevant? (I assume env_* are not relevant.)
>>
>> Is there any minimal working example on how to use the hash related
>> functions?
>>
>> $ grep str_array tags | awk -v FS='\t' -e '$4 == "f"'
>> awk_hash str_array.c /^awk_hash(const char *s, size_t len,
>> unsigned
>> long hsize, size_t *code)$/;" f typeref:typename:unsigned long
>> file:
>> env_clear str_array.c /^env_clear(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>> env_remove str_array.c /^env_remove(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>> env_store str_array.c /^env_store(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>> fnv1a_hash_string str_array.c /^fnv1a_hash_string(const char
>> *str,
>> size_t len, unsigned long hsize, size_t
>> *code)$/;" f typeref:typename:unsigned long file:
>> grow_table str_array.c /^grow_table(NODE
>> *symbol)$/;" f typeref:typename:void file:
>> gst_hash_string str_array.c /^gst_hash_string(const char *str, size_t
>> len, unsigned long hsize, size_t
>> *code)$/;" f typeref:typename:unsigned long file:
>> init_env_array str_array.c /^init_env_array(NODE
>> *env_node)$/;" f typeref:typename:void
>> scramble str_array.c /^scramble(unsigned long
>> x)$/;" f typeref:typename:unsigned long file:
>> str_array_init str_array.c /^str_array_init(NODE *symbol
>> ATTRIBUTE_UNUSED, NODE *subs
>> ATTRIBUTE_UNUSED)$/;" f typeref:typename:NODE ** file:
>> str_clear str_array.c /^str_clear(NODE *symbol, NODE *subs
>> ATTRIBUTE_UNUSED)$/;" f typeref:typename:NODE ** file:
>> str_copy str_array.c /^str_copy(NODE *symbol, NODE
>> *newsymb)$/;" f typeref:typename:NODE ** file:
>> str_dump str_array.c /^str_dump(NODE *symbol, NODE
>> *ndump)$/;" f typeref:typename:NODE ** file:
>> str_exists str_array.c /^str_exists(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>> str_find str_array.c /^str_find(NODE *symbol, NODE *s1, size_t
>> code1,
>> unsigned long hash1)$/;" f typeref:typename:NODE **
>> file:
>> str_kilobytes str_array.c /^str_kilobytes(NODE
>> *symbol)$/;" f typeref:typename:AWKNUM
>> str_list str_array.c /^str_list(NODE *symbol, NODE
>> *t)$/;" f typeref:typename:NODE ** file:
>> str_lookup str_array.c /^str_lookup(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>> str_remove str_array.c /^str_remove(NODE *symbol, NODE
>> *subs)$/;" f typeref:typename:NODE ** file:
>>
>> --
>> Regards,
>> Peng
>>
>
--
Regards,
Peng