[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Nano-devel] WIP: Use caching for regex

From: Devin Hussey
Subject: Re: [Nano-devel] WIP: Use caching for regex
Date: Fri, 2 Nov 2018 23:39:30 -0400

Here is my patch, I am not going to duplicate the details.

Anyways, gperf generates perfect hashes from the strings you give it.
Its syntax is similar to yacc/yylex.

For a real example, instead of this O(n^2) strcmp tree:
    if (!strcasecmp(input, "cancel"))
        s->func = do_cancel;
    else if (!strcasecmp(input, "exit"))
        s->func = do_exit;
    else {
        s->func = do_toggle_void;
        if (!strcasecmp(input, "nohelp"))
            s->toggle = NO_HELP;
        else if (!strcasecmp(input, "smoothscroll"))
            s->toggle = SMOOTH_SCROLL;

we make this gperf file
    #include "proto.h"
    #include <string.h>
    enum sc_type { SC_TYPE_SHORTCUT, SC_TYPE_TOGGLE };
    struct scfunc_data {
        const char *data;
        enum sc_type type;
        union {
            void (*func)(void);
            int32_t toggle;
    # take advantage of how gperf puts things in verbatim
    cancel, SC_TYPE_SHORTCUT, .func = do_func_cancel
    exit, SC_TYPE_SHORTCUT, .func = do_func_exit
    nohelp, SC_TYPE_TOGGLE, .toggle = NO_HELP
    smoothscroll, SC_TYPE_TOGGLE, .toggle = SMOOTH_SCROLL
    # make sure there are no extra empty lines at the end, and don't
indent it like I did

Run gperf on it and it spits out some magic C code which I am not
going to post because it is long, but it makes the function

    const struct scfunc_data *in_word_set(const char *str, unsigned int len);

which converts the keys and values to a struct, quoting before the
first comma and putting things verbatim after it like so:
    { "cancel", SC_TYPE_SHORTCUT, .func = do_func_cancel }, {
"nohelp", SC_TYPE_TOGGLE, .toggle = NO_HELP }...

Now, in your code, do this:

sc *strtosc(const char *input)
    const struct scfunc_data *data = in_word_set(input, strlen(input));
    sc *s;
    if (data == NULL)
        return NULL;
    s = (sc *)nmalloc(sizeof(sc));
    if (sc->type == SC_TYPE_SHORTCUT) {
        s->func = data->func;
        s->toggle = 0;
    } else {
        s->func = do_toggle_void;
        s->toggle = data->toggle;
    return s;

And your unwieldy 262 line function just became 16.

gperf has very low complexity, because it knows what strings are going
to be hashed.
It does this:
1. Take the first and last few characters, as well as the length and
get a value from a look up table,
2. Check the length to another lookup table.
3. At this point, it has a unique value (it calculated it out
beforehand) and can just validate it with a single strcmp.
4. Return the entry in the table of structs, or null.

The only issue here is that there is no easy way to conditionally
enable the items AFAIK. cpp puts a blank line on replaced blocks, and
that doesn't play nice.

A side note: I think this could clean up convert_sequence as well:

O1;2A, KEY_UP, .shift = TRUE
O1;2B, KEY_DOWN, .shift = TRUE
O1;2C, KEY_RIGHT, .shift = TRUE
O1;2D, KEY_LEFT .shift = TRUE
O1;2P, KEY_F(13)

But, alas, that is for another patch.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]