Re: [Tinycc-devel] struct bug: identical named struct members

From: Michael Matz
Subject: Re: [Tinycc-devel] struct bug: identical named struct members
Date: Sun, 29 Nov 2020 00:11:17 +0100 (CET)
User-agent: Alpine 2.21 (LSU 202 2017-01-01)


On Sat, 28 Nov 2020, Tyge Løvset wrote:

> Yes I started looking in struct_decl. Is there a reason why it doesn't use
> an efficient unordered map for lookup, other than the extra code weight?

Mostly because of the T in tiny c compiler and because no profiling shows field lookup to be a problem :)

> If that (and the cstr string type) is stripped down to its bare minimum, it
> would be perfect for general symbol tables.

It's not a general symbol table in this case, but something fairly specific: the set must be ordered (for struct layout), at least at some point; it is looked up by small integers (aka identifier names); and it tends to be small.
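
To make those three properties concrete, here is a minimal sketch of such a set: a singly linked list kept in declaration order and searched linearly by an integer token. The names `Field` and `field_find` are hypothetical and chosen only for illustration; this is not TCC's actual `Sym` structure or its `find_field`.

```c
#include <stddef.h>

/* Hypothetical stand-in for one struct member; NOT TCC's real Sym. */
typedef struct Field {
    int tok;             /* small integer standing for the member name */
    int offset;          /* byte offset, assigned in declaration order */
    struct Field *next;  /* next member in declaration order */
} Field;

/* Linear lookup by identifier token.
   Returns the matching member, or NULL if there is none. */
static Field *field_find(Field *head, int tok)
{
    for (Field *f = head; f != NULL; f = f->next)
        if (f->tok == tok)
            return f;
    return NULL;
}
```

With a dozen members or fewer, each lookup is a handful of integer comparisons, and the list itself already records the layout order.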

> The only other fast C map I know
> of is khash (https://attractivechaos.github.io/klib), however not memory
> efficient, and the codebase is somewhat bigger.

I guess there are as many map implementations as there are C developers :-)

> But, looking at tccgen.c, it may be too ambitious to integrate?

I personally would not integrate a full, generally capable hashmap without measurements on realistic sources (i.e. not sources that artificially use structs with 1000 members and 10,000 accesses to the last member :) ). It's simply the case that the number of struct members in C sources tends to be a dozen at most, on average, for which a linked list is fairly okay. (This implies that the quadratic checking of duplicates at struct decl time might be completely acceptable; eventually it will be overshadowed by normal member lookups.)
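
A rough sketch of what that quadratic duplicate check amounts to, assuming members sit in a simple linked list: every insertion first walks the existing members, so n members cost on the order of n*n/2 comparisons in total. `Member` and `member_add_unique` are hypothetical names for illustration only; in TCC the check would more likely go through the existing field lookup than through a helper like this.

```c
#include <stddef.h>

/* Hypothetical member record; NOT TCC's real Sym. */
typedef struct Member {
    int tok;              /* small integer standing for the member name */
    struct Member *next;  /* next member in declaration order */
} Member;

/* Append m to the end of the list unless a member with the same
   token already exists. Returns 1 if added, 0 on a duplicate.
   One call is linear in the list length, so declaring n members
   this way is quadratic overall. */
static int member_add_unique(Member **head, Member *m)
{
    Member **pp = head;
    while (*pp != NULL) {
        if ((*pp)->tok == m->tok)
            return 0;          /* duplicate member name */
        pp = &(*pp)->next;
    }
    m->next = NULL;
    *pp = m;                   /* keeps declaration order */
    return 1;
}
```

For a struct with a dozen members that is well under a hundred integer comparisons for the whole declaration.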

But do try; if the implementation turns out not to add memory overhead or many source lines, and is generally in the spirit of TCC, why not :) (What's the spirit? I don't know, you'll eventually get a feeling for it.)

(You will probably see in the course of such an experiment that various things aren't that straightforward to add. E.g. currently all parser structures are Syms, and they are generically freed per scope no matter if they are types, symbols, cleanups, or anything else. You would have to free the hash tables somewhere, and they would exist only for struct types, which would mean at least different handling for these and the other Syms; i.e. you'll probably see that it reduces elegance somewhat.)

> ps: I haven't really looked much at the core code yet;

Keep reading then, it's a quirky, dense, capable and satisfying source base :)


> I do have some
> compiler tech experience way back from creating an external syntax checker
> for www.autoitscript.com, using flex and yacc.


> On Sat, 28 Nov 2020 at 00:32, Michael Matz <matz.tcc@frakked.de> wrote:
>
>> On Fri, 27 Nov 2020, Tyge Løvset wrote:
>>
>>> Is this a known bug, or regression?
>>
>> Known bug.
>>
>>> I tried to follow the code in parse_btype() in tccgen.c for the missing
>>> struct member symbol lookup check, but didn't succeed so far:
>>>         } else {
>>>             c = 0;
>>>             flexible = 0;
>>>             while (tok != '}') {
>>>                 if (!parse_btype(&btype, &ad1)) {
>>>                     skip(';');
>>>                     continue;
>>>                 }
>>
>> Member lookup is linear, so checking for duplicates is quadratic, so TCC
>> doesn't bother to do it.  The check would belong to struct_decl,
>> parse_btype, probably involving find_field before adding it.
