Re: [Tinycc-devel] make tcc reentrant

From: ag
Subject: Re: [Tinycc-devel] make tcc reentrant
Date: Wed, 11 Dec 2019 03:53:54 +0200
User-agent: Mutt/1.12.1 (2019-06-15)

On Mon, Dec 09, at 03:33 Michael Matz wrote:
> Hi,
> On Sat, 7 Dec 2019, Christian Jullien wrote:
> > > 2) slower code: most of the time the indirection through a pointer 
> > >    variable (the state) in comparison to a direct access to a static 
> > >    variable doesn't matter. 
> > 
> > In fact, I experienced the opposite. When moving all global variables 
> > to a struct, my Lisp was around 1% faster because globals are now close 
> > together and more often accessible from L1 cache.
> They are equally close in the .data section/segment, as long as you put 
> _all_ global data into that struct (which is what was proposed).  I do 
> believe you that it was faster, merely pointing out that it probably had 
> other reasons.
> Either way, I measured with TCC itself, not some other program with 
> completely different behaviour.  Specifically the accesses to the token 
> hash table from the pre-processor (tokenization is the slowest part in 
> TCC, as it should be) currently need only one register, because the 
> address of that table is an immediate.  Just changing this to be a pointer 
> (cached in a register), with otherwise similar code made the inner loop of 
> tokenization measurably slower (though I don't remember the percentage 
> anymore, but I was thinking "meeh").  The effect is that on x86-64 the 
> computation either needs a slow addressing mode, or multiple instructions 
> (which are then dependent), which alone caused this slowdown.
> Of course, we could decide that a (say) 2% slowdown in compilation speed 
> is acceptable for the feature of multiple compiler states that don't have 
> to interact.
> Or we could think long and hard about what we really want/need from our 
> APIs and try to get both.  E.g. it's not clear that the token hash table 
> really needs to become a non-singleton, even _if_ we'd allow multiple 
> TCCState objects.  It's just that the update to the token hash must not be 
> done concurrently.
> So, we could for instance allow and support multiple TCCStates but 
> _without_ multi-threading.  Or (like grischka said) allow multi-threading, 
> but only on the API level (and serialize internally).
> So, I think the latter is the solution with the most bang for the buck: 
> allow multiple TCCStates, but don't necessarily move all global data into 
> the state, under the assumption that at its core TCC remains 
> single-threaded.
> We'd also still have to decide what multiple TCCStates really mean: e.g. 
> I'd say this is only for compiling to memory.  That would mean that it's 
> not really necessary for the code from different states to be generated 
> into different sections.  Though finalization needs to be per state, so 
> maybe it's unavoidable, but that needs to be tried.  There might be other 
> global data (e.g. the memory allocator) that also could remain shared 
> between different states.  (Again, all with the assumption that 
> multi-threading is serialized high in the API level).
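
A minimal sketch of the arrangement Michael describes above, assuming a
token table that stays a shared singleton while several states exist, with
callers serialized at the API level. All names here (State, tok_hash_count,
api_compile) are hypothetical stand-ins for illustration and are not
libtcc's API; the "tokenizer" only splits on spaces.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define TOK_HASH_SIZE 64

/* Shared singleton (hypothetical stand-in for the token hash table).
   Its address is a link-time constant, so the inner loop does not need
   to load it through a state pointer. */
static int tok_hash_count[TOK_HASH_SIZE];

/* One lock for the whole API: the core below stays single-threaded. */
static pthread_mutex_t api_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical per-state data: only what must differ between states. */
typedef struct State {
    int tokens_seen;
} State;

static unsigned hash_str(const char *s, size_t len)
{
    unsigned h = 1;
    for (size_t i = 0; i < len; i++)
        h = h * 31 + (unsigned char)s[i];
    return h % TOK_HASH_SIZE;
}

/* Internal core: updates the shared table directly, assumes no
   concurrent callers. */
static void tokenize(State *s, const char *src)
{
    while (*src) {
        size_t len = strcspn(src, " ");   /* length of next token */
        if (len) {
            tok_hash_count[hash_str(src, len)]++;
            s->tokens_seen++;
        }
        src += len;
        while (*src == ' ')
            src++;
    }
}

/* Public entry point: serializes callers, then runs the core.
   Returns the number of tokens this state has seen so far. */
int api_compile(State *s, const char *src)
{
    assert(s != NULL && src != NULL);
    pthread_mutex_lock(&api_lock);
    tokenize(s, src);
    int n = s->tokens_seen;
    pthread_mutex_unlock(&api_lock);
    return n;
}
```

Two threads may each own a State and call api_compile(); the lock makes the
updates to the shared table happen one at a time, which is the only
guarantee this sketch tries to give.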

And in the end it is always about mechanics and mechanism[s] (funny).

First is tcc's underlying machine. It should be fast. No compromises. Direct
access to the underlying physical machine. This is the property of C. We take
some risks because we want results. Optimum results. Otherwise, go to the
abstraction. C has almost no abstraction (it is probably more like tiny bits of
convenience).

With multi-threading, though, the code of tcc's underlying machine has to
account for the side effects of a shared execution space, which adds at least a
tiny bit of complexity (even with the best of efforts). So that is not quite
primitive, and it seems like a contradiction with the "direct access to the
machine" thing. It is even a bit odd for tcc, which is a C compiler, to use
"second thoughts" in its own C code.

But as (almost) always, staring at the code reveals the answer. I never done 
i saw some and it always looks (oh well) complicated! (for the C way of
thinking), as it is perhaps the only code where scanning stops and the mind has
to start thinking in an unusual way.

States, though, can encapsulate all they need to make them work without 
And since it is their data and their methods, and if at some point they 
that data, perhaps they can feel more certain and could apply some optimizations
(e.g., dropping unneeded conditional branches) and even gain some speed in the end.

But states should not be left without a supervisor, or (anyway) it is wise to
have one. As Michael said, there is a need to think hard here to avoid
complexity without losing speed. Compilation speed is what matters most to tcc
as a development tool. Tcc compiles more than 15000 lines of library code in a
fraction of a second, where gcc takes 19 and clang 24. When you only want to inspect some 
this can save a whole lot of time in the development process, as sometimes this
gets really intensive and it might happen quite a few times in a row. In my
(almost 2 years now) experience with tcc, tcc produces identical results to gcc
with -O2 -Wall -Werror, while gcc and clang sometimes disagree.

What this supervisor, or the root state of those states, will do is open to
discussion. It can perhaps impose some strictness on the visibility of the state
objects, or it can pass messages between them, or perhaps it could be used as a wise try/catch 
or it can handle communication outside of its parent group pid, or it can even
be used in the same way with a higher level language (it can be set as, for
instance, an intermediate function). But here we are starting to talk as if tcc
were a language, but why not? If we could use a very strict subset of C (a C--),
we could take shortcuts and be faster. Perhaps we could have a way to instruct tcc, that only for 
will make sense, even for a second pass! For instance to optimize a block or a 
where performance matters. Right now no code gets special treatment. Perhaps it
would be wise to have.

The above makes sense with regard to flexibility, and mainly about the user 
to do whatever she likes to do without having to pay a high price (like
multiple tcc instances: but this seems like quite a complex task, and besides that is 
(and we all know that economy matters here!)).

In any case the objects should be (naturally) opaque pointers.

How can this be implemented? My favorite way is to pass my objects or states as
the first argument of the function, and to have a root linked list (usually
double) with a head and a tail and a current pointer, to control them; plus an
index and the number of the states. But in this case, as was shown, that pointer 
adds (even) a small overhead, so this might be out of the discussion, but 
is quite technical and it takes a long time (they say you need at least 20
years of experience in C) to know the prudent way (though I saw/see people with
more than 40 years of experience (both) disagree sometimes about the technical
details).
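
A minimal sketch of that layout, assuming a doubly linked list of states
owned by a root that tracks head, tail, current, the current index and the
count, with the state passed as first argument. Every name below (root_t,
state_t, root_add_state, ...) is hypothetical. To keep the objects opaque,
only the two typedefs and the function prototypes would go into a public
header; the struct bodies would stay in the implementation file.

```c
#include <assert.h>
#include <stdlib.h>

/* Public side: opaque handles only (hypothetical names). */
typedef struct state state_t;
typedef struct root root_t;

/* Private side: the actual layouts. */
struct state {
    state_t *prev, *next;
    int id;                 /* stand-in for real per-state data */
};

struct root {
    state_t *head, *tail, *current;
    int cur_idx;            /* index of current, -1 while empty */
    int num_states;
};

root_t *root_new(void)
{
    root_t *r = calloc(1, sizeof *r);
    if (r)
        r->cur_idx = -1;
    return r;
}

/* Append a new state at the tail and make it the current one. */
state_t *root_add_state(root_t *r, int id)
{
    assert(r != NULL);
    state_t *s = calloc(1, sizeof *s);
    if (!s)
        return NULL;
    s->id = id;
    s->prev = r->tail;
    if (r->tail)
        r->tail->next = s;
    else
        r->head = s;
    r->tail = s;
    r->current = s;
    r->cur_idx = r->num_states++;
    return s;
}

/* Make the state at position idx current; NULL if idx is out of range. */
state_t *root_set_current(root_t *r, int idx)
{
    assert(r != NULL);
    if (idx < 0 || idx >= r->num_states)
        return NULL;
    state_t *s = r->head;
    for (int i = 0; i < idx; i++)
        s = s->next;
    r->current = s;
    r->cur_idx = idx;
    return s;
}

/* State as first argument: the calling convention described above. */
int state_id(const state_t *s)
{
    assert(s != NULL);
    return s->id;
}

void root_free(root_t *r)
{
    if (!r)
        return;
    for (state_t *s = r->head, *n; s; s = n) {
        n = s->next;
        free(s);
    }
    free(r);
}
```

Each call through a state_t pointer is exactly the indirection whose cost
was measured earlier in the thread, so this sketch shows the layout and says
nothing about whether that cost is acceptable for tcc.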

But as always there are always afterthoughts, but discussion surely helps (damn, too 
always, it's getting absolutism).

> Ciao,
> Michael.


> > It has of course no effect when the global is a pointer, which introduces the 
> > same indirection. It is true for aggressive optimizers, which are likely 
> > to put the struct pointer in a register. So it may be faster for tcc 
> > compiled by gcc, clang or vc++ but slower when tcc is compiled by tcc.
> > 
> > C.
> > 
> > -----Original Message-----
> > From: Tinycc-devel [mailto:tinycc-devel-bounces+eligis=address@hidden] On 
> > Behalf Of Michael Matz
> > Sent: Friday, December 06, 2019 16:42
> > To: TCC mailing list
> > Subject: Re: [Tinycc-devel] make tcc reentrant
> > 
> > Hello,
> > 
> > On Tue, 3 Dec 2019, Ulrich Schmidt wrote:
> > 
> > > I am trying to write a Lua binding for tcc. To work properly, the tcc lib
> > > needs to be reentrant.
> > 
> > As demonstrated down-thread, that isn't correct.  It doesn't _need_ to be, 
> > it would be a feature.  As usual with features it needs to be measured 
> > against the downsides.  The downsides for your proposed changes are the 
> > following at least:
> > 1) more complicated/boiler-platy source code of TCC (a TCCState
> >    argument almost everywhere)
> > 2) slower code: most of the time the indirection through a pointer 
> >    variable (the state) in comparison to a direct access to a static 
> >    variable doesn't matter.  But it does matter for the symbol/token 
> >    table (and potentially for the register/evaluation stack).  I have 
> >    measured this years ago for the token table, so this might or might not 
> >    still be the case.
> > 
> > So, while I can see the wish for this feature, I don't necessarily see 
> > that tcc should be changed to accommodate.
> > 
> > If anything I would expect a _complete_ transition to exist, in order to 
> > measure the impact.  The worst thing that could happen is if someone added 
> > TCCState arguments everywhere, moved some static variables to that state, 
> > and then leaves: none of the features of this whole exercise would be 
> > had, but all the downsides would be there.
> > 
> > And yes, this is a big project.  I really think it would be better
> > if you simply write a wrapper for libtcc that ensures single-threadedness 
> > and that regards TCCState as a singleton.  I think such a thing would be 
> > well-suited in the TCC sources itself.
> > 
> > (In a way it seems prudent for a tiny C compiler to only be usable as a 
> > singleton)
> > 
> > 
> > Ciao,
> > Michael.
> > 
> > > 
> > > I took a look into the sources and found some comments (XXX:...) and
> > > started with removing the static var tcc_state. As a result almost all
> > > lib functions need a 1st parameter of type TCCState*. I did this in my
> > > own local branch and tcc is still running :).
> > > 
> > > But this is a really HUGE change. In addition most of the local vars in
> > > tccpp, tccgen, ... need to be moved to TCCState. I can do that but at
> > > some points I will have some questions and I can only test on windows
> > > and probably on linux.
> > > 
> > > My 1st question is: Are you interested in these changes or should I do
> > > this locally?
> > > 
> > > I would like to do this together with you.
> > > 
> > > 
> > > Greetings.
> > > 
> > > Ulrich.
