Re: [Tinycc-devel] Hello and a few questions about using libtcc

tinycc-devel

From:	Michael Matz
Subject:	Re: [Tinycc-devel] Hello and a few questions about using libtcc
Date:	Fri, 25 Dec 2020 01:36:34 +0100 (CET)
User-agent:	Alpine 2.21 (LSU 202 2017-01-01)

Hello,

On Sun, 20 Dec 2020, Joshua Scholar wrote:

4) If I did something weird like have a call out from generated code to my
code, and my code returned on the same stack but in the context of a
different thread than it entered from, would that break anything?


No.  Or, perhaps better said, it would break in the same way as if the
generated code were also your own code and not generated by TCC, i.e. TCC
doesn't introduce additional restrictions.  In particular the usual
makecontext/swapcontext way of implementing lightweight threads via stack
switching should work just fine, as should any more unusual way of
switching threads but not stack (what is that even supposed to mean?), as
long as the input code would have no problem if it had been compiled
without any TCC involvement.

"switching threads but not stack (what is that even supposed to mean?)"


What it means is this, imagine that a scheduler outside the generated code
created a new stack, switched to it then called some generated code.
And at some point that generated code rather than returning to the
scheduler yielded by making a call into my own code which, instead of
returning, just saved the context/continuation somewhere and switched back
to its native stack.

Then later on, a scheduler running on a different thread with a different
thread local state, for instance, takes that stack and switches to it and
returns into the generated code, appearing to
return from the call, but in a different thread.

Yeah, so the classic makecontext stack switching/co-routines. The above
does switch stacks, which is why I was confused by your saying that you
were not doing that. Just a misunderstanding.
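
A minimal sketch of the yield/resume pattern described above, using the
POSIX ucontext calls. The names task, yield_to_scheduler, record and
run_demo are made up for illustration, and task() merely stands in for
TCC-generated code. Everything here stays on one OS thread; resuming the
saved context from a different thread is the part that depends on the code
not touching thread-local state.

```c
#include <ucontext.h>

static ucontext_t sched_ctx, task_ctx;
static int trace[8];
static int trace_len;

static void record(int v) { trace[trace_len++] = v; }

/* Stand-in for the host routine the generated code calls in order to
   yield: save the task context and switch back to the scheduler. */
static void yield_to_scheduler(void)
{
    swapcontext(&task_ctx, &sched_ctx);
}

/* Stand-in for generated code; the yield looks like an ordinary call. */
static void task(void)
{
    record(1);
    yield_to_scheduler();
    record(3);
}

/* Runs task() on its own stack and resumes it once after it yields.
   Returns the number of recorded steps. */
static int run_demo(void)
{
    static char stack[64 * 1024];

    trace_len = 0;
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = sizeof stack;
    task_ctx.uc_link = &sched_ctx;      /* continue here when task() returns */
    makecontext(&task_ctx, task, 0);

    swapcontext(&sched_ctx, &task_ctx); /* run task until it yields */
    record(2);
    swapcontext(&sched_ctx, &task_ctx); /* resume task until it returns */
    record(4);
    return trace_len;
}
```

Calling run_demo() interleaves the scheduler and the task, so the recorded
steps come out in the order 1, 2, 3, 4.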

Any code that used thread local memory might break.

Yeah, and no, TCC is not doing any of that; the code it generates uses
exactly the features that the input code uses, as if it were compiled by a
normal ahead-of-time C compiler. So if your to-be-compiled source
snippets are free of effects breaking the above use case then the code
generated by TCC is free of them as well.

I assume that tcc_compile_string is equivalent to tcc_add_file.  Does that
mean that you can add multiple strings to be compiled?  Does tcc copy them,
or does it compile them immediately and forget them, or do the original
buffers have to be retained?

The string buffer doesn't have to be retained. TCC generates machine code
into the appropriate buffers, binds it to the host program (or to other
parts of such machine code also added by tcc_compile_string) and that's
it.

Kyryl's answers brought up a bunch more questions for me.
For instance, he said "tcc_delete will free everything, if you call it
before running generated code it will crash."
So I wonder,
1) when is the run time library loaded, when is it initialized and is it
ever freed or finalized?


Which run time library?  In a code snippet like
  "int foo(int a, int b) { return a + b; }"

there's nothing involved other than the assembly code, containing basically
some moves, an add and a return instruction. No runtime library is
involved. There's a bit of runtime lib code in libtcc1.a which implements
support for some things the compiler relies on that are better expressed
with extra routines: some double-long arithmetic and va_list support,
alloca, and the helpers for bounds-checking. The code for that lies in
libtcc1.a and is linked into the TCCState via tcc_add_runtime, called by
tcc_relocate_ex in some cases. This linking also places the resulting
code in the provided buffers (or somewhere into TCCState). It's deleted
with tcc_delete, like the code from compile_string.

If a jit made a different TCCState for each routine it compiles, say 1000
routines,

a) would libtcc load 1000 copies of the runtime library?

Yes. Basically each TCCState is a complete sandbox, separate from the
others (not in a safety sense, but in design: if you have wild writes in
one TCCState they might affect memory that happens to belong to another
TCCState), only communicating with the host executable (e.g. to provide a
'bar' routine for this snippet: "void callbar(void) { bar(); }"). But
libtcc1.a is only loaded as necessary, so if your snippets don't use
va_list and alloca (and you don't use bounds checking) then you don't need
any of it on x86-64 for usual code (i.e. code not involving 128 bit
arithmetic).

b) would it make a static variable section for the runtime library 1000
times and initialize the variables in it?


Yes.

c) would it make a different heap for each routine?  If I call malloc in
one routine and then free it in another routine would it try to free it
into a different heap and corrupt a heap?

malloc and free aren't provided by TCC, they are provided by your host
program (or rather by the supporting C library linked into it); the
snippets, when calling malloc, will use those routines. So, if those are
thread-safe (they are in all but the most basic C systems) then all is
safe. All TCCStates (and the routines therein) will use the same heap.

d) if the jit wanted to update a routine with different code, so it called
tcc_delete on that code's state, then made a new state and compiled new
code for it, would that break anything?  Would tcc_delete finalize
anything?  Delete a heap?

Nope. Everything allocated (e.g. via malloc) by code snippets stays
allocated after tcc_delete (probably then causing a leak, as you also lose
all information about the data, like where that malloced block was). So
you would need to make sure that all resources are freed in the snippets
before calling tcc_delete, e.g. by providing a finalizer routine yourself
(and calling it!).
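
One possible shape for such a finalizer, written as plain C that could be
part of the snippet source: allocations go through a small tracking wrapper
and a finalizer frees whatever is still live. The names snippet_alloc and
snippet_finalize are hypothetical; the host would look up the finalizer
with tcc_get_symbol and call it before tcc_delete.

```c
#include <stdlib.h>

/* Header placed in front of every tracked block. The extra union members
   only serve to keep the payload suitably aligned. */
union header {
    union header *next;
    long double align_ld;
    long long align_ll;
};

static union header *live_blocks;

/* Allocates size bytes and remembers the block for later cleanup. */
void *snippet_alloc(size_t size)
{
    union header *h = malloc(sizeof *h + size);

    if (!h)
        return NULL;
    h->next = live_blocks;
    live_blocks = h;
    return h + 1;               /* payload starts right after the header */
}

/* Frees every block still tracked and returns how many were freed. */
int snippet_finalize(void)
{
    int freed = 0;

    while (live_blocks) {
        union header *next = live_blocks->next;
        free(live_blocks);
        live_blocks = next;
        freed++;
    }
    return freed;
}
```

This sketch has no matching "free one block" routine; a real snippet would
need one that also unlinks the block from the list.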

About changes to an existing TCCState: it's quite probable that this
currently doesn't work, I honestly don't know. In particular a sequence
such as:


   TCCState *s = tcc_new();
   tcc_compile_string(s, "...");
   tcc_relocate(s, ... buf ...);
   foo = tcc_get_symbol(s, "foo");
   // do something with foo, up to here everything is clear
   // now come the parts where I'm sure there are bugs/unsupported behavior:
   tcc_compile_string(s, " ... something else ..."); // uhh, changing a relocated state?
   tcc_relocate(s, ....); // finalize only new stuff ???
   foo2 = tcc_get_symbol(s, "foo2");
   // do something with foo2
   ...
   tcc_delete(s);

Even if the above happens to work (i.e. two relocate calls, where the
second would affect only the added snippets, or at least doesn't destroy
the old snippets), which I doubt, then you certainly will run into
problems when the second snippet tries to override symbol "foo" (from the
first snippet) and expects that even old snippets calling old foo will
then call new foo.

i.e., is libtcc really designed to be used the way a jit would use it?

Depends. If you expect that the jit can regularly replace existing
functions with new versions transparently, then no. TCC could be extended
to do so, but it's not there as of now. At the very least there needs to
be some way of unrolling the process of relocation and some symbol table
trickery for the symbol replacements.

He also said that it's possible to reuse a TCCState but "Yes you can reuse
it, but it's not going to be efficient because it
will have to recompile the previous stuff also."

Well, to get around the above problem of not being able to replace
functions transparently you could use a scheme where you remember all
strings fed into tcc_compile_string, with some meta-info (like "this
string was for function foo", and the order of addition). Then, in order
to replace function "foo" you would replace that string, and then feed all
the collected strings (all old ones, except the old-foo string, now
replaced with new-foo) into a new TCCState. You end up with a state
equivalent to the old one, except for the replaced foo function. Of course
the addresses of everything are different, so you would have to refetch
e.g. all symbols that you were fetching from the old state.

TCC is extremely fast in generating code, but the above process,
especially if there's much non-changing code, might be too slow in the end.
You would have to try.
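
The bookkeeping side of that scheme could look roughly like the sketch
below. All names (struct registry, registry_put, registry_rebuild,
log_compile) are invented for illustration. The compile step is passed in
as a callback so the sketch does not depend on libtcc; in a real host the
callback would wrap tcc_compile_string on a freshly created TCCState.
Since TCC does not keep the strings, the registry stores the pointers and
the caller has to keep the sources alive.

```c
#include <string.h>

#define MAX_SNIPPETS 64

struct snippet { const char *name; const char *source; };
struct registry { struct snippet items[MAX_SNIPPETS]; int count; };

typedef int (*compile_fn)(void *state, const char *source);

/* Adds a snippet, or replaces the source of an existing one with the same
   name while keeping its original position. Returns 0, or -1 if full. */
static int registry_put(struct registry *r, const char *name,
                        const char *source)
{
    int i;

    for (i = 0; i < r->count; i++) {
        if (strcmp(r->items[i].name, name) == 0) {
            r->items[i].source = source;
            return 0;
        }
    }
    if (r->count == MAX_SNIPPETS)
        return -1;
    r->items[r->count].name = name;
    r->items[r->count].source = source;
    r->count++;
    return 0;
}

/* Feeds every remembered source, in order of addition, to compile().
   Stops and returns -1 at the first failure. */
static int registry_rebuild(const struct registry *r, compile_fn compile,
                            void *state)
{
    int i;

    for (i = 0; i < r->count; i++)
        if (compile(state, r->items[i].source) < 0)
            return -1;
    return 0;
}

/* Stand-in "compiler" for trying the registry out: it just appends each
   source string to a log buffer. */
struct log { char text[256]; };

static int log_compile(void *state, const char *source)
{
    struct log *l = state;

    if (strlen(l->text) + strlen(source) + 1 > sizeof l->text)
        return -1;
    strcat(l->text, source);
    return 0;
}
```

After a rebuild into a new state, every function pointer obtained from the
old state has to be fetched again before the old state is deleted.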

2 What state is remembered from one tcc_relocate to the next?
a) does it remember that I
called tcc_set_output_type,  tcc_add_library_path, tcc_add_symbol,
tcc_add_file,
or tcc_compile_string?  If it remembers tcc_compile_string, did it save a
copy of the string?

As said above, calling tcc_relocate twice on the same state is probably
not going to work right now, so the correct answer would be "mu". But the
info of output_type and added libraries would still be there. The "info"
(i.e. code/data/addresses) from added libs, files and symbols would still
be there, but in a relocated/finalized fashion, in such a way that
relocating/finalizing it again would mangle it. And no, compile_string
doesn't remember its argument anywhere.

3 And since he said I'd have to recompile the previous things, what
happened to the previously compiled code?


Also mu.

It sure would be cool if it turns out that libtcc loaded and initialized
its runtime library before any TCCState is made, and that runtime
library is shared between all states. That would be what people who
make jits want. But I'm still worried by the answer that you can't run
code after a TCCState is deleted.

The runtime lib is really the smallest thing to worry about. What's
probably the bigger issue for you right now would be that a TCCState can
be finalized only once safely.

It's actually probably not too much work to make TCCState into one where
you can repeatedly call compile_string and relocate in a mixed way (and
even override functions/data), it's just that no one has invested the time
to do that.



Ciao,
Michael.

Current Thread

[Tinycc-devel] Hello and a few questions about using libtcc, Joshua Scholar, 2020/12/20
- Re: [Tinycc-devel] Hello and a few questions about using libtcc, Kyryl Melekhin, 2020/12/20
- Re: [Tinycc-devel] Hello and a few questions about using libtcc, Michael Matz, 2020/12/20
  - Re: [Tinycc-devel] Hello and a few questions about using libtcc, Joshua Scholar, 2020/12/20
    - Re: [Tinycc-devel] Hello and a few questions about using libtcc, Michael Matz <=
