[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Tinycc-devel] bounds checking with tcc

From: Herman ten Brugge
Subject: Re: [Tinycc-devel] bounds checking with tcc
Date: Tue, 26 Nov 2019 08:44:08 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

On 2019-10-31 15:11, Michael Matz wrote:

On Tue, 29 Oct 2019, Herman ten Brugge via Tinycc-devel wrote:

I tried bound checking with tcc and found some problems. (See attached
The bound checking code is fairly incomplete, and some of it can't really
be fixed without more aggressive changes in TCC, my private opinion is
that we shouldn't really bother.  But some remarks on your changes:

First the bounds checking code is included in shared objects and in the
You now made it so that libtcc1 is only included in the executables, never
in shared libs.  But that's wrong as well: e.g. if TCC generates a call to
__fixsfdi in a DLL the code for that wouldn't be included with the DLL
code anymore.  So in order to use that DLL you would now have to link
the application with TCCs libtcc1, and that only works by accident because
symbols from executables that are referred to from shared libs are
exported.  E.g. dlopening such shared lib wouldn't work anymore.

There are some possibilities to fix the problem correctly:
1) make the global libtcc1 symbols hidden
2) make the bound checking code (and only it) sit in a shared lib (in
    comparison to the current static lib), which is added to the requires
    of shared libs and executables

The former still means separate copies of the bounds checking code (and
hence separate data structures per DSO), but results in more
self-contained exectuables/libs.  The malloc hooks are still redirected
twice, so the code in bcheck.c simply would need to be prepared for this.

The second solution implies only one copy of the bounds infrastructure,
which might make some things easier, but leads to non-self-contained
executables/shared libs.

This means that for example malloc and friends are redirected twice.
I fixed this in tccelf.c
Agreed. I won't fix this because bounds checking needs more changes.
Second problem is that only arrays are checked.
But there are a lot of other objects like structs with arrays or variables
with address taken that should also work.
Fixed this in tccgen.c
But does this actually work?  Either the comment is obsolete, or it's not
that easy:

         /* XXX: currently, since we do only one pass, we cannot track
            '&' operators, so we add only arrays */
         if (bcheck && (type->t & VT_ARRAY)) {

The situation the comment alludes to is the following:

   void foo (void)
     int locali;    // 0
     locali = 42;   // 1
     bar (&locali); // 2
     bla (locali);  // 3

As TCC is single pass, at the declaration of locali you can't know that
eventually the address is taken.  So, as in your patch, you have to always
generate code assuming that the address will be eventually taken, i.e. for
all locals.  That's fairly wasteful in case the address is then not
actually taken.
Agreed. But because tcc is one pass. There is no other solution I can think off.
And yes this creates a lot of overhead.

The last one is some problems with code generation on x86_64.
The difficult one was the VT_LLOCAL code.
Fixed in x86_64-gen.c
@@ -645,7 +645,11 @@ static void gen_static_call(int v)
      Sym *sym = external_global_sym(v, &func_old_type);
      oad(0xe8, 0);
      greloca(cur_text_section, sym, ind-4, R_X86_64_PC32, -4);
+    greloca(cur_text_section, sym, ind-4, R_X86_64_PLT32, -4);

The function is named 'gen_static_call' and for that _PC32 really is the
correct relocation (well, actually it would be local calls, not only
static ones, of course).  The problem is that your change to not link
libtcc1.a into shared libs makes the calls to __bound_foo not be
static/local anymore.  So, the function would need to be renamed at least,
but see above why I think this ultimately is not a good idea.
-- a.c --
extern void *memset(void *s, int c, unsigned long n);
void tst(void) { char a[10]; memset (a, 0, 10); }
-- b.c --
extern void tst(void);
int main(void) { tst(); return 0; }
Compile with:
tcc -b -shared -o liba.so a.c
tcc -b b.c -o b liba.so
And then run with:
I get:
./b: Symbol `__bound_init' causes overflow in R_X86_64_PC32 relocation
./b: Symbol `__bound_local_new' causes overflow in R_X86_64_PC32 relocation
./b: Symbol `__bound_local_delete' causes overflow in R_X86_64_PC32 relocation

This is fixed by this patch.

Then the change for VT_LLOCAL:

              if ((r & VT_VALMASK) == VT_LLOCAL) {
                  SValue v1;
-                r = get_reg(RC_INT);
+                r = get_reg(RC_FLOAT);
                  v1.type.t = VT_PTR;
                  v1.r = VT_LOCAL | VT_LVAL;
                  v1.c.i = fc;
                  load(r, &v1);
                  fc = 0;
+                vtop->r = r = r | VT_LVAL;

Note quite: the register you use for loading the VT_LLOCAL slot must be
RC_INT (it's simply a pointer pointing to the ultimate lvalue).  I can see
why the addition of LVAL might be necessary sometimes, would be nice to
have a testcase.
See attached file. If I compile this with tcc -b. I get:
bcheck.c __bound_ptr_indir4: 0x7ffc6ed279b8 is outside of the region
This is fixed with the tccgen.c patch. See above.
After this the result printed is:
60 2
This should be
15 2
This is fixed by this patch. I Agree that the RC_INT -> RC_FLOAT is wrong.

Hope this explains things.


I will stop now because bounds checking looks not to work for complex

Perhaps we could add the above code to git? The first and last one are
fixing problems. For the last problem I have a testcase that reproduces
the VT_LLOCAL problem.
So, I'm against the libtcc1.a linking change; it might benefit bounds
checking right now, but worsens the situation for everything else.  The
VT_LLOCAL problem indeed should be fixed (but see above), especially if
you have a testcase.  The change to gen_static_call should be unnecessary
with the libtcc1 change.  To bounds-check all variables, not just arrays,
seems to be a fairly aggressive change, I think it's not really
appropriate at this time.


Attachment: f.c
Description: Text Data

reply via email to

[Prev in Thread] Current Thread [Next in Thread]