lightning
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Lightning] Lightning frees NULL pointers, and frees others it didn´


From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] Lightning frees NULL pointers, and frees others it didn´t allocate
Date: Tue, 29 Jul 2014 17:56:50 -0300

2014-07-29 15:25 GMT-03:00 Ian Grant <address@hidden>:
> Hi Paulo,

  Hi Ian,

> Thanks for your attention. That helps a lot. But it would be nice if
> lightning could just "do the right thing". There is a fairly clear notion of
> a JIT compilation state, and these states should be completely independent.

  Currently only jit_set_memory_functions should affect different jit
states. Other than that, one only needs to know exactly what the _jit
identifier points to, as the jit_xyz(a,b,c) calls are just macros expanding
to _jit_xyz(_jit,a,b,c).

> So actions on one state should affect that state only - the other states may
> be in different threads. Any interconnections between the states complicates
> the semantics, and should be avoided. For example, if the user decides to
> change the memory management functions, these changes should affect only
> subsequent states that are created. That is what one expects, in the absence

  jit_set_memory_functions should only be called in very special cases, and
the main reason of adding it is to allow plugging it to some complex memory
manager, for example, one that mmap's buffers and creates its own linked
lists. One should never use it in a library (I saw too much problems with
several different libraries/modules calling gmp_set_memory_functions
and then having all kinds of memory problems).

> of any indication otherwise. And any such exceptions to the rule should have
> semantic reasons: there are too many aspects of the lightning interface
> which have no semantic justification, they are only accidents of the
> implementation, That's not good design.

  I tried to keep the interface with as few changes as possible when
becoming the maintainer. Major real change was to switch to left to
right arguments for function calls. Otherwise I understand it has several
design issues, and I added a some when addressing others that I
considered more serious...

> Another example of where the implementation is driving the semantics is the
> function call setup. There is no semantic reason why one cannot compose
> function application, in the sense that one calls other functions in the
> process of stacking up the arguments to a function which will operate on the
> results they return. So one should be able to nest calls to prepare/finish.

  Currently it requires the code generator to be smart enough to save
temporaries in the stack and make a single/simple call. I have been
considering for some time to make some kind of library interfaces on top
of lightning that would handle it (more about it below).

> It would also be nice of lightning could be more intelligent about its heap
> use. At the moment it allocates over 50KB to compile a 50 byte function.
> That is 1,000 times more memory than it strictly needs. We need to refine
> this. The allocation block size should be configurable on a state-by-state
> basis. So if the user expects to compile 50k blocks of code, then they can
> set the block-size to 4KB, but if they are compiling small routines they can
> set it to 256 bytes. Another aspect of JIT that the library doesn't address
> is code size.  I want to be able to decide for a given loop whether to
> unroll it. I want to unroll it to fit one cache line. So I need to know the
> exact size of the generated code for the body of the loop and the increment
> operation, _before_ I commit to emitting the code.

  The exact code size is complicated to calculate. It could be done by
replicating it in another state, and generating code to see how large it is,
but that could change in another state depending on register pressure
and if it needs to spill/reload some register.
  There is a not exported interface that estimates the code size using
tables created on "special builds", see for example
http://git.savannah.gnu.org/cgit/lightning.git/tree/lib/jit_x86-sz.c
An interface could be added, but it would not be able to tell with
100% precision how much bytes the could would use in all
stages.

> I think what would help is using a higher-level language. C is not suited to
> the kinds of list processing that you need to do to get this right. A few
> years ago I would have said look at GNU guile, which can be dynamically
> linked in to applications as a library. But guile is now so fat and sluggish
> it's unusable. I can't even compile it - it crashes byte-code compiling it's
> modules! PLT Racket is a much better scheme implementation  Another
> possibility is to make more use of glib functions. This would give you a
> nice slab allocator, and a lot of portable system functions as well as the
> basic structures like hash tables, doubly-linked lists etc. Glib is very
> well written and fairly well-designed. It's definitely worth studying.

  Currently other than optionally linking to binutils libraries, it is fully
self contained, what is good; earlier it did use gmp for bitmaps, but
since the usage was very small, I wrote a few simple bitmap routines.

> I am working on Moscow ML (see http://livelogic.blogspot.com/) and have got
> a lightning interface, but Moscow ML doesn't currently link with another
> application as a library: it compiles only stand-alone programs with the
> CAML run-time. This will change, but I need lightning to do it!

  Initially I made what become GNU lightning 2.0+ as a fork of the
initial lightning that was implemented fully as macros. That had the
major problem of having everything visible, and making it very hard
to create temporaries or complex code.

  In case you do not know, I have been working on a simple C/C++
like language that uses lightning to generate code. See the
lib/oemit*.c files in https://github.com/pcpa/owl/

  To be easier to use, lightning should have some kind of "utilities"
interface, that probably would not be fully interchangeable with
calling plain lightning macros. What I considered was a new
pattern of functions, and code would look like this:

jit_value_t *a = jit_word(1);
jit_value_t *b = jit_float(2);
jit_addv(a, a, b);

that would resolve types, and make it so that the programmer would
not really know about registers. This would be a very simplified
implementation of the jit generation in the "owl" language. And one
thing it would allow easily would be nested jit_finish* calls.

The above jit_word call could have variants like:
jit_local_c(int initial_value, char *name);

and that would make it viable to create a gdb interface where one
could query a named variable of type (c)har when in a jit function
frame.

As long as using only base types, no garbage collector, and no
complex optimizations, it is quite doable in a short time.

  Anyway, I suggest looking a bit in
http://git.savannah.gnu.org/cgit/lightning.git/tree/check/lightning.c
and the *.tst files there to have an ideia of what is close to the
simplest usage of lightning to implement a language parser. From
there could come some better insights on how to improve
lightning :-)

> Lightning is incredibly important for the whole GNU project, more important
> then GCC I think, and so we need to get the details right because it's going
> to affect hundreds of projects.
>
> Best wishes
> Ian

  Thanks,
Paulo

> On Tue, Jul 29, 2014 at 12:39 PM, Paulo César Pereira de Andrade
> <address@hidden> wrote:
>>
>> 2014-07-26 16:10 GMT-03:00 Ian Grant <address@hidden>:
>> > This one cost me eight hours of my time.
>>
>>   I pushed 3 commits that should address the issues you noticed:
>>
>>
>> http://git.savannah.gnu.org/cgit/lightning.git/commit/?id=0d96d24073bbe42054d7de6a7739b9db07d54e96
>> o Adds a note needing to call jit_set_memory_functions before init_jit
>>
>>
>> http://git.savannah.gnu.org/cgit/lightning.git/commit/?id=a8c180a926638ba6a4967b2a60dbea994c1f15cc
>> o Remove the jit_progname global, but not advertised, variable
>>
>>
>> http://git.savannah.gnu.org/cgit/lightning.git/commit/?id=0d9ac79a128524a57b951dc90f928e56531632b7
>> o Do not pass NULL as argument to free (the wrapper checks for NULL),
>>    memcpy and memmove (new wrappers checks for 0 size argument).
>>
>> > Here is a program which reproduces the problem, an example of the
>> > output,
>> > and a patch to lightning.c which adds some debug printfs which might
>> > help
>> > you find the problem. I suspect it´s in jit_dataset. I have tried to fix
>> > it
>> > but the code there is incomprehensible to me. It seems to test flags
>> > like
>> > _jitc->no_data, and on the basis of this, it makes assumptions about the
>>
>>   I also added a comment about something that could be misleading
>> when reading the code, as in your patch:
>>
>> -    _jit->note.length = 0;
>> +        _jit->note.length = 0; // if there was no_note, would you
>> expect there to be length and size?
>>
>> and it was actually initialized to 1, because the first entry is always
>> allocated, but it may need to be reverted if after jit_init(), the user
>> tells lightning to not generate annotations; those are really only useful
>> when debugging code generation, as the "builtin" disassembler can
>> make debugging a lot easier than with a plain (gdb) x/<num>i, e.g.
>> one can add notes with jit_note() for something other than mapping
>> source code to native code, but to also just add code markers when
>> debugging the code calling jit_* macros/functions.
>>
>> > values of pointer variables. But flags like this are just another
>> > possible
>> > source of inconsistency. It´s almost always better to test the pointers
>> > directly. There are also places in this function where it calls memcpy
>> > and
>> > copies to a NULL pointer.
>> >
>> > There is another problem in that the manual indicates one can use
>> > separate
>> > _jit states concurrently, but this is not true because jit_new_state
>> > uses a
>> > global variable. The result is a space leak if it´s called twice.
>> > Another
>> > problem is that init_jit keeps a pointer to the program name, but the
>> > manual
>> > doesn´t state this. It should either warn about this in the manual, or
>> > better, use strncpy to keep a private copy of the string.
>> >
>> > Ian
>>
>> Thanks,
>> Paulo
>
>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]