Re: [Dotgnu-libjit] using Libjit with boehm collector
From: Basile Starynkevitch
Subject: Re: [Dotgnu-libjit] using Libjit with boehm collector
Date: Wed, 26 Sep 2012 16:22:41 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Sep 26, 2012 at 03:49:17PM +0200, Tommaso Tagliapietra wrote:
> Hi guys.
>
> > > However I've seen that Libjit uses standard malloc/calloc/free and
> > > mmap/VirtualAlloc and I don't understand how this can work with the
> > > collector.
> >
> > libjit is using standard malloc... for its own dynamic memory management,
> > hence in particular it does not require Boehm's GC to be able to work.
>
> Right... I was wrong in my explanation. In a few words my question is: if
> libjit uses standard allocation functions, how can Boehm GC, used for
> other parts of my interpreter, scan the memory allocated by libjit?
>
> > I believe you don't have to bother (but I might be wrong). For such static
> > references, you could also for safety either keep some (indirect) reference
> > on the stack (i.e. by having closures in your language which explicitly
> > reference them) or make them explicit GC roots (by calling GC_add_roots on
> > your segment of static data containing them).
>
> Yes I think that the use of GC roots will be one solution but I'm
> afraid that registering a lot of new roots may slow down the
> collector (and the interpreter then) or reach a situation where
> everything is uncollectable and uncontrollable.
I am not sure that registering new roots will slow down the GC.
It might on the contrary speed it up a tiny bit. You need to measure
to be sure. Of course you want to register your data segments as new roots
(typically segments of several kilowords aligned on a page boundary,
i.e. on 4 Kbytes).
>
> Actually the bytecode that I use is an array of opcodes mixed with
> integers and addresses. For example:
>
> [0] --> PUSHIR
> [1] --> <memory address>
> [2] --> CALL, 1
> [3] --> JMP
> [4] --> <offset>
> [5] --> MAKE_CLOSURE
> [6] --> <memory address>
> [7] --> RETURN
>
> Ok this is not a useful example, but it may explain how function
> bytecode is stored. Another thing that I want to avoid is to invoke
> the generated function with an extra parameter to pass some local data
> array.
I am not sure you want to avoid that. Generally, a closure's function is invoked
with the closure's data as an (implicit) parameter.
You could put the 0xABCDEF01 or 0x01EFCDAB pointers in an "object"
reachable from the closure's data. Most functional language
(Lisp or Scheme or ML or Haskell-like) implementations do that.
C. Queinnec's book "Lisp in Small Pieces" is good reading.
If the pointers in the bytecode are at word boundaries,
and if the bytecode is allocated with GC_malloc (not GC_malloc_atomic) or is a
root, then they will be found by Boehm's GC.
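
The closure-data convention mentioned above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the `Object`
and `Closure` layouts and the names `make_closure`, `call_closure` and
`my_function` are hypothetical and come neither from libjit nor from Boehm's
GC. Plain malloc is used so the sketch stands alone; in the interpreter the
closure would presumably be allocated with GC_malloc (not GC_malloc_atomic) so
that the collector scans its constant slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object and closure layouts, for illustration only. */
typedef struct Object { int tag; } Object;

typedef struct Closure Closure;
typedef Object *(*ClosureFn)(Closure *self, Object *a, Object *b);

struct Closure {
    ClosureFn fn;      /* compiled code, e.g. produced by a JIT */
    size_t nconsts;    /* number of constant slots */
    Object *consts[];  /* word-aligned pointers, visible to a scanning GC */
};

/* Counterpart of MyFunction: the two constants are read from the closure
   instead of being baked into the code as immediate addresses. */
static Object *my_function(Closure *self, Object *a, Object *b) {
    Object *data = self->consts[0];
    (void)a; (void)b; (void)data;
    return self->consts[1];
}

static Closure *make_closure(ClosureFn fn, Object **consts, size_t n) {
    /* Assumption: a real interpreter would call GC_malloc here. */
    Closure *c = malloc(sizeof *c + n * sizeof c->consts[0]);
    assert(c != NULL);
    c->fn = fn;
    c->nconsts = n;
    for (size_t i = 0; i < n; i++)
        c->consts[i] = consts[i];
    return c;
}

/* Every call passes the closure itself as the implicit first argument. */
static Object *call_closure(Closure *c, Object *a, Object *b) {
    return c->fn(c, a, b);
}
```

With this layout the generated code only needs the closure pointer; the
objects stay reachable for as long as the closure itself is reachable.
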
>
> Then, if I make a new function with libjit and make a local pointer
> variable giving it a fixed address, and this pointer is the only
> reference to an object allocated with Boehm GC, is there a way to protect
> this object from garbage collection?
It won't be collected by Boehm's GC in that case, since a stack pointer
(your local pointer) points to it.
>
> In C it is similar to:
>
> 0xABCDEF01 := the address of an object allocated with Boehm
> 0x01EFCDAB := another one.
>
>
> Object *MyFunction(Object *A, Object *B) {
>     Object *data = (Object *)0xABCDEF01;
>
>     /* ... some code using or not 'data' */
>
>     return (Object *)0x01EFCDAB; // another pointer.
> }
The only issue I might see is if libjit is producing such machine code,
and if the 0xABCDEF01 or 0x01EFCDAB constants embedded in it are not
word-aligned or are byte-reversed.
Hence my advice to either keep the bytecode in GC-ed memory, or store these
pointers somewhere (root data segment, local, closure ...).
>
> My bytecode that simulates MyFunction protects the objects since the
> bytecode array is allocated via Boehm. How can I implement this with
> libjit?
If you keep the bytecode (in GC-allocated and reachable memory),
and if it is allocated with GC_malloc, not GC_malloc_atomic,
and if the addresses 0xABCDEF01 or 0x01EFCDAB appearing inside are word-aligned,
they will be followed by the GC.
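
To see why the word-alignment condition matters, here is a toy model of a
conservative scan. It is a sketch only and not Boehm's actual implementation:
the function names and the heap bounds passed in are made up for the
illustration. It looks at each word-aligned slot of a buffer and counts the
values that fall inside a given heap range, which is roughly the test a
conservative collector applies to candidate pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the word-aligned slots of [buf, buf + len) whose value lies in the
   (hypothetical) heap range [heap_lo, heap_hi). buf must be word-aligned. */
static size_t count_candidate_pointers(const unsigned char *buf, size_t len,
                                       uintptr_t heap_lo, uintptr_t heap_hi) {
    size_t hits = 0;
    assert((uintptr_t)buf % sizeof(uintptr_t) == 0);
    for (size_t off = 0; off + sizeof(uintptr_t) <= len;
         off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, buf + off, sizeof word);  /* read one aligned word */
        if (word >= heap_lo && word < heap_hi)
            hits++;
    }
    return hits;
}

/* Store a pointer-sized value at an arbitrary byte offset, as a code
   generator might do when it embeds an immediate operand. */
static void embed_word(unsigned char *buf, size_t off, uintptr_t value) {
    memcpy(buf + off, &value, sizeof value);
}
```

In this model, a value stored with `embed_word` at an offset that is a
multiple of `sizeof(uintptr_t)` is counted, while the same value stored at
byte offset 1 straddles two slots and is missed. That is the situation to
avoid when addresses end up as immediates inside generated machine code.
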
I just think that you should understand a bit better how Boehm's GC actually
works.
Did you read some documentation (or code comments) about it?
Cheers
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***