Re: [Dotgnu-libjit] using Libjit with boehm collector


From: Basile Starynkevitch
Subject: Re: [Dotgnu-libjit] using Libjit with boehm collector
Date: Wed, 26 Sep 2012 16:22:41 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, Sep 26, 2012 at 03:49:17PM +0200, Tommaso Tagliapietra wrote:
> Hi guys.
> 
> > > However I've seen that Libjit uses standard malloc/calloc/free and
> > > mmap/VirtualAlloc and I don't understand how this can work with the
> > > collector.
> >
> > libjit is using standard malloc... for its own dynamic memory management,
> > hence in
> > particular it does not require Boehm's GC to be able to work.
> 
> Right... my explanation was wrong. In a few words, my question is: if
> libjit uses standard allocation functions, how can Boehm GC, used for
> other parts of my interpreter, scan the memory allocated by libjit?
> 
> > I believe you don't have to bother (but I might be wrong). For such static
> > references,
> > you could also for safety either keep some (indirect) reference on the
> > stack
> > > (i.e. by having closures in your language which explicitly reference
> > them)
> > or make them explicit GC roots (by calling GC_add_roots on your segment
> > of static data containing them).
> 
> Yes I think that the use of GC roots will be one solution but I'm
> afraid that registering a lot of new roots may slow down the
> collector (and the interpreter then) or reach a situation where
> everything is uncollectable and uncontrollable.

I am not sure that registering new roots will slow down the GC. 
It might on the contrary speed it up a tiny bit. You need to measure 
to be sure. Of course you want to register your data segments as new roots 
(typically segments of several kilowords aligned at a page boundary, 
i.e. at 4Kbytes).
> 
> Actually the bytecode that I use is an array of opcodes mixed with
> integers and addresses. For example:
> 
> [0] --> PUSHIR
> [1] --> <memory address>
> [2] --> CALL, 1
> [3] --> JMP
> [4] --> <offset>
> [5] --> MAKE_CLOSURE
> [6] --> <memory address>
> [7] --> RETURN
> 
> Ok this is not a useful example, but may explain how function
> bytecode is stored. Another thing that I want to avoid is to invoke
> the generated function with an extra parameter to pass some local data
> array.

I am not sure you want to avoid that. Generally, a closure's function is invoked
with the closure's data as an (implicit) parameter.
You could put the 0xABCDEF01 or 0x01EFCDAB pointers in an
"object" reachable from the closure's data. Most functional language
(Lisp or Scheme or ML or Haskell-like) implementations do that.
C. Queinnec's book Lisp in Small Pieces is good reading.
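
As a rough sketch of that calling convention in plain C (the Closure type,
its field names and call_closure are hypothetical, made up for
illustration): the code receives its closure as a hidden first argument
and reads the "constant" pointers from it instead of having them embedded
in the code.

```c
#include <stddef.h>

typedef struct Object { int value; } Object;

/* A closure pairs a code pointer with the data that code needs. */
typedef struct Closure Closure;
struct Closure {
    Object *(*code)(Closure *self, Object *a, Object *b);
    Object *captured[2];   /* the pointers live here, not in the code */
};

/* Same shape as MyFunction, but the two addresses come from the closure. */
static Object *my_function(Closure *self, Object *a, Object *b)
{
    Object *data = self->captured[0];
    (void)a; (void)b; (void)data;   /* unused in this sketch */
    return self->captured[1];
}

/* Every call passes the closure itself as the implicit first parameter. */
static Object *call_closure(Closure *c, Object *a, Object *b)
{
    return c->code(c, a, b);
}
```

If the Closure itself is allocated with GC_malloc and is reachable, the
captured objects stay reachable through it.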

If the pointers in the bytecode are at word boundaries,
and if the bytecode is allocated with GC_malloc (not GC_malloc_atomic) or is a root,
then they will be found by Boehm's GC.

> 
> Then, if I make a new function with libjit and make a local pointer
> variable giving it a fixed address, and this pointer is the unique
> reference to an object allocated with Boehm GC, is there a way to protect
> this object from garbage collection?

It won't be collected by Boehm's GC in that case, since a pointer on the stack
(your local pointer) points to it.

> 
> In C it is similar to:
> 
> 0xABCDEF01 := the address of an object allocated with Boehm
> 0x01EFCDAB := another one.
> 
> 
> Object *MyFunction(Object *A, Object *B) {
>    Object *data = (Object *)0xABCDEF01;  /* address embedded as a constant */
> 
>    /* ... some code using or not 'data' */
> 
>    return (Object *)0x01EFCDAB;  // another pointer.
> }

The only issue I might see is if libjit embeds these constants in the machine
code it produces, and if 0xABCDEF01 or 0x01EFCDAB end up there not word-aligned
or byte-reversed.
Hence my advice to either keep the bytecode in GC-ed memory, or store these
pointers somewhere (root data segment, local, closure ...).
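
To see why alignment matters, here is a toy model in plain C. It is not
Boehm's actual algorithm, and both function names are made up. It only
looks at the word-aligned slots of a buffer, which is roughly how a
conservative scan treats memory by default, so an address stored at an odd
byte offset (as an immediate inside machine code may be) is missed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a conservative scan: inspect each word-aligned slot
   of buf and report whether one of them holds exactly `target`. */
static int scan_finds(const unsigned char *buf, size_t len, uintptr_t target)
{
    for (size_t off = 0; off + sizeof(uintptr_t) <= len;
         off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, buf + off, sizeof word);
        if (word == target)
            return 1;
    }
    return 0;
}

/* Store `target` at byte_offset inside a zeroed buffer of four words,
   then run the toy scan over that buffer. */
static int found_at_offset(size_t byte_offset, uintptr_t target)
{
    unsigned char code[4 * sizeof(uintptr_t)] = { 0 };
    assert(byte_offset + sizeof target <= sizeof code);
    memcpy(code + byte_offset, &target, sizeof target);
    return scan_finds(code, sizeof code, target);
}
```

With this model, found_at_offset(sizeof(uintptr_t), 0xABCDEF01) reports the
address as found, while found_at_offset(1, 0xABCDEF01) does not.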

> 
> My bytecode that simulates MyFunction protects the objects since the
> bytecode array is allocated via Boehm. How can I implement this with
> libjit?

If you keep the bytecode (in GC allocated & reachable memory), 
and if it is allocated with GC_malloc not GC_malloc_atomic, 
and if the addresses 0xABCDEF01 or 0x01EFCDAB appearing inside are word aligned,
they will be followed by the GC.

I just think that you should understand a bit better how Boehm's GC actually
works.
Did you read some documentation (or code comments) about it?

Cheers
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


