emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 64-bit lossage


From: Ken Raeburn
Subject: Re: 64-bit lossage
Date: Thu, 01 Aug 2002 19:37:09 -0400
User-agent: Gnus/5.090006 (Oort Gnus v0.06) Emacs/21.1.50 (i686-pc-linux-gnu)

Paul Eggert <address@hidden> writes:
> I agree in general, but Emacs on x86 is quite popular, and the current
> 28-bit address limit is starting to bite.

Starting? :-)
Yeah, part of the reason I looked into this at all was someone at MIT
asking about using Emacs to edit a 300M file.

Unfortunately, x86 is register-starved, so I expect the performance
cost to be rather severe, and perhaps some gcc bugs to surface.  It
might not be so bad if we used 32 bits for the value, and 32 for type
and mark bits.  Then at least the math might be easier on the
compiler, and we wouldn't need any checks to make sure we didn't have
pointer values outside the 32-bit address space.  But we'd still be
pushing 64-bit values around.

>    A few years ago I started
> to work on supporting 64-bit Lisp_Object on 32-bit hosts and I got
> about half the way through.  It's not that hard, but it is a bit
> tedious (among other things, every function really needs to be
> prototyped).

That would certainly help.  But I think the work I've been doing with
using "union Lisp_Object" has shaken out a lot of the problems where
"Lisp_Object" might be confused with integral types, and a few other
people seem to be compiling that way too.

> Changing the subject slightly, can't we increase address space and
> improve performance on both 32- and 64-bit hosts, without widening
> Lisp_Object, by moving the 4 tag bits to the low-order end of the
> Lisp_Object, and ensuring that all non-Lisp_Int objects are aligned at
> a multiple of 16?  That way, we would get all of the 32-bit address
> space.  Lisp_Int addition and subtraction would still be fast, since
> the Lisp_Int code is zero.  (Multiplication and division would be
> slower, but that's rare.)  And we would speed up access to
> non-Lisp_Int objects, since the tag bits in many cases could be
> removed at zero cost during normal address arithmetic.

That would get us more of the address space to use for Lisp objects on
some platforms, but it doesn't get us bigger integers.  We still need
to be able to convert "point" and markers into lisp integers.

We could use just the low bit or two to indicate an integer, but that
also dictates what's in those low bits for all the other lisp types,
which means we probably need to steal another bit for non-integer type
tagging (five bits, or 32-byte alignment, which would greatly increase
the storage used for some smaller structures like Lisp_Cons and
Lisp_String), or move some of that type info into the pointed-to
objects, the way Lisp_Vector and Lisp_Misc each support several data
types.

Or some combination of the two, roughly like this:

   ... XXXX XX0M - integer
   ... XXXX X01M - mask off low two bits to get pointer to Lisp_Cons
   ... XXXX X11M - mask off low three bits to get pointer to struct
                   containing type tag
   M = mark bit

Then cons cells and other objects would have to be aligned on 8-byte
boundaries, and those "other" types would need an extra few bits at
the front as a type tag, like Lisp_Misc has already, and Lisp_Vector
sort of does in the high bits of the size field.

We might get another bit by moving the GC mark bit into the pointed-to
storage, but then cons cells might have to grow.  (You need to be able
to mark a cons cell containing two integers, somehow.)

If the mark bit is moved outside the object entirely -- for example,
use a block of mark bits that correspond to a collection of objects,
with the mark bits being located by doing arithmetic on the object
address -- then the storage still grows, but perhaps a bit more
efficiently.  Locating the mark bit is more expensive, but we get back
a bit in the Lisp_Object representation, and when we pull an object's
mark bit into the CPU cache, we get mark bits for nearby objects as
well; that may gain back a little performance during the mark phase,
and the sweep phase wouldn't have to touch the objects at all.
(Considering the high cost of cache misses relative to arithmetic in
terms of CPU cycles these days, it might even be a net benefit; I
haven't looked at the literature on this issue in a while.  Extending
this to multiple pages per "block" might reduce swapping in some cases
too, I'm not sure.)

There are other games that might be played by address manipulation.
For example: Allocate groups of lisp object structures on page
boundaries, with all the current primary object types each being
allocated on separate pages.  Mask off the low bits of the address,
and look at bookkeeping information at the start of the page, which
includes the type of the objects stored on that page.  Then the type
information doesn't need to be stored per object.


I'm not too inclined to tackle this myself right now.  I'm happy to
discuss it, and perhaps help out in some small ways if someone else
wants to work on it, but my focus is on the Guile work.  And if the
low level details of the Lisp system wind up being thrown out as part
of that work, any effort put in on them would be discarded.

Ken



reply via email to

[Prev in Thread] Current Thread [Next in Thread]