redoing SCM representation in 2.2

guile-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

redoing SCM representation in 2.2

From:	Andy Wingo
Subject:	redoing SCM representation in 2.2
Date:	Thu, 12 May 2011 12:17:56 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Hello list,

I'm looking at new SCM representation and tagging possibilities in 2.2.
Read the whole mail please, as it's a little complicated.

The current system has a couple of problems:

1) You can only tell a pair by dereferencing the pointer and checking
the cell's low bits, and then we need complicated tricks to set the
pair's bits on the value that is in the car.

(This strategy was necessary with the old GC, but is not needed
with the BDW GC.)

2) You can only tell a procedure by dereferencing the pointer and
checking the cell's word.

3) Doubles are allocated on the heap. This is, as my grandmother
would say, turrible.

I would like to revisit the SCM representation and tagging scheme in
2.2. In particular, I would like to look at NaN-boxing. I explain the
plan a bit below, but if you like to get your information depth-first,
check out:

http://blog.mozilla.com/rob-sayre/2010/08/02/mozillas-new-javascript-value-representation/

The concrete implementation strategy that I would like to use is
JSValue, from WebKit's JavaScriptCore (JSC):

http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/JSValue.h

http://trac.webkit.org/browser/trunk/Source/JavaScriptCore/runtime/JSValueInlineMethods.h

(Almost makes you want C++, doesn't it? :)

* * *

Recall, a double is laid out like this:

<- msb lsb ->

sign: 1 bit
| exponent: 11 bits
| | mantissa: 52 bits
------------------------------------------------------------------
0 00000000000 0000000000000000000000000000000000000000000000000000

This particular value represents 0.0. The special values +inf.0 and
-inf.0 are, respectively:

0 11111111111 0000000000000000000000000000000000000000000000000000
1 11111111111 0000000000000000000000000000000000000000000000000000

Any other bit pattern with all bits set in the exponent is a NaN. But
in practice, hardware and software will only generate one particular NaN
value:

> (number->string (bytevector-u64-native-ref #f64(+nan.0) 0) 2)
$8 = "111111111111000000000000000000000000000000000000000000000000000"

Which is:

0 11111111111 100000000000000000000000000000000000000000000000000

So that in practice, we have 51 bits into which we can stuff data, if we
set the MSB of a NaN. If we assume a 48-bit payload, then:

1 11111111111 1XXX_______________________________________________

So 8 tags there.

We also have any other combination with the NaN prefix that sets any
bit. If we assume a 48-bit payload, that gives us

0 11111111111 XXXX_______________________________________________

with the values 1000 and 0000 reserved, so 14 usable tags there.

* * *

Now, if you're like me, you probably have two big questions that end in
exclamation marks:

1) 64 bits? That's wasting too much memory on 32-bit systems!

2) 48 bits? That's not enough for a pointer on 64-bit systems!

With regards to (1), I can only agree. However, as it gives us more
tags, we can represent more things as immediates (besides doubles,
foreign pointers would be one example), and we can shave off the initial
word from some other structures, so perhaps that's not all bad.

Additionally, we can limit the payload to 32-bits on 32-bit systems, so
that a type check is a simple 32-bit load, and payload extraction is
also a 32-bit load, so that's probably faster than the current thing.
(This is the JSVALUE32_64 case in JSValue, linked below. It has been
found that comparing 8-bit tags actually takes more time than comparing
32-bit tags; see https://bugzilla.mozilla.org/show_bug.cgi?id=549143 for
the gory details).

Regarding (2), may I commend the following diagram to your perusal:

http://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details

Basically, current x86-64 chips restrict addresses to a 48-bit space.
Other architectures are not so restrictive:

http://developers.sun.com/solaris/articles/solaris_memory.html

Note that on x86-64, Solaris will give user addresses in the
0xffff... range, unlike Linux or Windows, which only allocate the bottom
47 bits to userspace. On Sparc64, there is no architecture-imposed VMA
hole as there is in x86-64.

Basically I think it's OK to restrict the Scheme heap to be within a
48-bit space, at least for the next decade or so. But given that the
total address space is more than 48 bits on many architectures,
arbitrary immediate foreign pointers may not be possible on
e.g. Sparc64.