emacs-devel

From: Paul Eggert
Subject: Re: [Emacs-diffs] /srv/bzr/emacs/trunk r107377: * src/lisp.h: Improve comment about USE_LSB_TAG.
Date: Wed, 22 Feb 2012 17:20:21 -0800
User-agent: Mozilla/5.0 (X11; Linux i686; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 02/22/2012 12:25 PM, Stefan Monnier wrote:
>> +/* On hosts where VALBITS is greater than the pointer width in bits,
>> +   USE_LSB_TAG is:
>> +    a. unnecessary, because the top bits of an EMACS_INT are unused,
>> +    b. slower, because it typically requires extra masking, and
> 
> Is this just a gut-feeling, or is there some actual measurement behind
> this assertion?

Originally the former, but (now that you asked) the latter.

On my host the expression (benchmark 1000000) ran about 8% faster
without USE_LSB_TAG.  This benchmark merely tests the speed of 'aset'.
It is defined by the source code at the end of this message (not
byte-compiled), and is simply a benchmark I happened to be using for
something else.  I benchmarked Emacs trunk bzr 107379 configured
--with-wide-int, built with gcc -m32 (GCC 4.6.2), on a Fedora 15 x86-64
kernel (2.6.41.1-1.fc15.x86_64 SMP).

The executable size is measurably larger, too: src/temacs's text size
is 0.94% larger when USE_LSB_TAG is defined.

I'm pretty sure I'll get similar results with other benchmarks.
I don't see how USE_LSB_TAG could outperform !USE_LSB_TAG on my platform.


> What kind of extra masking are you referring to?  The XFASTINT?
> Note that the LSB masking can be cheaper than the MSB masking

No, it's XPNTR that's faster, because its masking comes for free --
zero runtime overhead on my platform.
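
To illustrate the point, here is a minimal sketch of the two tagging
schemes.  It assumes a 64-bit integer holding a 32-bit pointer, which
roughly matches a --with-wide-int build under gcc -m32.  The names
wide_int, narrow_ptr, msb_xpntr and lsb_xpntr are hypothetical stand-ins,
not the real definitions in src/lisp.h.  Under these assumptions,
extracting the pointer from an MSB-tagged value is presumably just a
truncation to pointer width, while the LSB-tagged value needs an explicit
mask.

```c
#include <stdint.h>

/* Hypothetical simplified model; NOT the definitions from src/lisp.h.  */
typedef uint64_t wide_int;    /* stands in for a 64-bit EMACS_INT */
typedef uint32_t narrow_ptr;  /* stands in for a 32-bit pointer */

enum { TAG_BITS = 3, VAL_BITS = 64 - TAG_BITS };

/* MSB tagging: the tag lives in the top TAG_BITS bits, far above the
   32 bits that the pointer occupies.  */
static wide_int
msb_make (narrow_ptr p, unsigned tag)
{
  return ((wide_int) tag << VAL_BITS) | p;
}

/* Extracting the pointer only narrows the value; the tag bits are
   discarded by the conversion itself, so no mask is written here.  */
static narrow_ptr
msb_xpntr (wide_int obj)
{
  return (narrow_ptr) obj;
}

/* LSB tagging: the tag lives in the low TAG_BITS bits of a pointer
   that is assumed to be aligned to 1 << TAG_BITS bytes.  */
static wide_int
lsb_make (narrow_ptr p, unsigned tag)
{
  return (wide_int) p | tag;
}

/* Extracting the pointer requires clearing the tag bits explicitly.  */
static narrow_ptr
lsb_xpntr (wide_int obj)
{
  return (narrow_ptr) obj & ~(narrow_ptr) ((1u << TAG_BITS) - 1);
}
```

Whether the narrowing conversion really costs nothing depends on the
compiler and target; the sketch only shows where the extra masking
operation appears in the source.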


> So "VALBITS is greater than the pointer width in bits" is not
> the exactly right condition (e.g. if we have 48bit pointers and 61
> VALBITS then the problem should not appear).

Most likely not, true.  The current code is conservative.  I don't
know of any real platform where the conservatism matters, though.


> Maybe a better fix is to add code to the stack marking loop, conditional
> on WIDE_EMACS_INT and USE_LSB_TAGS, which passes pointer-sized
> words to mark_maybe_object after expanding them to EMACS_INT size.

That patch would be more intrusive.  Plus, the following (further)
patch is simpler and faster.  This patch is purely a performance
improvement, so I didn't install it (I was planning to do so after
24.1 comes out, but if you like I can do it now).

=== modified file 'src/alloc.c'
--- src/alloc.c 2012-01-19 07:21:25 +0000
+++ src/alloc.c 2012-02-23 01:10:15 +0000
@@ -4268,10 +4268,12 @@
       end = tem;
     }
 
+#if defined USE_LSB_TAG || UINTPTR_MAX >> VALBITS != 0
   /* Mark Lisp_Objects.  */
   for (p = start; (void *) p < end; p++)
     for (i = 0; i < sizeof *p; i += GC_LISP_OBJECT_ALIGNMENT)
       mark_maybe_object (*(Lisp_Object *) ((char *) p + i));
+#endif
 
   /* Mark Lisp data pointed to.  This is necessary because, in some
      situations, the C compiler optimizes Lisp objects away, so that
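
The guard in the patch above can be read as a predicate.  The sketch
below restates it as an ordinary C function so that it can be tried with
different parameters.  The function name object_scan_needed and its
arguments are hypothetical, introduced only for illustration; the real
code uses the preprocessor test shown in the patch.

```c
#include <stdint.h>

/* Hypothetical restatement of the patch's condition
   "defined USE_LSB_TAG || UINTPTR_MAX >> VALBITS != 0".
   Returns nonzero when the Lisp_Object marking loop is kept.
   VALBITS must be less than 64 for the shift to be well defined.  */
static int
object_scan_needed (int use_lsb_tag, uint64_t uintptr_max, int valbits)
{
  return use_lsb_tag || (uintptr_max >> valbits) != 0;
}
```

For example, with 32-bit pointers, 61 value bits and no LSB tags, the
predicate is false and the loop is compiled out; with LSB tags, or when
the pointer is wide enough to reach into the tag bits, the loop stays.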



Here's the benchmark code I mentioned earlier.

(defun benchmark-with-aset (n)
  (let ((start (float-time (get-internal-run-time)))
        (v (make-vector 1 0))
        (i 0))
    (while (< i n)
      (aset v 0 1)
      (setq i (1+ i)))
    (- (float-time (get-internal-run-time)) start)))

(defun benchmark-without-aset (n)
  (let ((start (float-time (get-internal-run-time)))
        (v (make-vector 1 0))
        (i 0))
    (while (< i n)
      (setq i (1+ i)))
    (- (float-time (get-internal-run-time)) start)))

(defun benchmark (n)
  (- (benchmark-with-aset n)
     (benchmark-without-aset n)))



