chicken-hackers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Chicken-hackers] Benchmark for "values" - interesting results


From: Jörg F . Wittenberger
Subject: [Chicken-hackers] Benchmark for "values" - interesting results
Date: 20 May 2013 22:40:49 +0200

Hi,

so far I lived in the believe that returning multiple values would be rather
slow under Chicken.  This goes back to a remark from Felix, which is several
years old.

(However I personally love using multiple values mostly for clarity of the
code; but also because I know that some Scheme implementations have
zero overhead, even less that explicitly passing a receiver continuation.)

Turns out that Felix is correct even years later.

Maybe it's worth to consider to change the way arguments and return values
are passed in Chicken.  I know this would be the usual hell to implement.
But: I'm kinda familiar with RScheme's internals.  It does not waste
a fresh location for each parameter, thereby lowering the load in
garbage collection.  It just has a (list of) vector-like object holding
the parameters and return values.  Passing parameters (including the
[hidden] current continuation) is a matter of storing them in the array
and adjusting the current position and current top.
Returns are handled likewise.

Doing so in Chicken should be possible as well.  (Though the FFI code would
have to copy them onto the return stack to keep the interface as is.)
This should save a lot of minor gc's since most procedures return no
more values than they consume.  And call/cc is rare enough.


To the benchmark I did: I took "partition" from SRFI-1 to be a nice
example and ran the following tests.  The resulting times are in
the "profile.ods" file attached.  Code attached as "partition.scm".

A) In csi
B) compiled with -O3
C) compiled with -O5

1) Labeled "native/values" in the spreadsheet.
  The code as copied from srfi-1.scm.
2) "adhoc 1": using an ad-hoc alternative of values/call-with-values
  in portable Scheme
3) "adhoc 2": a second ad-hoc alternative, minor modifications.
4) "Novals/1": Converted into CPS avoiding "values" and "receive"
   a.k.a. "call-with-values".
5) "Novals/2": A slightly re-arranged version from (4) saving a "let".
6) "Iterative": a procedural version; this one drops the "share common
  tail" property.  (IMHO not required by SRFI, just done by the code.)

To summarize the results:

* Version (1) is the looser here.
* (2) and (3) are slightly (around 10%) faster.
 Under -O5 the difference disappear mostly.
- That is, (2) and (3) run slightly slower under -O5 while (1) gains a little.
 - Notice that both ad-hoc implementations are *only two lines of code*,
   one for "values" and one for "call-with-values".  Since they are shorter
   code wise and still a little faster than (1), I wonder why we need
   the native C code version at all.
* (4) is about twice as fast as (1).
 - Felix is correct.
* (5) is about 2% faster than (4), but not always.
* (6) beats them all hands down.  Almost 20x as fast as (1) in csi.
 Almost 9x with -O3 and slightly more than 9x with -O5.


As an immediate consequence I'd suggest to just drop this into srfi-1.scm
in place of the "partition" code:


;; In constract to the SRFI-1 implementation, this one does not share
;; a common tail.
(define (partition pred lst)
 (let ((t (cons #f '()))
        (f (cons #f '())))
   (let ((tl t) (fl f))
     (do ((lst lst (cdr lst)))
          ((null? lst) (values (cdr t) (cdr f)))
        (let ((elt (car lst)))
          (if (pred elt)
              (let ((p (cons elt (cdr tl))))
                (set-cdr! tl p)
                (set! tl p))
              (let ((p (cons elt (cdr fl))))
                (set-cdr! fl p)
                (set! fl p))))))))


Best regards

/Jörg




..............











Attachment: partition.scm
Description: partition.scm

Attachment: profile.ods
Description: profile.ods


reply via email to

[Prev in Thread] Current Thread [Next in Thread]