chicken-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Chicken-users] The CHICKEN Gazette, issue 14


From: Felix
Subject: [Chicken-users] The CHICKEN Gazette, issue 14
Date: Tue, 30 Nov 2010 07:19:16 -0500 (EST)

     _/_/_/  _/        _/            _/
  _/        _/_/_/          _/_/_/  _/  _/      _/_/    _/_/_/
 _/        _/    _/  _/  _/        _/_/      _/_/_/_/  _/    _/
_/        _/    _/  _/  _/        _/  _/    _/        _/    _/
 _/_/_/  _/    _/  _/    _/_/_/  _/    _/    _/_/_/  _/    _/

--[ Issue 14 ]------------------------------------- G A Z E T T E
                               brought to you by the Chicken Team


== 0. Introduction

Welcome to issue 14 of the Chicken Gazette, slapped together at the
last minute by Felix.

== 1. The Hatching Farm

  * Ivan Raikov is still hacking on this 9ML-toolkit, that reminds      
    me of wanting to ask him what this is all about... - it certainly   
    sounds like something that should be interesting for programming    
    language buffs; also some work has been done on signal-diagram;     
    Ivan fixed and tagged csv to reflect changes in the regexp API.     
  * Alan Post worked furiosly on genturfa'i. This turned up some        
    problems with the size of code generated by the PEG compiler, which 
    where discussed on `chicken-users` (see below).                     
  * Felix Winkelmann (that's me) did some more work on the sequences    
    extension, adding SRFI-42 comprehensions kindly contributed by      
    Thomas Chust; this egg is still in a very experimental state;       
    Felix (me) also started frobbing the fps egg in his quest           
    (triggered by Ivan) to write graphical backends for the FPS         
    egg, which would be really nice to have; system has been tagged     
    0.5 with a dependency having been fixed; as if he hadn't enough     
    things to do, he also ported the CHICKEN version of the Stalin      
    (http://cobweb.ecn.purdue.edu/~qobi/software.html) Scheme->C        
    compiler to CHICKEN 4: find it at this place.                       
  * Peter Bex tagged version 3.6.2 of the postgresql extension, fixing  
    a bug and doing possibly awesome things - if I just had a clue      
    about databases...; Peter fixed bugs and tagged multidoc as well    
    (he's everywhere).                                                  
  * Peter Lane added the quaternion egg. Sounds like it has to do with  
    maths, so I wouldn't know much about it. He also tagged version 0.3 
    of statistics.                                                      

== 2. Core development

  * Disabled the "scrutinizer" warning for branches in conditionals     
    (that occur in tail-position and which are undefined) because it    
    was triggered in many cases where this makes perfect sense - this   
    needs some more thought and will possibly be made optional; this    
    feature was requested by Joerg Wittenberger                         
  * Some minor tweaks and optimizations                                 

Some work has been done on the "fat-symbols" branch which adds a
additional slot to symbol objects containing a precomputed hash value.
This is required for something called "lazy gensyms", a clever trick
(by R. Kent Dybvig, not by me, of course) to create names of "gensym"
(uninterned generated symbols used heavily in syntax-expansion and the
compiler) only on demand, that is, when they are printed or the name
is explicitly requested. Everything seems to work but the performance
improvement is disappointing - it seems to be that hashing symbol-names
has neglible cost and that the additional memory requirement for the
extra slot possibly negates any improvements gained by the reduced
symbol-name generation. Needs more benchmarking.

The change-request for `umask` (umask support
(http://bugs.call-cc.org/ticket/424)) has been discussed a little and
will (hopefully) be voted upon soon. We waited long enough for this
ground-breaking feature!

== 3. Chicken Talk

Alan Post is putting final touches on
his genturfahi packrat parser and asked
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00168.html)
about the most efficient way to store character offsets
into an utf8 string. Various people responded to this,
posting ideas or even substantial code examples (like this
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00169.html)
by the one and only Joerg Wittenberger). Felix also
took part in this discussion but quickly became confused
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00193.html),
forgetting the original problem - well, that's how we know him, eh, me.
Boy, this is confusing... Anyway, this is still an open issue, since
working with arbitrary sub-sections of strings without requiring a copy
of the subsequence is a reappearing problem.

Alan also asked
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00185.html)
whether an extension can install both a dynamically loadable library
and an executable with the same name. Well it turns out, you can!

Unfortunately the large bodies of code generated by the
parser results in excessive compile-times, as reported by Alan
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00202.html).
This is really critical, and he works heavily on changing the generated
code patterns to reduce the amount of code. Wether this is just
inherent in the code-generation style or whether chicken just does a
bad job at optimizing this still has to be resolved. As Alan kindly
pointed out, the fellas on `#chicken` did their best in suggesting
improvements.

David Dreisigmeyer reported problems using Paredit
(http://www.emacswiki.org/emacs/ParEdit) with the `cluck` or `quack`
emacs extensions - it's unclear what can be done here. Perhaps a reader
can suggest something? If yes, drop into `#chicken` or write on the
mailing list.

Hugo Arregui posted code for defining "memoizing"
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00216.html)
procedures and many people suggested improvements and gave comments.

And last but not least Daishi Kato asked
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00220.html)
for help with a shortcoming in the csv egg, which Ivan quickly fixed
(http://lists.nongnu.org/archive/html/chicken-users/2010-11/msg00223.html).

== 4. Omelette Recipes

This week I'm going talk about the fast-generic extension, a
fascinating little tool for generating efficient dispatching code
for generic functions. The egg is a port of Thant Tessman's `generic`
code, written originally for `Chez Scheme`. It defines two macros for
defining types and declaring generic functions over these types. This
is best described by an example:

    (define-type <list> list?)
    (define-type <vector> vector?)
    
    (define-generic (element-ref (<list> lst) i) (list-ref lst i))
    (define-generic (element-ref (<vector> vec) i) (vector-ref vec i))
    
    (element-ref '(a b c) 1)         ==> b
    (element-ref '#(99 100 101) 2)   ==> 101

`define-type` defines a named type that is characterized by a predicate
and `define-generic` creates a function definition that can be
specialized for certain argument types.

Types can have subtype-relationship by adding a third parameter to
`define-type`:

    (define-type <vector> vector?)
    
    (define (odd-sized-vector? x)
      (and (vector? x) (odd? (vector-length x))))
    
    (define-type <odd-sized-vector> odd-sized-vector? <vector>)
    
    (define-generic (pairwise (<vector> v))
      (chop (vector->list v) 2))
    
    (define-generic (pairwise (<odd-sized-vector> v))
      (error "I'm sorry, Dave, but I can't do that."))

The interesting part is when we look at the implementation of this.
Expanding the generic function definition above results in

     (##core#begin
       ;; specialized code
       (define (pairwise<<odd-sized-vector>> v)
         (error "I'm sorry, Dave, but I can't do that."))
       ;; dispatching procedure
       (define (pairwise v)
         (let* ((g155 '##fast-generic#unset) (g156 g155))
    (or (and (odd-sized-vector? v)
         (begin (set! g156 (pairwise<<odd-sized-vector>> v)) #t))
        (and (#%vector? v) (begin (set! g156 (pairwise<<vector>> v)) #t)))
    (if (#%eq? g155 g156)
      (fast-generic#generic-error
        'pairwise
        (#%list v)
        '((pairwise<<vector>> <vector>)
          (pairwise<<odd-sized-vector>> <odd-sized-vector>)))
      g156)))
       (define-compiler-syntax
         pairwise
         (lambda (x2 r2 c2)
    (if (#%eq? 1 (#%length (#%cdr x2)))
      (let* ((arg-list (#%cdr x2))
         (result (#%gensym))
         (unset (#%gensym))
         (args (#%map (lambda _ (#%gensym)) arg-list))
         (method-body
           (fast-generic-compile-time#build-or-clause
             (#%get 'pairwise 'generic-trees)
             args
             args
             result
             r2)))
        (#%list
          (r2 'let*)
          (#%cons
        (#%list
          unset
          (#%list
            (r2 'cons)
            (#%list (r2 'quote) 'unset)
            (#%list (r2 'quote) '())))
        (#%cons (#%list result unset) (#%map #%list args arg-list)))
          method-body
          (#%list
        (r2 'if)
        (#%list (r2 'eq?) result unset)
        (#%list
          (r2 'generic-error)
          (#%list (r2 'quote) 'pairwise)
          (#%cons (r2 'list) args)
          (#%list
            (r2 'quote)
            '((pairwise<<vector>> <vector>)
              (pairwise<<odd-sized-vector>> <odd-sized-vector>))))
        result)))
      x2))))

Disregarding all the internal decorations, we see that this defines
two procedures (a specialized version for this particular argument type
and a dispatching procedure that performs the type checks and calls
the suitable specialized procedure) and an ugly "compiler macro", which
will replace direct calls to the generic with the dispatching code
in-line. Interpreted code will ignore this and call the dispatching
procedure, but compiled code will perform the dispatching at the call
site, which gives the optimizer a lot of opportunities to reduce the
checks or even eliminate them completely for literal arguments.

Subsequent definitions of generic procedures will overwrite the
dispatching procedure and redefine the compiler-syntax and thus create
more differentiated dispatching code with each new generic.

This makes generic functions and the required dispatching very fast.
Another nice property is that inheritance is defined "conceptually",
not necessarily "structurally". If you need the latter, something like
coops is probably more suitable.

There are some caveats, though (as usual): the type-predicates have to
be chosen with care and should be disjoint or funny things may happen.
Also, excessive use of generic procedures can result in code-bloat.
Only direct calls will be optimized and only if they appear textually
after the definition of the generic. Non-direct calls will still refer
to the dispatcher, so they will work. Since the types and generics
are registered at compile-time, generic-procedures can be exported but
dispatching code can not be created outside of the current compilation
unit.

Still, for internal code that needs to dispatch this is very handy and
has been successfully used in the sequences extension to define generic
operations on various sequence-like data types.

For more information about this, check out Thant Tessman's
paper on the original implementation, which can be found here
(http://chicken.kitten-technologies.co.uk/svn/release/4/fast-generic/doc/p45-tessman.pdf).

== 5. About the Chicken Gazette

The Gazette is produced weekly by a volunteer from the
Chicken community. The latest issue can be found at
http://gazette.call-cc.org or you can follow it in your
feed reader at http://gazette.call-cc.org/feed.atom. If
you'd like to write an issue, check out the instructions
(http://bugs.call-cc.org/browser/gazette/README.txt) and come and find
us in #chicken on Freenode!

[ --- End of this issue --- ]



reply via email to

[Prev in Thread] Current Thread [Next in Thread]