g-wrap-dev

On C SEXP representation [was: Initial Thoughts]


From: Andreas Rottmann
Subject: On C SEXP representation [was: Initial Thoughts]
Date: Wed, 31 Dec 2003 19:58:37 +0100
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3 (gnu/linux)

From Rob's mail:

,----
| * One thing that's somewhat awkward in g-wrap right now is the way
|   that you have to construct the C code that's eventually going to
|   end up in the relevant wrappers.  Right now, you just return a
|   tree of strings (and possibly a few "magic" symbols) and that
|   string tree is eventually flattened and dumped to the generated C
|   files.  In addition to being awkward to construct and read, the
|   string tree approach leaves you with an output representation
|   that's more or less opaque.
| 
|   One alternative I've considered (which fits in with some random
|   speculations I've had wrt guile itself) is whether or not it would
|   be helpful to introduce a "sexp representation for C" (CSE) to
|   g-wrap.  i.e. to define a sexp grammar that allows you to easily
|   represent all (or maybe just a relevant subset) of C, that's easy
|   to render to C at output time, 
[...]
| 
|   Along these lines, I've actually written a simple grammar that
|   should represent all of C (at least according to the ANSI C
|   grammar), along with an associated renderer, but it's just a toy
|   right now.
| 
Could you please share the code? I might come up with a parser based
on (parse lalr) from [0], which is based on the LALR parser
generator at [1].

[0] http://www.vzavenue.net/~rwtodd5128/index.html
[1] http://www.iro.umontreal.ca/~boucherd/Lalr/documentation/lalr.html

I'll try writing a replacement for h2def.py (used by guile-gobject)
using lalr, to get up to speed with it.
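
To make the CSE idea from Rob's mail a bit more concrete, here is a minimal sketch of a sexp-style tree for a small subset of C together with a renderer. It is written in Python with nested tuples standing in for sexps. The node tags ("call", "binop", "decl", "return", "block") and the helper names render_expr, render_stmt and box_int are invented for illustration; they are not Rob's grammar and not anything g-wrap provides.

```python
# Hypothetical CSE node shapes (an assumption, not g-wrap's actual grammar):
#   ("call", fn, arg, ...)      -> fn(arg, ...)
#   ("binop", op, lhs, rhs)     -> (lhs op rhs)
#   ("decl", type, name, init)  -> type name = init;
#   ("return", expr)            -> return expr;
#   ("block", stmt, ...)        -> { stmt ... }
# Plain strings are identifiers or literals and are emitted as-is.


def render_expr(node):
    """Render a CSE expression node to a C expression string."""
    if isinstance(node, str):
        return node
    tag = node[0]
    if tag == "call":
        args = ", ".join(render_expr(a) for a in node[2:])
        return f"{node[1]}({args})"
    if tag == "binop":
        _, op, lhs, rhs = node
        return f"({render_expr(lhs)} {op} {render_expr(rhs)})"
    raise ValueError(f"unknown expression tag: {tag!r}")


def render_stmt(node, indent=0):
    """Render a CSE statement node to C source text."""
    pad = "  " * indent
    tag = node[0]
    if tag == "decl":
        _, ctype, name, init = node
        return f"{pad}{ctype} {name} = {render_expr(init)};"
    if tag == "return":
        return f"{pad}return {render_expr(node[1])};"
    if tag == "block":
        body = "\n".join(render_stmt(s, indent + 1) for s in node[1:])
        return f"{pad}{{\n{body}\n{pad}}}"
    # Anything else is treated as an expression statement.
    return f"{pad}{render_expr(node)};"


# A wrapper body built as data instead of as a tree of strings.
# box_int is a made-up conversion function.
wrapper_body = (
    "block",
    ("decl", "int", "result", ("call", "add", "a", ("binop", "*", "b", "2"))),
    ("return", ("call", "box_int", "result")),
)
print(render_stmt(wrapper_body))
```

Because the wrapper body stays a tree until render time, it can still be inspected or rewritten before output, which is the property the flattened string tree lacks.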

| 
| * On another topic, that of automatic wrapper generation, one of the
|   tricky bits is how you get *reliable* API information.  GTK has
|   avoided this problem by (quite nicely) providing an easily
|   parsable spec.
|
Note that the specs are not really used; at least PyGTK2 and
guile-gobject use the above-mentioned h2def.py script to extract API
information from the headers. This is a rather simple task, since the
headers are written in a quite regular way (so IMHO they solved the
problem by using a coding convention).
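
As an illustration of why a coding convention makes this easy, here is a minimal Python sketch of regexp-based prototype extraction. It is not the code from h2def.py; the pattern, the function name extract_prototypes and the sample header text are assumptions made for the example. It only handles one-line prototypes without function-pointer arguments.

```python
import re

# Matches one-line prototypes of the regular form
#   return_type function_name (arg_type arg, ...);
PROTO_RE = re.compile(
    r"^\s*(?P<ret>[A-Za-z_][\w\s]*?[\s\*]+)"  # return type, ending in space or '*'
    r"(?P<name>[A-Za-z_]\w*)\s*"              # function name
    r"\((?P<args>[^()]*)\)\s*;"               # argument list without nested parens
)


def extract_prototypes(header_text):
    """Return (return_type, name, [args]) for each one-line prototype."""
    protos = []
    for line in header_text.splitlines():
        m = PROTO_RE.match(line)
        if m is None:
            continue
        args = [a.strip() for a in m.group("args").split(",") if a.strip()]
        ret = " ".join(m.group("ret").split())  # collapse alignment whitespace
        protos.append((ret, m.group("name"), args))
    return protos


# Sample text in the column-aligned style of GTK headers (made up here).
sample = """\
GtkWidget* gtk_button_new_with_label (const gchar *label);
void       gtk_button_clicked        (GtkButton *button);
typedef struct _GtkButton GtkButton;
"""

for proto in extract_prototypes(sample):
    print(proto)
```

Anything that departs from the convention (multi-line prototypes, macros that expand to declarations, callbacks in the argument list) slips through such a pattern, which is why this approach depends on the headers being regular.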

|  However, in cases where such information isn't readily available,
|  one thing people often consider is parsing the headers themselves.
|  I believe SWIG does this, 
|
Yes, SWIG can parse simple headers, but when I last looked at it,
you'd better write a separate .i (interface) file for
not-so-straightforward APIs.

|  but I've always been wary of that approach because unless the
|  parser you're using to parse the headers is *identical* to the one
|  you're going to eventually use for compilation, you can't be sure
|  they'll interpret the headers the same way (using the same search
|  paths, same __foo__ extensions, same defines, etc.).  After
|  thinking a bit I realized that you could alleviate much of the
|  uncertainty by just requiring the user to preprocess any header
|  before analysis, with the same compiler and the same options that
|  they intend to eventually use during compilation.  Of course you'll
|  still have to have some way to indicate *which* functions you're
|  interested in wrapping, since you'll be likely to pull in a whole
|  bunch of irrelevant prototypes during the preprocessing.
| 
Hmm, I don't know how much of a problem this is with "clean" headers
that don't mess around with preprocessing too much. E.g. the "parser"
in h2def.py does its job mainly by regexp-munching the .h files...
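
For the preprocess-first approach Rob describes, one possible way to cut down the irrelevant prototypes is to use the linemarkers that the preprocessor leaves in its output. The Python sketch below assumes GCC-style linemarkers of the form `# <line> "<file>" <flags>`, as emitted when preprocessing with the same compiler and options used for the real build. The function name lines_from_header and the sample text are made up for the example.

```python
import re

# GCC-style linemarker: '# <line> "<file>" <flags...>'
# (also accepts the '#line <line> "<file>"' spelling).
LINEMARKER_RE = re.compile(r'^#\s*(?:line\s+)?(\d+)\s+"([^"]*)"')


def lines_from_header(preprocessed_text, wanted_suffix):
    """Keep only the non-blank lines that originate from a file whose
    path ends with wanted_suffix, tracking the origin via linemarkers."""
    current_file = None
    kept = []
    for line in preprocessed_text.splitlines():
        m = LINEMARKER_RE.match(line)
        if m:
            current_file = m.group(2)
            continue
        if current_file is None or not line.strip():
            continue
        if current_file.endswith(wanted_suffix):
            kept.append(line)
    return kept


# Hand-written stand-in for preprocessor output; wrapme.h is hypothetical.
sample = '''\
# 1 "wrapme.h"
# 1 "/usr/include/stdio.h" 1 3 4
extern int printf (const char *fmt, ...);
# 2 "wrapme.h" 2
int frob_widget (int count);
'''

print(lines_from_header(sample, "wrapme.h"))
```

This only narrows the input by origin file; choosing which of the remaining functions to wrap still needs a separate selection step.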

|   While considering the above, someone suggested I might want to
|   look at CIL (http://manju.cs.berkeley.edu/cil/).  I checked it out
|   and talked with the author for a bit.  It sounds like it might
|   well allow arbitrarily sophisticated analysis, including the
|   fairly simple extraction of prototypes of interest and definitions
|   of arbitrarily complex types.
| 
|   To some extent CIL ties in with my speculations about CSE above.
|   If you have some way to translate from C to a syntax that's easier
|   to work with (like sexps), it may be a lot easier to do many kinds
|   of fancy analysis and manipulation (precise gc, redundant type
|   check elimination, etc.).
| 
I've also looked a bit into CIL, and it should indeed not be difficult
to come up with a piece of code that generates CSE. However, I don't
know OCaml, and I'd prefer not to rely on an external tool that
requires yet another language runtime...

| 
| * I have also spoken to one of the people working on OpenMCL (a
|   good, and now free, implementation of common lisp).  They use a
|   modified version of the ffigen project to generate their C FFI.
|   They can translate from C headers to a sexp C API rep that they
|   then can easily manipulate from lisp.  I believe the OpenMCL tool
|   differs from the older ffigen in large part by using gcc code for
|   the parsing instead of lcc.
| 
|     http://openmcl.clozure.com/Doc/interface-translation.html
|     http://www.ccs.neu.edu/home/lth/ffigen/
| 
|   Some notable points about my conversation with the openmcl
|   developer:
| 
|     - they would be very interested in working with us on breaking
|       out their parser as a standalone .h->sexp converter and
|       adjusting the sexp API syntax if necessary to be both common
|       lisp and scheme friendly.  He may also have some interest from
|       other common lisp groups.
| 
Yes, that would be a nice tool indeed.

|   If we did decide we were interested in working on this project,
|   I've wondered whether or not CIL (above) might be able to provide
|   a more comprehensive foundation for the C->sexp translation, but I
|   haven't had time to investigate any further.
`----

The vague worry I have about using CIL is that it might not be
expressive enough for conveniently writing the CSE that will be spit
out by g-wrap, e.g. consider this (from [2]):

   CIL removes all local scopes and moves all variables to function
   scope. It also separates a declaration with an initializer into a
   declaration plus an assignment. The unfortunate effect of this
   transformation is that local variables cannot have the const
   qualifier.

[2] http://manju.cs.berkeley.edu/cil/cil012.html
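
The effect of the transformation quoted above can be illustrated on a toy statement list. The Python sketch below is not CIL code and does not use CIL's data structures; the tuple shapes and the function name split_initializers are assumptions made for the example, and nested scopes are ignored.

```python
def split_initializers(stmts):
    """Hoist declarations to the top of the function and turn their
    initializers into plain assignments, loosely mimicking the
    transformation described in the CIL documentation."""
    decls, body = [], []
    for stmt in stmts:
        if stmt[0] == "decl":
            _, quals, ctype, name, init = stmt
            # A hoisted 'const' variable could not be assigned to
            # afterwards, so the qualifier has to be dropped.
            kept_quals = tuple(q for q in quals if q != "const")
            decls.append(("decl", kept_quals, ctype, name, None))
            if init is not None:
                body.append(("assign", name, init))
        else:
            body.append(stmt)
    return decls + body


# Roughly: const int n = strlen(s); return n;
before = [
    ("decl", ("const",), "int", "n", "strlen(s)"),
    ("return", "n"),
]
after = split_initializers(before)
print(after)
```

In the result the declaration of n no longer carries const, so a generator that wants to emit `const int n = ...;` cannot express that through a representation normalized this way.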

Cheers, Andy
-- 
Andreas Rottmann         | address@hidden      | address@hidden | address@hidden
http://yi.org/rotty      | GnuPG Key: http://yi.org/rotty/gpg.asc
Fingerprint              | DFB4 4EB4 78A4 5EEE 6219  F228 F92F CFC5 01FD 5B62

Make free software, not war!