From: Tony Sidaway
Subject: [Chicken-users] object-evict, string ports, safe-foreign-wrapper, foreign-primitive, Cheney on the Victoria Line, etc (was: What happens to a (non-simple) Scheme object sent to a foreign function?)
Date: Sun, 4 Feb 2007 20:25:30 +0000

On 2/4/07, Thomas Christian Chust <address@hidden> wrote:
> Tony Sidaway wrote:
>
>> I'm sending a Scheme string to a foreign (C) library as a c-string.
>> I also send it the address of a Scheme procedure created as
>> define-external--this address is sent as a c-pointer.
>> [...]
>
> Hello,
>
> the address of the C function wrapping the define-external'ed Scheme
> procedure is unproblematic, because it will never change. The pointer to
> the C string data may become invalid, though, once the program returns
> from the library call. In order to make this safe, you would have to
> duplicate the string in the C heap (for example using strdup) or you
> would have to create a non-garbage-collected copy of the Scheme string
> (for example using object-copy) to pass that to the library routine. In
> either case you would have to release the string data later on when it
> is no longer needed.


Thanks. Would "object-evict" achieve the same end? I did try that, but
my callback function received a corrupted string, which suggests that
I was probably doing something else wrong.  I think at that stage I'd
forgotten to declare a procedure with foreign-safe-lambda, so the
callback might have been executing in a rather odd closure.

For management purposes, I suppose I could tie the non-garbage-collected
object (constructed using object-copy or whatever) to other related
resources by wrapping them all in a record (either SRFI-9 or Chicken's
own) and establishing an appropriate finalizer for the record using
"set-finalizer!".  The finalizer would free the relevant non-managed
resources.
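
Seen from the C side, the bundling idea above might look like the following minimal sketch. The struct and function names are hypothetical, and the "handle" field merely stands in for whatever other resources belong together. The point is only that the string is copied into the C heap once and that there is a single release function, which is what a finalizer would end up calling.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical bundle of the unmanaged resources belonging to one transfer. */
typedef struct {
    char *key;      /* heap copy of the string; no collector will move it */
    void *handle;   /* stand-in for other related resources */
} transfer_resources;

/* Copy `key` into the C heap so it stays valid after the caller's buffer
   has been moved or reclaimed. Returns 0 on success, -1 on failure. */
int transfer_resources_init(transfer_resources *r, const char *key, void *handle)
{
    size_t n = strlen(key) + 1;          /* include the terminating NUL */
    r->key = malloc(n);
    if (r->key == NULL)
        return -1;
    memcpy(r->key, key, n);
    r->handle = handle;
    return 0;
}

/* Single release point: the function a finalizer would call. */
void transfer_resources_release(transfer_resources *r)
{
    free(r->key);
    r->key = NULL;
    r->handle = NULL;   /* a real handle would be closed here */
}
```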

The string argument I mentioned in my earlier email was being used as
a key to a SRFI-69 hash table into which character or binary output
from the library callback was to be accumulated in successive
callbacks.  The library is libcurl and the callbacks are intended to
do something analogous to

1. Tell library the FILE* to which to write data received from the URL
as a result of a curl_easy_perform
2. (Optional:) provide a C function to take the place of the default
writer, which writes successive chunks of data to the file.

The libcurl API permits the handle passed in step 1 to be an arbitrary
C pointer, and it's up to the function provided in step 2 to interpret
it correctly.  Passing an unmanaged FILE* object (such as
(foreign-value "stdout" c-pointer)) in step 1 works fine with
libcurl's built-in writedata function but does not exploit the
versatility of the API and certainly wouldn't fulfil the reasonable
needs and expectations of the Chicken programmer, who would often want
to have the option to have all received data passed to a Scheme
variable via a string port or other suitable mechanism.
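
As an illustration of step 2, here is a minimal sketch of a replacement writer in C. The four-argument signature is the one libcurl uses for write callbacks; the "sink" struct and the name collect_chunk are assumptions made for this example. The pointer handed over in step 1 arrives as the last argument, and the function alone decides what it means (here, a growable in-memory buffer instead of a FILE*).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory sink; its address is what step 1 would pass. */
typedef struct {
    char  *data;   /* NUL-terminated accumulated bytes, or NULL when empty */
    size_t len;    /* number of bytes accumulated, excluding the NUL */
} sink;

/* Same signature as a libcurl write callback. Appends the chunk to the
   sink and returns the number of bytes consumed. */
size_t collect_chunk(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    sink  *s = userdata;
    size_t n = size * nmemb;
    char  *grown = realloc(s->data, s->len + n + 1);
    if (grown == NULL)
        return 0;                 /* a short count signals an error to libcurl */
    memcpy(grown + s->len, ptr, n);
    s->data = grown;
    s->len += n;
    s->data[s->len] = '\0';
    return n;
}
```

With libcurl itself, such a function would be registered with CURLOPT_WRITEFUNCTION and the sink's address with CURLOPT_WRITEDATA; the sketch deliberately leaves the library calls out so that it compiles on its own.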

Although the data may be built up in batches through several successive
invocations of the callback, this all takes place in the course of a
blocking call to curl_easy_perform(), so the operation is synchronous.

I've decided to adopt an alternative, more Scheme-like strategy that
also fits the general spirit of the C implementation.  Instead of file
pointers or hash keys, I could use ports.  This is semi-okay, but of
course ports can mutate due to file access so just passing a
non-managed copy to libcurl wouldn't be a good idea.  Using file
descriptors (port->fileno) would make assumptions about the port
implementation which break down when, for instance, string ports are
used.

My current thought is this:  In step 1, I pass the library a key (in
static memory, not heap) which I associate with the port (I place the
actual port in a SRFI-69 hash table under that key).  On receiving the
key, in step 2, my write procedure (which would ideally be written in
Chicken as a foreign-safe-wrapper procedure) would pick the port out
of the hash table using that key, and write the string using a method
compatible with Chicken's port mechanism.  At its simplest, this is
"(display str port)".  This is compatible with string port procedures
such as (with-output-to-string THUNK), and so Scheme programmers would
get what they expect using a mechanism they understand well.  The
default behavior would be to write the received character data (say,
from an HTTP GET) to (current-output-port).  The programmer may provide
an alternative port or may redirect (current-output-port) to a string
port or any other kind of port.
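
The key-and-lookup scheme above can be sketched in plain C. This is an analogy under stated assumptions, not the Chicken implementation: a fixed static table stands in for the SRFI-69 hash table, a FILE* stands in for the port, and all names (slot, registry, register_sink, write_via_key) are hypothetical. The library only ever sees the address of a static slot, which never moves.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_SINKS 16

/* Hypothetical registry entry; the table is static, so slot addresses
   stay valid for the life of the program. */
typedef struct {
    int   in_use;
    FILE *out;       /* stands in for the Scheme port */
} slot;

static slot registry[MAX_SINKS];

/* Step 1: reserve a slot and return its address as the opaque key. */
void *register_sink(FILE *out)
{
    for (size_t i = 0; i < MAX_SINKS; i++) {
        if (!registry[i].in_use) {
            registry[i].in_use = 1;
            registry[i].out = out;
            return &registry[i];
        }
    }
    return NULL;     /* table full */
}

/* Drop the association once the transfer is finished. */
void unregister_sink(void *key)
{
    slot *s = key;
    s->in_use = 0;
    s->out = NULL;
}

/* Step 2: look the sink up through the key and write the chunk to it.
   Returns the number of bytes written, or 0 for an unknown key. */
size_t write_via_key(char *ptr, size_t size, size_t nmemb, void *key)
{
    slot *s = key;
    if (s == NULL || !s->in_use)
        return 0;
    return fwrite(ptr, size, nmemb, s->out) * size;
}
```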

There are still potential problems with this: the information
received may be binary rather than characters, and the encoding
may not be compatible with Scheme's.  With transmitted binary there
may also be the usual endian problems.  I'll worry about those problems
when their turns come.

Another problem that it would be helpful to have advice on is how to
cast or coerce arbitrary data received in a Chicken
foreign-safe-wrapper.  Say I receive a foreign c-pointer to some data
and a couple of parameters, nmemb and size, that when multiplied
together tell me the number of bytes in the foreign object.  It would
be nice to have a Schemish way of converting that into a SRFI-4
u8vector of size (* nmemb size).  I've examined locations and
locatives, but those don't seem to help with that particular problem,
not least because there is no underlying Scheme data object, only a
random bin containing binary bits whose significance is completely
unknown to Scheme.

Would it be best to write something to do that in C as a
foreign-primitive?  Basically a C function, declared
foreign-primitive, that takes a c-pointer and an int, and allocates a
byte vector of the appropriate size.  Presumably this would eat into
the nursery until such time as a minor garbage collection takes place.
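
The plain C half of such a helper might look like the sketch below. It is only an assumption about the shape of the function: it copies into malloc'd memory rather than into a Chicken byte vector, since allocating a real Scheme object would need Chicken's own allocation interface, which is not shown here. The names byte_vector and byte_vector_from_raw are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owned byte buffer, the C analogue of a u8vector. */
typedef struct {
    uint8_t *bytes;
    size_t   len;
} byte_vector;

/* Copy size * nmemb bytes from `src` into a freshly allocated buffer.
   Returns 0 on success, -1 on overflow or allocation failure. */
int byte_vector_from_raw(byte_vector *out, const void *src,
                         size_t size, size_t nmemb)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return -1;                       /* size * nmemb would overflow */
    size_t n = size * nmemb;
    out->bytes = malloc(n ? n : 1);      /* avoid a zero-sized request */
    if (out->bytes == NULL)
        return -1;
    if (n != 0)
        memcpy(out->bytes, src, n);
    out->len = n;
    return 0;
}
```

The caller owns the result and releases it with free(out->bytes) when done.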

Perhaps if size is an appropriate number of bytes we might want a
u16vector, u32vector, u64vector or whatever containing nmemb elements
(presumably this would aid in resolving the endian problem).
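
A wider element type does not settle byte order by itself; the bytes still have to be assembled in the sender's order. As a small sketch, assuming the data consists of 32-bit values in big-endian (network) order, which nothing above actually states, the elements could be decoded independently of the host's endianness like this (decode_be32 is a hypothetical name):

```c
#include <stddef.h>
#include <stdint.h>

/* Decode nmemb big-endian 32-bit values from raw bytes into host-order
   integers. Works the same on little- and big-endian hosts because it
   never reinterprets the buffer as uint32_t directly. */
void decode_be32(const uint8_t *src, size_t nmemb, uint32_t *dst)
{
    for (size_t i = 0; i < nmemb; i++) {
        const uint8_t *p = src + 4 * i;
        dst[i] = ((uint32_t)p[0] << 24)
               | ((uint32_t)p[1] << 16)
               | ((uint32_t)p[2] << 8)
               |  (uint32_t)p[3];
    }
}
```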

I hope some of these ramblings make sense.  I am vaguely aware of
Chicken's CPS design and the use of a Cheney garbage collection
algorithm, and the principle of placing the C stack frame into the
active half of the heap space.  As a consequence I may think I
understand something when I'm some considerable distance from true
comprehension.  I expect to understand the design better, given enough
time.  Perhaps there will be an asymptotic convergence.



