Re: [Chicken-hackers] Made a start with CHICKEN 5 proposal


From: Felix Winkelmann
Subject: Re: [Chicken-hackers] Made a start with CHICKEN 5 proposal
Date: Tue, 09 Sep 2014 12:05:09 +0200 (CEST)

>> * Designing a decent POSIX API is a hard task. I have not seen any
>>   reasonably good API wrapper for that yet - they are either too
>>   lowlevel (Basis, Ocaml, etc.), or too highlevel.
> 
> For now a modest refactoring would be enough.
> 
> [begin of short brain dump about the POSIX situation]
> 
> Putting things like, for example, "directory" in some other unit would
> make more sense to me, because there's nothing inherently POSIXy in
> reading the contents of a directory. (though the _implementation_
> happens to rely on the C POSIX API, of course), and I think it belongs
> with make-pathname and friends (ie, a "paths" or "files" module).
> 
> Ideally, there wouldn't be much left of the "posix" unit except some
> deeply POSIXy things like fork, signal, fcntl, environment vars etc.
> Probably this means the really high-level things move elsewhere.
> In time, we might even move the POSIX unit out of core into an egg
> and keep only truly "portable" (or essential) things in core.  I'm
> not sure what will happen to POSIX in the future, but I think its
> hegemony will end sooner rather than later.  The landscape is shifting
> so quickly with these mobile devices (think Windows Phone, Firefox OS
> but also the crippled POSIX support on iOS and Android), OS research
> is slowly picking up again and the Linux crowd seems to be taking an
> increasingly aggressive stance against "backwards compatibility" (think
> Wayland, systemd etc).

Quite true.

>> * Changing the string representation is much harder than you think
>>   (quoting John: "If Chibi can do it, so can we" completely ignores
>>   the fact that writing a string-representation implementation from
>>   scratch is something vastly different than modifying an existing
>>   one, one that is much older and much more widely used from
>>   foreign/native code.)
> 
> Agreed.  Recall that my suggestion was simply to "bless" UTF-8 as the
> canonical internal representation (which is the case, de facto, anyway)
> and *maybe* adding some detection code to reject invalid sequences rather
> than just continuing with bogus data.  Possibly making the default
> string ops the ones from the UTF-8 egg.  Anything beyond that is
> overkill and I would definitely not support changing the encoding in
> this effort.
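
The "detection code to reject invalid sequences" mentioned above could
look roughly like the following C sketch. It is illustrative only:
`utf8_valid` is a hypothetical name, not a function from CHICKEN or the
UTF-8 egg, and it assumes that strict well-formedness is wanted (no
overlong forms, no surrogates, nothing above U+10FFFF).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a CHICKEN API.
   Returns 1 if buf[0..len) is well-formed UTF-8, else 0.
   Rejects overlong forms, surrogates, and values above U+10FFFF. */
static int utf8_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t n;          /* number of continuation bytes expected */
        uint32_t cp, min;  /* decoded value, smallest legal value   */

        if (b < 0x80) { i++; continue; }              /* ASCII */
        else if ((b & 0xE0) == 0xC0) { n = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { n = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { n = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;     /* stray continuation or invalid lead byte */

        if (len - i <= n) return 0;                   /* truncated */
        for (size_t k = 1; k <= n; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) return 0;         /* not 10xxxxxx */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF) return 0;      /* overlong / too big */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogate */
        i += n + 1;
    }
    return 1;
}
```

A check like this is a single linear pass over the bytes, so it could be
run wherever external data enters the system rather than on every string
operation.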

I basically agree, but please note that UTF-8-aware string mutation
would have to involve "become!", which is very inefficient.
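
The underlying problem can be shown with a small C sketch. The helper
names `utf8_len` and `utf8_replace_delta` are hypothetical and not part
of any CHICKEN API; the sketch only computes how the byte length of a
UTF-8 string changes when one character is replaced by another.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CHICKEN API): number of bytes needed to
   encode code point `cp` in UTF-8, or 0 if `cp` is not encodable
   (a surrogate or a value above U+10FFFF). */
static size_t utf8_len(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Change in byte length caused by replacing `old_cp` with `new_cp`.
   A non-zero result means the string's storage must grow or shrink. */
static long utf8_replace_delta(uint32_t old_cp, uint32_t new_cp)
{
    return (long)utf8_len(new_cp) - (long)utf8_len(old_cp);
}
```

Replacing "a" (1 byte) with U+20AC (3 bytes) gives a delta of 2, so the
new character does not fit into the old one's slot. That is presumably
where "become!" comes in: the string would have to be replaced by a
newly allocated one of a different size.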

>> * Numeric tower support: this is also hard, and will have a
>>   considerable performance impact, needs changes in the compiler, in
>>   all the icky C glue code and particularly in foreign code - which
>>   means things will break all over the place in user code.
> 
> There is strong support from the community to do this, and I'm willing
> to put in the required effort.  I feel very strongly about adding at
> least bignum support to core.  I don't care as much about ratnums and
> I don't care at all about compnums, but it may be simpler to add them;
> the code to support them too is relatively straightforward.

Again, you are basically right, but currently we can distinguish
between numeric types by testing a single bit. Any additional numeric
type will have a performance cost. We also will need a C API to access
bignums, and it's not clear to me how to handle bignums being passed
to foreign functions (simply throw an error?) Many ugly issues are
hiding in the details.
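
A rough C sketch of the cost argument, assuming a layout in which
fixnums are immediate values marked by the lowest bit and all other
numbers are heap blocks with a type tag. All names and tag values here
are hypothetical and are not CHICKEN's actual definitions.

```c
#include <stdint.h>

/* Illustrative tagging scheme, NOT CHICKEN's actual definitions:
   immediate fixnums carry a 1 in the lowest bit; everything else is a
   pointer to an aligned heap block whose first word is a type tag. */
typedef uintptr_t value;

enum block_tag { TAG_FLONUM = 1, TAG_BIGNUM = 2 };  /* hypothetical tags */

struct block { uintptr_t tag; };

static value make_fixnum(intptr_t n)
{
    return ((uintptr_t)n << 1) | 1u;
}

/* Relies on arithmetic right shift for negative values, which is what
   common compilers implement. */
static intptr_t fixnum_value(value v)
{
    return (intptr_t)v >> 1;
}

/* With only fixnums and flonums, one bit test decides the type:
   "not a fixnum" already implies "flonum". */
static int is_fixnum(value v)
{
    return (int)(v & 1u);
}

/* With a third numeric type, every non-fixnum needs a second check
   that loads the tag word from memory. */
static int is_bignum(value v)
{
    return !is_fixnum(v) && ((const struct block *)v)->tag == TAG_BIGNUM;
}
```

The extra memory load and branch in `is_bignum` would sit on the path of
every generic arithmetic operation, which is one way to read the
performance concern above.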

>> * Port-refactoring: again - basically a good idea, but tricky to
>>   design, and may have a large performance impact, and the refactoring
>>   will be work-intensive (all the direct peeking and poking in port
>>   records needs to be localized and changed). This change should also
>>   ideally be considered to be done in tandem with changing the string
>>   representation.
> 
> Here too, a modest change would be enough.  Just using a proper
> struct/record type would make later refactorings easier.  The best
> part is that the performance impact of adding an offset to the write
> buffer is a positive one.  But if we won't be able to make this work,
> I won't be too sad, I promise ;)

Ok, that sounds reasonable.

>> * I think John's idea of putting all the little SRFIs in a few (or a
>>   single) module is better than splitting everything up into
>>   modules. Having modules for each and everything looks nice on paper
>>   but quickly gets old when you have to modify your module imports
>>   every time you use a common but nonstandard language construct.  I
>>   understand that some people like this kind of bureaucracy, but
>>   what's wrong with making things easier for the user?
> 
> Yeah, I said much the same at the start of the section about SRFIs.
> However, I think it *does* make it easier for the user to _also_ offer
> the SRFI libraries separately.  There's already a hacky workaround for
> require-extension's builtin-features in eval.scm so that you can say,
> for example, (require-extension (srfi 2)), so I think it makes sense to
> also provide "full" library declarations, to make it simpler to use and
> write portable R7RS programs.
> 
> Note that this does not mean this needs to be the only library to export
> said SRFI procedures!

Ah, I see. So you mean that we provide multiple modules, then?

> 
>> * Please use long, explicit library names, it's easier to remember
>>   ("there are many ways to abbreviate something, but only one way not
>>   to" - I forgot who said this, John will tell me, I'm sure.) And I
>>   would also suggest to avoid using "srfi-XXX" as a module name, and
>>   to use something meaningful (yes, I know that in the past I was
>>   largely responsible for that mistake in numerous situations.) That
>>   would also allow adding our own extensions.
> 
> For portability, I prefer at least also allowing the srfi numbers.
> But yes, long names are good.  However, there will be so few SRFIs
> that will still be left as part of core that it makes very little
> sense to rename the existing SRFIs, except when grouping several
> constructs together.

Ok.

> 
>> * I can't resist adding a pony of my own: I fear that integrating the
>>   R7RS syntax-rules cleanly and transparently inside an egg will be
>>   tricky. What about changing syntax-rules to have R7RS semantics in
>>   general? I'm not sure if I understand the differences well enough,
>>   perhaps someone (Peter?) can comment on this.
> 
> I think we already did the important bits (ellipsis identifiers and
> tail patterns - ie, SRFI-46).  There are two more changes, AFAIK:
> 
> - The "new" syntax-rules foolishly changed the underscore to act as
>    a wildcard symbol, making it - strictly speaking - incompatible with
>    R5RS.  I don't think it's a good idea to support this in core.
> - For no good reason, R7RS syntax-rules allows not only renaming
>    ellipsis identifiers, but also quoting them (which I think is
>    a bit ugly).  I *think* this is entirely backwards compatible,
>    so we could add that to core.
> 
> This is easily put in the R7RS egg, though.  Remember, any use of
> syntax-rules simply expands into one big ER macro transformer, and it
> is a completely self-contained file which may be taken and copied into
> the R7RS egg, and tweaked there to support these two cases.  But it could
> be simpler to do as a simple preprocessor which generates a "core"
> syntax-rules expansion.

IIRC, full R7RS-compatibility requires "(import-for-syntax (r7rs))" or
something like this. I was wondering about that, since it would be
quite a barrier for portable code to have to take care of this.  Or
can we simply make this implicit in the "define-library" macro?

Or is the incompatibility small enough to be ignored?

> I'd really like to hear other people's ideas about what would be the
> best way to integrate the changes with Henrietta.  Personally, I think
> the easiest way is to simply deploy a second copy of henrietta which
> reads from a different cache, populated by a second henrietta-cache cron
> job which reads from a different master list.

Perhaps also add an entry to the egg's "release-info" file stating
which major version(s) a release applies to?

> Agreed.  How will we attack the problem of bootstrapping?  We will make
> some breaking changes which might mean CHICKEN 4 may be unable to
> bootstrap the CHICKEN-5-in-progress at some point.  Now that we're on
> a separate branch we can't really release snapshots in the 4.9.x series.
> Maybe fall back to a simple date, or git hash versioning scheme for the
> time being?  We don't need to make them public "official" releases of
> course.  I just don't know how well our infrastructure will cope with
> a different naming strategy.  Should we do this by hand?

We can at least tag commits in the repository that are known to work,
and we should try to avoid making such breaking changes as much as
possible. If refactoring and cleanup are the major issues for CHICKEN 5,
then I see no problem here (yet). When it comes to things like
FFI barriers and internal representation, we can worry about this.
But we have to see how we get along; it's hard to tell in advance
which issues we will run into. Any "major" change should have a
verified bootstrap build, from a known state, completely from sources,
ideally from a 4.9 "chicken-boot".


felix


