guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] emacs: Rewrite scheme side in a functional manner.


From: Alex Kost
Subject: Re: [PATCH] emacs: Rewrite scheme side in a functional manner.
Date: Sun, 21 Sep 2014 14:51:18 +0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Ludovic Courtès (2014-09-20 18:11 +0400) wrote:

> Alex Kost <address@hidden> skribis:
>
>> From d42829fe03271e633e43cc35cf277705203e6080 Mon Sep 17 00:00:00 2001
>> From: Alex Kost <address@hidden>
>> Date: Thu, 18 Sep 2014 16:24:02 +0400
>> Subject: [PATCH 2/3] emacs: Rewrite scheme side in a functional manner.
>
> Good idea!

It was your idea actually :-)

> There are still a bunch of ‘set!’ and hash tables, though.  ;-)

There are only 2 ‘set!’s in:

--8<---------------cut here---------------start------------->8---
(define manifest->hash-table
  (let ((current-manifest #f)
        (current-table #f))
    (lambda (manifest)
      "Return hash table of name keys and lists of matching MANIFEST entries."
      (unless (manifest=? manifest current-manifest)
        (set! current-manifest manifest)
        (set! current-table (mentries->hash-table
                             (manifest-entries manifest))))
      current-table)))
--8<---------------cut here---------------end--------------->8---

and I think they are unavoidable.

As for the hash tables, I use them because I don't see how they can be
removed without a loss in efficiency.

> Overall it looks OK.  I’m concerned with code duplication between emacs/
> and guix/, though, and there are also a few stylistic issues, and
> missing or terse docstrings.

I don't understand what duplication you mean.

I didn't write much docstrings (but I'll add more), because:

- Most functions are trivial and self-descriptive by their names.
- All functions are used only inside “guix-main.scm” anyway.

> Some remarks:
>
>> +(define (mentries->hash-table mentries)
>
> For consistency I’d make it ‘manifest-entries->hash-table entries’.

OK, I'll rename "mentries" to "manifest-entries" in the procedures'
names, but I leave "mentry"/"mentries" for the local variables if you
don't mind.

The thing is, I use the term "entry" to name an alist of
package/output/generation parameters that is passed to the elisp side.
It looks like this:

((name . "guile")
 (version . "2.0.11")
 (outputs "out" "debug")
 ...)

So I shortened "manifest-entry" to "mentry" to distinguish these
entities.

>> +(define manifest->hash-table
>> +  (let ((current-manifest #f)
>> +        (current-table #f))
>> +    (lambda (manifest)
>> +      "Return hash table of name keys and lists of matching MANIFEST 
>> entries."
>> +      (unless (manifest=? manifest current-manifest)
>> +        (set! current-manifest manifest)
>> +        (set! current-table (mentries->hash-table
>> +                             (manifest-entries manifest))))
>> +      current-table)))
>
> What about:
>
>   (define manifest->hash-table
>     (memoize
>       (compose manifest-entries->hash-table
>                manifest-entries)))

I should have written that I tried to use ‘memoize’ there but it was
significantly slower.  I tried a test manifest of 1200 entries and on my
(very slow) computer it took in average:

- about 3 seconds for the variant I suggest now;
- about 15 seconds for the variant with ‘memoize’

to get a list buffer with all packages.

I think that happens because hash table procedures in ‘memoize’ use
‘equal?’ to compare arguments.  And ‘equal?’-ing big manifests for every
package is slow.  That's why I made ‘manifest?=’ with ‘eq?’ at first
place.

Here is what happens when information about all packages is "requested":
we need to check every package if it's installed or not, i.e. to lookup
the same manifest for each package.  I don't see a better way then using
an additional hash table.

> But honestly I think this is premature optimization (I mean, there are
> 177 packages in my profile, so 10,000 ;-)), and should that optimization
> be needed, it should be done transparently in (guix profiles), as we
> discussed some time ago.

I think this optimization is premature for putting it into (guix
profiles) but I believe it is necessary for the special needs of
"guix.el".

Sorry, I don't understand what «10,000» means.

We discussed using vhash, but I don't see how it can replace hash
table.  Here is the excerpt from the earlier message:

————————————————————————————————————————————————————————————————
Ludovic Courtès (2014-09-03 11:09 +0400) wrote:

> Alex Kost <address@hidden> skribis:
>
>> Ludovic Courtès (2014-09-01 16:10 +0400) wrote:
>
> [...]
>
>>> Would ‘vlist-fold’ work?
>>>
>>> scheme@(guile-user)> (vhash-cons 'a 1 (vhash-cons 'b 2 (vhash-cons 'a 3 
>>> vlist-null)))
>>> $2 = #<vhash 26fd3a0 3 pairs>
>>> scheme@(guile-user)> (vlist-fold cons '() $2)
>>> $3 = ((a . 3) (b . 2) (a . 1))
>>
>> Sorry, I don't see how it could work.  Here is an example:
>>
>> ;; What I currently have is a hash-table like this one:
>> (define table (make-hash-table 3))
>>
>> (hash-set! table 'a '(1 2 3))
>> (hash-set! table 'b '(4))
>> (hash-set! table 'c '(5 6))
>>
>> ;; And I can easily fold through unique keys like this:
>> (hash-fold (lambda (key entries res)
>>              (cons (cons key (apply + entries)) res))
>>            '()
>>            table) ; => ((c . 11) (b . 4) (a . 6))
>>
>> ;; What you suggest is a vhash like this:
>> (define vhash
>>   (vhash-cons
>>    'a 1
>>    (vhash-cons
>>     'a 2
>>     (vhash-cons
>>      'a 3
>>      (vhash-cons
>>       'b 4
>>       (vhash-cons
>>        'c 5
>>        (vhash-cons
>>         'c 6 vlist-null)))))))
>>
>> ;; But how can I fold through unique keys there?
>
> With a vhash, you can’t really do that efficiently.  ‘vlist-fold’ allows
> you to iterate over the list of key/value pairs, but in the order in
> which they appear (so you don’t get all the 'a, and then all the 'b.)
>
> If that’s not sufficient, then yes, hash tables may be best.
————————————————————————————————————————————————————————————————

>> +(define* (mentries-by-name manifest name #:optional version output)
>> +  "Return list of MANIFEST entries matching NAME, VERSION and OUTPUT."
>> +  (let ((mentries (or (hash-ref (manifest->hash-table manifest) name)
>> +                      '())))
>> +    (if (or version output)
>> +        (filter (lambda (mentry)
>> +                  (and (or (not version)
>> +                           (equal? version (manifest-entry-version mentry)))
>> +                       (or (not output)
>> +                           (equal? output  (manifest-entry-output 
>> mentry)))))
>> +                mentries)
>> +        mentries)))
>
> What about using ‘manifest-lookup’ instead?

It is slower (having in mind that this function is called for every
package).  With the suggested variant I can immediately narrow the
search to entries with the same name.

>> +(define (mentry-by-output mentries output)
>> +  (find (lambda (mentry)
>> +          (string= output (manifest-entry-output mentry)))
>> +        mentries))
>
> Likewise.

I can't use it here, because ‘manifest-lookup’ needs manifest, and here
manifest-entries are passed to this function.  And these are not all
entries of some manifest.  Currently this function is used to find a
suitable entry among entries with the same name.

>>  (define* (object-transformer param-alist #:optional (params '()))
>> -  "Return function for transforming an object into alist of 
>> parameters/values.
>> +  "Return function for transforming objects into alist of parameters/values.
>
> “Return a procedure transforming an object into an list of
> parameter/value pairs.”
>
>> -    (lambda (object)
>> +    (lambda objects
>>        (map (match-lambda
>>              ((param . fun)
>> -             (cons param (fun object))))
>> +             (cons param (apply fun objects))))
>>             alist))))
>
> s/fun/proc/ (yeah, this is Scheme. ;-))

OK, I didn't know that there is a strong convention to prefer
"procedure" over "function" in Scheme.

> May be worth considering moving it to (guix records), which already has
> ‘alist->record’.

Not sure, it seems too specific for me.

>> +(define (spec->package-pattern spec)
>> +  (call-with-values
>> +      (lambda () (full-name->name+version spec))
>> +    list))
>
> s/spec/specification/g

OK.

> However, the result is not a “package pattern”, right?  I find the name
> a bit confusing.

Actually it IS a package pattern (see the comment below).  I realize
that "pattern" is a bad name (especially taking into account that
"pattern" in processing package actions is a totally different thing),
but I couldn't invent a better term.

> Is there any reason to use lists instead of multiple values?

Yes.

>> +(define (manifest-package-patterns manifest)
>> +  "Return list of package patterns for all MANIFEST entries."
>> +  (fold-manifest-by-name manifest
>> +                         (lambda (name version mentries res)
>> +                           (cons (list name version mentries) res))
>> +                         '()))
>
> Likewise I’m unclear why this “package pattern” abstraction is needed
> here.
>
> Actually, is it just for serialization between Guile and Emacs?  If that
> is the case, then ‘package->sexp’ would seem more appropriate.
>
>> +(define (package-pattern-transformer manifest params)
>> +  "Return 'package-pattern->package-entries' function."
>
> Damn, what does this mean?  :-)

Here is what that all means:

To get information about packages/outputs, different “search-types” are
used (like ‘name’, ‘all-available’, ‘installed’).  These “search-types +
search-vals” are transformed into a list of package/output patterns.

Package patterns are:

- package record;
- list of name, version;
- list of name, version, manifest entries;
- list of name, version, manifest entries, packages.

Output patterns are:

- package record;
- list of package record, output;
- manifest entry record;
- list of manifest entry record, packages;
- list of name, version, output.

Then these patterns are transformed into entries (alists with
parameters/values) using “pattern->entries” functions created by
‘package-pattern-transformer’ and ‘output-pattern-transformer’.

I find that this is the most clear way to get required information.
When I tried to make it straightforward (i.e. without using "patterns"
as intermediate), it lead me to duplicating code here and there, and now
the most part of getting information is concentrated in 1 (I mean 2)
place: ‘package-pattern-transformer’ and ‘output-pattern-transformer’,
while the other code became really simple.

I tried my best, but I realize that there may be better ways to do
things, so I don't mind if someone rewrite the scheme side for receiving
information about packages/outputs/generations in a more clear and
efficient way.

>> +(define (get-package/output-entries profile params entry-type
>> +                                    search-type search-vals)
>> +  "Return list of package or output entries."
>
> Never ‘get’.

Do you mean I need to remove "get-" from the procedures' names?  What
about ‘get-entries’ (it is the entry point for the elisp side) – should
it be just ‘entries’ then?

> The docstring is too terse.

The concept of "entry" is too common for the code in “guix-main.scm” to
write about it in every docstring.

>> +;;; XXX move to (guix profiles) ?
>>  (define (profile-generations profile)
>
> Definitely worth moving there (in a separate patch.)

What about pushing the following attached commits now (before the above
changes)?:

Attachment: 0001-profiles-Add-profile-generations.patch
Description: Text Data

Attachment: 0002-guix-package-Use-profile-generations.patch
Description: Text Data

————————————————————————————————————————————————————————————————
P.S.  I've just reread this mail and found that it may look rather
offensive.  Sorry, I didn't mean to do that.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]