bug-guix

bug#47283: Performance regression in narinfo fetching


From: Christopher Baines
Subject: bug#47283: Performance regression in narinfo fetching
Date: Tue, 23 Mar 2021 20:47:12 +0000
User-agent: mu4e 1.4.15; emacs 27.1

Ludovic Courtès <ludo@gnu.org> writes:

> Christopher Baines <mail@cbaines.net> skribis:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Indeed, there’s one place on the hot path where we install exception
>>> handlers: in ‘http-multiple-get’ (from commit
>>> 205833b72c5517915a47a50dbe28e7024dc74e57).  I don’t think it’s needed,
>>> is it?  (But if it is, let’s find another approach, this one is
>>> prohibitively expensive.)
>>
>> I think the exception handling has moved around, but I guess the
>> exceptions that could be caught in http-multiple-get could happen,
>> right? I am really just guessing here, as Guile doesn't help tell you
>> about possible exceptions, and I haven't spent enough time to read all
>> the possible code involved to find out if these are definitely possible.
>
> Yeah.
>
> Commit 205833b72c5517915a47a50dbe28e7024dc74e57 added a ‘catch’ block
> that catches the same things as ‘with-cached-connection’ did (it would
> be better to not duplicate it IMO).  That includes EPIPE, gnutls-error,
> bad-response & co.

So, my intention here was to move the error handling, so that the
connection caching code could be separated from the code I wanted to
move out to the (guix substitutes) module. I don't think there's
currently any duplication in the error handling for the code path
involving http-multiple-get, at least for the exceptions in question
here.

> Earlier, commit be5a75ebb5988b87b2392e2113f6590f353dd6cd (“substitute:
> Reuse connections for '--query'.”) did not add such a ‘catch’ block in
> ‘http-multiple-get’.  Instead, it wrapped its call in ‘do-fetch’ in
> ‘fetch-narinfos’:
>
>    (define (do-fetch uri)
>      (case (and=> uri uri-scheme)
>        ((http https)
> -       (let ((requests (map (cut narinfo-request url <>) paths)))
> -         (match (open-connection-for-uri/maybe uri)
> -           (#f
> -            '())
> -           (port
> -            (update-progress!)
>         ;; Note: Do not check HTTPS server certificates to avoid depending
>         ;; on the X.509 PKI.  We can do it because we authenticate
>         ;; narinfos, which provides a much stronger guarantee.
> -            (let ((result (http-multiple-get uri
> +       (let* ((requests (map (cut narinfo-request url <>) paths))
> +              (result   (call-with-cached-connection uri
> +                          (lambda (port)
> +                            (if port
> +                                (begin
> +                                  (update-progress!)
> +                                  (http-multiple-get uri
>                                                       handle-narinfo-response '()
>                                                       requests
> +                                                     #:open-connection
> +                                                     open-connection-for-uri/cached
>                                                       #:verify-certificate? #f
> -                                             #:port port)))
>
> This bit is still there in current ‘master’, so I think it’s not
> necessary to catch these exceptions in ‘http-multiple-get’ itself, and I
> would just remove the ‘catch’ wrap altogether.
>
> WDYT?

I'm not sure what you're referring to as still being there on the master
branch?

Looking at the changes to this particular code path resulting from the
changes I've made recently, starting at lookup-narinfos, before:

 - lookup-narinfos calls fetch-narinfos, which calls do-fetch

 - call-with-cached-connection is used, which catches a number of
   exceptions relating to requests, and will retry PROC once upon a
   matching exception

 - open-connection-for-uri/maybe is also used, which is like
   open-connection-for-uri/cached, except it includes error handling for
   establishing connections to substitute servers

 - http-multiple-get doesn't include error handling

After:

 - lookup-narinfos calls fetch-narinfos, which calls do-fetch

 - call-with-connection-error-handling is used, which performs the same
   role as the error handling previously within
   open-connection-for-uri/maybe, catching exceptions relating to
   establishing connections to substitute servers

 - http-multiple-get now includes error handling similar to what was
   previously done by call-with-cached-connection, although it's more
   complicated since it's done with knowledge of what http-multiple-get
   is doing

I think that the error handling now in http-multiple-get isn't covered
elsewhere. Moving this error handling back into fetch-narinfos is
possible, but then we'd be back to handling connection caching in that
code, and avoiding that is what led to this refactoring in the first
place.

Also, implementation problems aside, I do think that the error handling
here is better than before. Previously, if you called lookup-narinfos
and a connection problem occurred, processing of all the requests would
start from scratch (as call-with-cached-connection calls PROC a second
time). If a second connection error then happened, it wouldn't be
caught, since call-with-cached-connection only handles one error.
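
As a rough illustration of that older behaviour, here is a minimal
sketch. It is not the code in Guix: call-with-one-retry, fetch-all and
the 'connection-error key are hypothetical stand-ins, and the "dropped
connection" is simulated with a counter.

```scheme
(use-modules (ice-9 match))

;; Hypothetical stand-in for call-with-cached-connection: run PROC and, if
;; it throws 'connection-error (an illustrative key), run it once more.
(define (call-with-one-retry proc)
  (catch 'connection-error
    proc
    (lambda _
      ;; The handler runs outside the 'catch', so an error raised by this
      ;; second call is not handled here.
      (proc))))

(define failures-left 1)   ;how many times the fake connection will drop
(define requests-sent 0)   ;total requests sent, repeats included

;; Hypothetical batch fetch: the connection "drops" before the fourth
;; request for as long as FAILURES-LEFT is positive.
(define (fetch-all requests)
  (let loop ((requests requests) (n 0) (result '()))
    (match requests
      (() (reverse result))
      ((head . tail)
       (when (and (= n 3) (> failures-left 0))
         (set! failures-left (- failures-left 1))
         (throw 'connection-error "connection reset"))
       (set! requests-sent (+ requests-sent 1))
       (loop tail (+ n 1) (cons head result))))))

(define result
  (call-with-one-retry (lambda () (fetch-all '(a b c d e)))))
```

With a single failure, the five requests cost eight sends, because the
three that had already succeeded are repeated. Starting with
failures-left set to 2 makes the second throw escape
call-with-one-retry altogether.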

I think it's possible that http-multiple-get will be making thousands of
requests, for example when running guix weather with no cached results.
The error handling in http-multiple-get is smarter than the previous
approach: it repeats as little work as possible. It's also not limited
to catching one exception.
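
The idea of resuming where the failure happened can be sketched roughly
as follows. Again this is only an illustration under assumptions, not
the actual http-multiple-get: fetch-resuming, flaky-send and the
'connection-error key are hypothetical names, and reconnecting is left
as a comment.

```scheme
;; Hypothetical sketch: SEND performs one request and may throw
;; 'connection-error (an illustrative key).  One 'catch' is installed per
;; attempt, not per request.
(define (fetch-resuming requests send)
  (let retry ((remaining requests) (acc '()))
    (catch 'connection-error
      (lambda ()
        (let loop ()
          (if (null? remaining)
              (reverse acc)
              (begin
                (set! acc (cons (send (car remaining)) acc))
                (set! remaining (cdr remaining))
                (loop)))))
      (lambda _
        ;; A real implementation would reconnect here, and would probably
        ;; bound the number of retries; this sketch retries indefinitely.
        (retry remaining acc)))))

;; Simulated sender that fails on its third and fifth calls.
(define calls 0)
(define (flaky-send request)
  (set! calls (+ calls 1))
  (when (memv calls '(3 5))
    (throw 'connection-error "connection reset"))
  (string-append "response-" (symbol->string request)))

(define responses (fetch-resuming '(a b c d) flaky-send))
```

Here the four requests survive two failures at a cost of six calls to
flaky-send: only the request that failed is sent again, and responses
already received are kept.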
