guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] Support publishing build events


From: Christopher Baines
Subject: Re: [PATCH 1/2] Support publishing build events
Date: Tue, 03 Dec 2019 01:12:47 +0100
User-agent: mu4e 1.2.0; emacs 26.3

Clément Lassieur <address@hidden> writes:

> Hi Christopher,
>
> This is a small review :)
>
> Christopher Baines <address@hidden> writes:
>
>> Add a table to store events, which have a type and a JSON blob. These can be
>> used to record changes, this commit inserts events when new builds are
>> created, and when the status of builds change.
>
> Why is it a JSON blob?  I mean, why do we need it to be JSON if we use
> JSON-STRING->SCM everytime we need to access it?  If it's just to
> normalize the scheme object, I think it's already done by %SQLITE-EXEC.
> It converts the scheme object to a string.  I can't see how a JSON
> object is better than a string.

I guess so… I was probably just thinking about JSON from JSON fields
in PostgreSQL, but we're not using PostgreSQL here.

The one advantage I can think of is that not every Scheme object can be
converted to JSON (at least with guile-json), so by storing the JSON
representation, you're guaranteed to have a valid value. If you stored a
scheme object, there's a chance that you could store something that was
invalid, or at least didn't convert to JSON.

>> The EventsOutbox table is then used to track when events have been sent
>> out. This is done through the new cuirass-send-events script.
>
> So the Events table will grow a lot.  And once its content has been sent
> out, we don't use it anymore if I understand well.  Would it be possible
> to just use the Events table, that would be emptied upon successful
> send?  That way, the whole mechanism would be simpler, and that table
> wouldn't grow.
>
> Unless there is a need to fetch events that have already been sent out?

There are API endpoints that expose events, like /api/build-events,
which requires storing the events.

> But in that case, why not fetch them from the Guix Data Service?

I'm still not sure what the ideal scope and architecture for the Guix
Data Service, or Cuirass is, but at the moment I've been trying to avoid
making one dependent on the other.

Also, I've been thinking about WebSub when trying to provide a way to
subscribe to events, which I'll say more about shortly…

> Now that I think about it, I don't understand the point the HTTP API
> ("build-events", "evaluation-events").  If events are sent from Cuirass
> to the Guix Data Service, why would anyone need to fetch those events
> from Cuirass?  Wouldn't it make more sense to fetch them from the Guix
> Data Service?

So, while this work is trying to copy data from Cuirass to the Guix Data
Service, I'm still thinking about them separately. I guess if you're
running your own instance of Cuirass, you might want to fetch events
from it, without having to setup an instance of the Guix Data Service.

> Another thing: Cuirass instances that aren't bound to the Guix Data
> Service will still have their Events table grow.  So it would be great
> if DB-ADD-EVENT can be enabled through a Cuirass parameter.  And by
> default it would be disabled.

I think that's a good idea, I'll have a look at implementing it.

>> * Makefile.am (bin_SCRIPTS): Add bin/cuirass-send-events.
>                                              ^
> Could you please add it to .gitignore?

Ah, yep, will do.

>> +;;;
>> +;;; Entry point.
>> +;;;
>> +
>> +(define* (main #:optional (args (command-line)))
>> +
>> +  ;; Always have stdout/stderr line-buffered.
>> +  (setvbuf (current-output-port) 'line)
>> +  (setvbuf (current-error-port) 'line)
>> +
>> +  (let ((opts (getopt-long args %options)))
>> +    (parameterize
>> +        ((%program-name     (car args))
>> +         (%package-database (option-ref opts 'database (%package-database)))
>> +         (%package-cachedir
>> +          (option-ref opts 'cache-directory (%package-cachedir))))
>> +      (cond
>> +       ((option-ref opts 'help #f)
>> +        (show-help)
>> +        (exit 0))
>> +       (else
>> +        (run-fibers
>                ^
> Why do we need to use a fiber, if there is only one?  A fiber is a
> lightweight thread.  They are useful if we need to do several things in
> the same time, but if there is only one thing to do, I don't understand
> the point.

If I remember correctly, the way Cuirass accesses the database is tied
up with fibers, so it's to make the database access work here.

>> diff --git a/src/cuirass/base.scm b/src/cuirass/base.scm
>> index 143bc2e..e7c2597 100644
>> --- a/src/cuirass/base.scm
>> +++ b/src/cuirass/base.scm
>> @@ -670,7 +670,14 @@ started)."
>>                       (#:timestamp . ,cur-time)
>>                       (#:starttime . 0)
>>                       (#:stoptime . 0))))
>> -        (db-add-build build))))
>> +        (if (db-add-build build)
>> +            (begin
>> +              (db-add-event 'build
>> +                            cur-time
>> +                            `((#:derivation . ,drv)
>> +                              (#:event       . scheduled)))
>> +              drv)
>> +            #f))))
>
> Could this be moved into the DB-ADD-BUILD procedure, in database.scm?
> This way everyting would be done at the same (lower) level.  The higher
> level base.scm module doesn't need to know about it.  And it'd adds
> consistency.

I'll have a look at doing this, along with storing events configurable.

>>    (define derivations
>>      (filter-map register jobs))
>> diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
>> index 523165d..8cb7465 100644
>> --- a/src/cuirass/database.scm
>> +++ b/src/cuirass/database.scm
>
> [...]
>
>> +          (unless (eq? (changes-count db) 0)
>                            ^
>              (when (positive? (changes-count db))

Cool, that does look better :)

>> diff --git a/src/cuirass/send-events.scm b/src/cuirass/send-events.scm
>> new file mode 100644
>> index 0000000..2b7dd9c
>> --- /dev/null
>> +++ b/src/cuirass/send-events.scm
>> @@ -0,0 +1,91 @@
>
> [...]
>
>> +(define* (send-events target-url
>> +                      #:key (batch-limit 100))
>> +  "Send up to BATCH-LIMIT events to TARGET-URL"
>> +  (with-exponential-backoff-upon-error
>> +   (lambda ()
>> +     (let ((events-to-send
>> +            (db-get-events-in-outbox batch-limit)))
>> +       (unless (null? events-to-send)
>> +         (let* ((body
>               ^
>              let

Ah, good spot.

>> +                 (object->json-string
>> +                  `((items
>> +                     . ,(list->vector
>> +                         (map (lambda (event)
>> +                                (let ((event-json
>> +                                       (json-string->scm
>
> [...]
>
>> +(define* (with-exponential-backoff-upon-error f #:key (retry-number 1))
>> +  "Run F and catch exceptions, retrying after a number of seconds that
>> +increases exponentially."
>
> Here, could you use common Guile conventions to write the docstring?
> There are nice examples in ports.scm[1].  F would be a THUNK for
> example.  And "call" would be used instead of "run".

Ok, I'll have a look at doing this.

> Also, a small wrapper macro (as it is done with (@@ (guix gnu machine
> ssh) with-roll-back)) would allow to avoid typing (lambda () ...) and
> remove an indentation level.  You could prefix the procedure with
> 'call-', and add something like:
>
> (define-syntax-rule (with-exponential-backoff-upon-error body ...)
>   (call-with-exponential-backoff-upon-error
>    (lambda ()
>      body ...)))

I'm not that comfortable with macros, but I'll have a look at doing
this.

Thanks,

Chris

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]