guix-devel
Re: Guix infrastructure


From: myglc2
Subject: Re: Guix infrastructure
Date: Sat, 08 Jul 2017 20:43:42 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

On 07/07/2017 at 14:19 Ludovic Courtès writes:

> Hello Guix!
>
> Leo Famulari <address@hidden> skribis:
>
>> On Thu, Jul 06, 2017 at 08:09:17PM -0400, myglc2 wrote:
>>> On 07/01/2017 at 14:01 Leo Famulari writes:
>>> > ... Bayfront is still not fully operational, so hydra.gnu.org is still
>>> > serving as the front-end of the build farm. We are still relying on the
>>> > Hydra software. That is, the situation is basically the same as before.
>>> > Adding build machines will not help very much until the front-end
>>> > hardware gets faster.
>>> 
>>> This leaves me wondering ...
>>> 
>>> Is the hydra/front-end hardware going to be upgraded?
>>
>> Yes...
>>
>>> Is bayfront/cuirass intended to replace hydra?
>>
>> ... and yes.
>>
>>> The bayfront hardware described here ...
>>> 
>>> https://www.gnu.org/software/guix/news/growing-our-build-farm.html
>>> 
>>> ... seems weak to me. Is there a plan to scale it up and make it redundant?
>>
>> It will be a lot more powerful than the current Hydra system. As for
>> specific plans, I'll let those administering the system chime in.
>
> That machine is super powerful…

Well, I disagree. A 2010 motherboard with 2 x 2011 CPUs (16 cores at
1.6GHz) is weak compared to modern servers.

> but alas, it has also been terribly
> unstable.  Vikings has been kind enough to assist us; they’ve notably
> provided us with CPU replacements once already.  Despite these efforts,
> the machine is still crashing.  We’re investigating with them what to
> do next.
>
> On top of that, all the testing and all the back and forth takes an
> awful lot of time, which is in part due to hardware problems being hard
> to pinpoint and debug in general, and in part due to us here in Bordeaux
> (where the machine is hosted) being unable to scale up.

As you have experienced here, the learning/deployment costs and hassle
associated with each new type of server often dwarf other costs. The
best way to minimize this is to minimize the number of types of servers
you own. In practice this means you need to place your bets carefully
and quickly cut your losses if things don't work out.  It also means that
when (not if) a server breaks you should buy one exactly like the broken
one.

At this point it makes sense to abandon the Vikings motherboard and
choose a popular, mainstream, current x86_64 motherboard. Since AMD has
not been a competitive server vendor for the last ~8 years this means,
practically speaking, picking a popular Intel-based motherboard.

> Infrastructure has been the project’s Achilles’ heel since we run the
> crowdfunding campaign in Dec. 2015 (!).  Now it’s becoming detrimental
> to the project.  Our initial plan was to buy more Libreboot-based
> machines like the one above once the first one has proved to work
> well.  However, given the situation, we’ve been discussing on
> guix-sysadmin a change in strategy, at least in the short term;
> Ricardo has been working on re-purposing used hardware for our needs,
> and that may well be our short- to medium-term solution.

Hmm, I didn't know about guix-sysadmin until now, and I couldn't easily
read it. So, FWIW, here are some additional comments/suggestions ...

It should be pretty easy to estimate the requirements to run the front
end and do a nightly x86_64 build of GuixSD, projected out 3 years. Do
we have a handle on what this is?
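
As a back-of-the-envelope illustration of that kind of estimate, here is
a minimal sketch. The function name, the growth model, and every input
are hypothetical; it assumes one core per build and perfect parallelism,
so it gives a lower bound at best, not a measurement of the real farm.

```python
import math

def cores_needed(packages, avg_build_minutes, window_hours=24,
                 annual_growth=0.0, years=3):
    """Rough lower bound on cores needed to rebuild everything in one window.

    Assumes (hypothetically) that each build uses one core, builds
    parallelize perfectly, and the package count grows by a fixed
    fraction per year.
    """
    # Project the package count forward by compound growth.
    projected_packages = packages * (1 + annual_growth) ** years
    # Total work, in core-minutes, for one full rebuild.
    core_minutes = projected_packages * avg_build_minutes
    # Spread that work over the available window.
    return math.ceil(core_minutes / (window_hours * 60))
```

Real numbers for the package count, average build time, and growth rate
would have to come from the existing build farm's logs.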

You should buy hardware that supports this and plan to discard it in 3
years.

Since things always break, visualize a system in which every server has
a redundant warm backup or is a pair of servers at 50% load.

You can choose between amazingly cheap used servers that guzzle power or
new servers that use less power. If you buy used computers ~ 3 years old
the total cost of ownership will be nearly a wash over ~ 3 years of
deployment. So, if you are cash rich, buy new computers. If you are
cash-poor, buy used computers, but don't buy anything more than 3 years
old, unless you want a computer museum ;-)
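
To make the used-versus-new comparison concrete, here is a minimal
sketch of the arithmetic. It counts only purchase price plus
electricity; the function name is hypothetical and the prices, wattages,
and electricity rate in the usage lines are invented placeholders, not
quotes for any real machine.

```python
def total_cost_of_ownership(purchase_price, watts, years=3,
                            price_per_kwh=0.20):
    """Purchase price plus electricity for a machine running 24/7.

    Ignores cooling, hosting, repairs, and resale value, so it is only
    a first approximation.
    """
    hours = years * 365 * 24
    energy_cost = watts / 1000 * hours * price_per_kwh
    return purchase_price + energy_cost

# Placeholder inputs: a cheap, power-hungry used server versus a
# pricier, more efficient new one.
used = total_cost_of_ownership(purchase_price=600, watts=450)
new = total_cost_of_ownership(purchase_price=2000, watts=150)
print(f"used: {used:.0f}  new: {new:.0f}")
```

Plugging in actual quotes and the local electricity rate shows how
close to a wash the two options really are for a given deployment.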

The benefit of a used server is that it comes assembled and tested and
probably has good driver support. Shiny new motherboards expose you to
the risk of unstable drivers and BIOS.  So, if you want new, you should
probably buy last year's model ;-)

In either case, it is tempting to assemble the computer. But this is not
a good idea because there is always some glitch.

The best strategy would be to buy assembled, tested servers with GuixSD
installed and running. If you can't find a vendor that will do that for
you, buy a test unit on the condition that the machine will be returned
if GuixSD doesn't install smoothly.

Specify RAIDed SSDs or, ideally, NVMe drives.

Acquire a test unit quickly. If it works great, buy another one (or
two)! If not, ditch it right away and try something else. Based on my
experience it is easy to install GuixSD on 3-year-old Intel-based
hardware.  And I expect it should be equally easy to install on
1-year-old Intel-based hardware.

Finally, WRT the bayfront hardware, when you said, "That machine is
super powerful…" I guess this is relative to the existing hydra front
end and on the assumption that it is only used as a front end.

But this raises the obvious question, isn't it possible to specify a
bayfront server that can also build "all of x86_64 Guix" in a day?  If
so, this would be simpler, wouldn't it? Then, if/when it becomes
overloaded you can supplement it with x86_64 build machines and "proxy"
front ends, can't you?

> For now, we prefer not to entrust the binaries we deliver to
> commercial VPS providers.  I think we owe it to our users, but it
> undoubtedly has a cost in terms of system maintenance.

Owning servers gives you more control and saves you time and money over
the long term. It will also improve the quality of the GuixSD "product"
offering for servers. So, IMO this is the right thing to do at this
time.

But it would dramatically legitimize GuixSD for some people if it were
also deployed on AWS. So this would probably be a good thing to
visualize for the future.

HTH - George


