Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting
From: Phil Pennock
Subject: Re: [Sks-devel] SKS intermittently stalls with 100% CPU & rate-limiting
Date: Mon, 25 Jun 2018 17:46:37 -0400
On 2018-06-25 at 13:08 +0200, Paul Fontela wrote:
> I have tried almost everything, from downloading a dump and starting the
> server sks again to reinstall system and everything else, the result is
> always the same, it works well for a while, sometimes an hour sometimes
> a little more and suddenly it freezes the key server, reaching 80%
> RAM, which makes it unstable and inoperable.
That sounds like recon gone wild, normally a sign that you're peering
with someone who is very much behind on keys. The recon system only
works if your peers are "mostly up-to-date".
This is why we introduced the template for introducing yourself to the
community, in the Peering wiki page, showing how many keys you have
loaded. It cut down on people joining with 0 keys, expecting recon to
do all the work, and new peers complaining that their SKS was hanging.
Per <https://sks-keyservers.net/status/> the lower bound of keys to be
included is: 5105570
You have: 5109664
Using <http://keyserver.ispfontela.es:11371/pks/lookup?op=stats> as a
starting point, and skipping your in-house 11380 peers, opening all the
others up in tabs and looking (I don't have this scripted) we see:
5109604                  keys.niif.hu
5065412                  keys.sbell.io
5107576                  sks.mbk-lab.ru
5109585                  pgp.neopost.com
5108773                  pgp.uni-mainz.de
5109639                  pgpkeys.urown.net
4825075                  pgp.key-server.io
<can't connect>          sks.funkymonkey.org
5084241                  keyserver.iseclib.ru
5109254                  keyserver.swabian.net
5109628                  sks-cmh.semperen.com
<sks down behind proxy>  keys-02.licoho.de
5109629                  keyserver.dobrev.eu
5109121                  sks.mirror.square-r00t.net
5109629                  keyserver.escomposlinux.org
5108778                  keyserver.lohn24-datenschutz.de
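The tab-by-tab check above could be scripted. What follows is only a
rough sketch, not something I run: it assumes each peer's stats page
contains the text "Total number of keys: N" and that peers answer on
port 11371 as in the URL above. The function names are made up for
illustration.

```python
import http.client
import re
import urllib.request

# Assumption: the stats page reports the count as "Total number of keys: N".
_TOTAL_RE = re.compile(r"Total number of keys:\s*(\d+)")


def parse_key_count(page_text):
    """Return the key count found in a stats page, or None if it is absent."""
    match = _TOTAL_RE.search(page_text)
    return int(match.group(1)) if match else None


def fetch_key_count(host, port=11371, timeout=10):
    """Fetch one peer's stats page; return its key count, or None on failure."""
    url = f"http://{host}:{port}/pks/lookup?op=stats"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_key_count(resp.read().decode("utf-8", "replace"))
    except (OSError, http.client.HTTPException):
        # Unreachable or broken peers are reported as "unknown", not as zero.
        return None


def lagging_peers(counts, threshold=5_100_000):
    """Return hosts with a known count below the threshold, lowest first."""
    behind = [h for h, n in counts.items() if n is not None and n < threshold]
    return sorted(behind, key=lambda h: counts[h])
```

Building the table is then `counts = {h: fetch_key_count(h) for h in
peers}` over your own peer list, and `lagging_peers(counts)` names the
ones to look at. Peers that could not be reached come back as None and
need a manual look, as with the two unreachable hosts above.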
If your in-house peers are way behind, fix that.
Comment out all peers with fewer than 5_100_000 keys. Restart sks and
sks-recon.
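If you would rather not edit the file by hand, the commenting-out step
might look something like the sketch below. It assumes the membership
file has one "hostname port" entry per line with "#" starting a
comment, and that you already have a host-to-key-count mapping; the
function name is hypothetical. Peers with no known count are left
untouched, and restarting sks and sks-recon is still up to you.

```python
def comment_out_lagging(membership_text, counts, threshold=5_100_000):
    """Prefix '#' to membership entries whose peer is below the threshold.

    membership_text: contents of the membership file (assumed layout:
                     'hostname port' per line, '#' starts a comment).
    counts:          mapping of hostname -> key count (None if unknown).
    """
    out = []
    for line in membership_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        # Blank lines and lines that are already comments carry no live peer.
        is_entry = bool(fields) and not stripped.startswith("#")
        count = counts.get(fields[0]) if is_entry else None
        if count is not None and count < threshold:
            out.append("#" + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Write the result to a copy of the file and diff it against the
original before swapping it in.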
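The roughly 284,000-key gap between you and pgp.key-server.io is pretty
severe. Since that peer isn't getting updates, it is probably hanging on
peering and causing even more problems for you.
Disable peering _at least_ with the three hosts below 5_100_000 keys:
keys.sbell.io, pgp.key-server.io and keyserver.iseclib.ru.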
Whenever SKS isn't performing right, the _first_ step after looking for
errors in logs should always be a Peering Hygiene Audit. Find the peers
who are sufficiently behind that their keeping the peering up is
anti-social and likely causing _you_ problems, comment out the peering
entries, restart (for a completely clean slate) and then reach out to
those peers to ask "Hey, what's up?".
Regards,
-Phil