Re: [Sks-devel] SKS Performance oddity
From: Michiel van Baak
Subject: Re: [Sks-devel] SKS Performance oddity
Date: Sat, 9 Mar 2019 11:29:14 +0100
User-agent: NeoMutt/20180716
On Sat, Mar 09, 2019 at 12:22:14AM -0500, Jeremy T. Bouse wrote:
> I don't know what is going on here with my cluster, but I have 3 of 4
> nodes that absolutely perform as I would expect... They have 2 vCPU
> with 4GB RAM each, along with an extra 50GB drive exclusively for SKS
> use under /var/lib/sks. The three behaving fine are my sks02, sks03
> and sks04 secondary nodes. My primary node, on the other hand, is
> another story. First I tried increasing it from 2 vCPU/4GB RAM like
> the others to 2 vCPU/8GB RAM, and then 4 vCPU/8GB RAM, without any
> change. I then built out a new physical server with a quad-core
> Xeon 2.4GHz processor, 4GB RAM and a dedicated 3TB RAID5 array, and
> I'm seeing the same problem. SKS is constantly pegging the CPU at 100%
> and eating up nearly all the memory, whether it's running on a virtual
> or physical server. The recon service is working and I'm ingesting keys
> from peers and peering with my internal cluster nodes, but every time it
> goes into recon mode the node starts failing to respond as the CPU and
> RAM spike, which then leads to the node being dropped from the pool
> because the stats page can't be hit before it times out.
>
> I've been fighting with this for several days now... Is anyone else
> out there seeing this behavior? If not, would those with similarly
> resourced servers care to share details, so I can see if I'm missing
> something here?
>
> The particulars: all nodes are Debian 9.8 (Stretch) 64-bit. Only the
> primary node runs NGINX, configured for load balancing the cluster.
> The only other daemons running across all nodes besides SKS are
> OpenSSH for remote access, SSSD for centralized authentication,
> Haveged for entropy, and Postfix configured for smarthost relaying.
Hey,
I have exactly the same problem.
Several times in the last month I have done the following steps:
- Stop all nodes
- Destroy the datasets (both db and ptree)
- Load in a new dump from max 2 days old
- Create the ptree database
- Start sks on the primary node, without peering configured (comment out
all peers)
- Give it some time to start
- Check the stats page and run a couple of searches
# Up until here everything works fine #
- Add the outside peers on the primary node and restart it
- After 5 minutes the machine takes 100% CPU, is stuck in I/O most of
the time and falls off the grid
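For reference, the rebuild steps above can be sketched as a small script. This is a minimal sketch, not our exact procedure: the service name, dump location, directory names (KDB/PTree) and ownership are assumptions for a typical systemd-managed install, and the build/pbuild flags are the ones commonly quoted in the SKS README, so tune them for your hardware.

```shell
#!/bin/sh
# Sketch of the rebuild procedure, assuming: a systemd unit named "sks",
# databases under /var/lib/sks, and a fresh keydump (<= 2 days old)
# unpacked into /var/lib/sks/dump. Adjust paths/flags to your setup.
set -e

systemctl stop sks                 # stop the node before touching the DBs

cd /var/lib/sks
rm -rf KDB PTree                   # destroy both the key DB and the ptree

# Load the dump and build the key database; flags as commonly quoted
# in the SKS README -- sizing may need adjusting for your machine.
sks build dump/*.pgp -n 10 -cache 100

# Build the prefix-tree (recon) database from the fresh key DB.
sks pbuild -cache 20 -ptree_cache 70

chown -R debian-sks: /var/lib/sks  # ownership varies per distro packaging

# Start with all peers commented out of the membership file first,
# verify the stats page and a few searches, then re-add peers.
systemctl start sks
```

Re-adding the external peers to the membership file and restarting is exactly the point at which both of us see the CPU and I/O spike.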
It doesn't matter whether I enable peering with the internal nodes or not.
Just having one SKS instance running and peering it with the network is
enough to render the instance basically unusable.
Like you, I tried in a VM first, and also on a physical machine (dual
6-core Xeon E5-2620 0 @ 2.00GHz with 96GB RAM and 2 Samsung EVO 840 Pro
SSDs for storage).
I see exactly the same every time I follow the steps outlined above.
The systems I tried run Debian Linux and FreeBSD, with the same result on both.
--
Michiel van Baak
address@hidden
GPG key: http://pgp.mit.edu/pks/lookup?op=get&search=0x6FFC75A2679ED069
NB: I have a new GPG key. Old one revoked and revoked key updated on keyservers.