
Re: [Sks-devel] sks-peer.spodhuis.org catching back up


From: Jeffrey Johnson
Subject: Re: [Sks-devel] sks-peer.spodhuis.org catching back up
Date: Tue, 29 May 2012 17:42:03 -0400

On May 29, 2012, at 5:29 PM, Phil Pennock wrote:

> On 2012-05-29 at 14:20 -0400, Jeffrey Johnson wrote:
>> Sure there's a way to use Berkeley DB without transactional logs.
> [snip]
> 
> Okay, pedantically you're correct, but since I was trying to *just*
> modify sksclient and not the server, most of what's possible with other
> modes of operation is irrelevant, and obviously so.  The idea was to
> query existing DB files, not spend 10 hours rebuilding.
> 
>> It's also possible to open the databases RDONLY without a dbenv, to access
>> information independently of the actual running daemon, without locks, and
>> without guarantees that the client won't occasionally fail (and it's _very_
>> occasional, given the sheer number of records and accesses) if a client
>> attempts a read simultaneously with a server attempting a write. (A trickle
>> writer to move cached data in the __db* files into the backing store might be
>> needed here: it's not hard code to write, even if it's difficult to QA.)
> 
> Ah.  Everything I tried was using a dbenv, as the context for opening
> the db.  I saw nothing in the docs suggesting that opening a db designed
> for use with a dbenv was possible without a dbenv.  *sigh*
> 

Yes, it's kinda non-obvious (and I'm not sure it's worthy of documentation).

FWIW, RPM run as non-root doesn't bother with a dbenv (and occasionally
segfaults if/when there is contentious access). The key issue is that an RDONLY
open still requires write access to the dbenv in order to register shared
interprocess locks.
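For the curious, the standalone read-only open described above might look
roughly like this against the Berkeley DB C API. This is a sketch, not SKS or
RPM code: the path "KDB/key" is only an illustrative guess at a database file,
and you'd link with -ldb.

```c
/* Sketch: open a Berkeley DB file read-only with NO environment.
 * Illustrative only; "KDB/key" is a hypothetical database path. */
#include <stdio.h>
#include <stdlib.h>
#include <db.h>

int main(void)
{
    DB *dbp;
    int ret;

    /* Passing NULL for the DB_ENV means no locking, no logging, no
     * transactions -- the standalone mode discussed above. */
    if ((ret = db_create(&dbp, NULL, 0)) != 0) {
        fprintf(stderr, "db_create: %s\n", db_strerror(ret));
        return EXIT_FAILURE;
    }

    /* DB_RDONLY: never write; DB_UNKNOWN: let BDB detect btree/hash.
     * Note this bypasses the daemon's locks, so a concurrent server
     * write can occasionally hand back an inconsistent page. */
    ret = dbp->open(dbp, NULL, "KDB/key", NULL, DB_UNKNOWN, DB_RDONLY, 0);
    if (ret != 0) {
        fprintf(stderr, "open: %s\n", db_strerror(ret));
        dbp->close(dbp, 0);
        return EXIT_FAILURE;
    }

    /* ... read with a cursor, etc. ... */
    dbp->close(dbp, 0);
    return EXIT_SUCCESS;
}
```

Because no dbenv is created, nothing here needs write access to the
environment files -- which is exactly why it sidesteps the locking problem,
and also why it carries the "no guarantees" caveat above.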

>>> Apparently the locking is not what it could be, as I ended up with a DB
>>> that needed recovery.  Moral 1: do not run sksclient against a serving
>>> DB.
>>> 
>> 
>> There's no engineering diagnosis in your statement that indicates anything
>> other than opinion with
>>      … the locking is not what it could be, …
> 
> For clarity: I meant the locking in how SKS uses BDB is not what it
> could be.  I see that I was ambiguous here and that it's possible to
> interpret it as a general comment on BDB locking; sorry.
> 

np, and certainly nothing personal: it's just that "corruption" is a term I
hear at least weekly ;-)

> Yes, it was opinion, covering the state of affairs.  Since SKS is
> normally going to run as the *only* user of the DB files, and we know
> that even using the "sks dump" command to dump current keys requires
> stopping sks, I stand by my opinion that the locking is currently not
> what it could be.
> 

I'd agree that there is something fishy about how SKS uses BDB in
the PTree store. Nothing that can't be lived with … but I have seen
too many deadlocks in the PTree database to believe the usage is "correct".
Note that a "real" fix means a hugely painful amount of QA on a low-incidence
error pathway: the existing failure rate of roughly one every 1-2 months
is more than acceptable imho.

>> The recovery was likely needed for other reasons. Note that modern Berkeley DB
>> returns DB_RUNRECOVERY during open, and it's not too hard to automate the
>> recovery as part of daemon startup, rather than forcing manual intervention.
> 
> How does this help when the problem is a daemon started first, and then
> another tool running which accesses the backing files so that the daemon
> starts PANICing to the logs?
> 

The automation helps with "maintenance". One merely needs to restart
the daemon and DB_RECOVER will be run: no additional manual involvement.

Note that what I'm suggesting is opening the dbenv w/o DB_RECOVER usually,
and then retrying the dbenv open with DB_RECOVER when needed. I believe
(but am too lazy to verify) that DB_RECOVER might send all other
processes using the dbenv a DB_RUNRECOVERY error
to gain exclusive access while recovering, which might lead to
reports that Berkeley DB is multi-process unfriendly.
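The open-then-retry pattern could be sketched like so. Everything here is an
assumption for illustration: the environment home "KDB" is hypothetical, the
flag set is a plausible transactional configuration rather than SKS's actual
one, and a real daemon would fold this into its startup path.

```c
/* Sketch: try a normal dbenv open first; if BDB reports the panic
 * state (DB_RUNRECOVERY), reopen once with DB_RECOVER so recovery
 * runs automatically at daemon startup. Link with -ldb; the "KDB"
 * environment path is illustrative, not SKS's confirmed layout. */
#include <stdio.h>
#include <db.h>

static int open_env(DB_ENV **envp, const char *home, u_int32_t extra)
{
    /* DB_RECOVER requires DB_CREATE and DB_INIT_TXN, both present. */
    u_int32_t flags = DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK |
                      DB_INIT_LOG | DB_INIT_TXN | extra;
    int ret;

    if ((ret = db_env_create(envp, 0)) != 0)
        return ret;
    ret = (*envp)->open(*envp, home, flags, 0);
    if (ret != 0) {
        (*envp)->close(*envp, 0);
        *envp = NULL;
    }
    return ret;
}

int main(void)
{
    DB_ENV *env = NULL;
    int ret = open_env(&env, "KDB", 0);

    if (ret == DB_RUNRECOVERY) {
        /* Environment is panicked: retry once with recovery.
         * This is the "no manual intervention" restart path. */
        fprintf(stderr, "env panicked, running recovery\n");
        ret = open_env(&env, "KDB", DB_RECOVER);
    }
    if (ret != 0) {
        fprintf(stderr, "dbenv open failed: %s\n", db_strerror(ret));
        return 1;
    }
    /* ... open databases within the env, run the daemon ... */
    env->close(env, 0);
    return 0;
}
```

The design choice is to pay the cost of recovery only when the environment is
actually panicked, rather than passing DB_RECOVER unconditionally on every
start.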

I can/will confirm the behavior if needed: the only hard part is reproducing
the dynamic state.

73 de Jeff


