sks-devel
[Sks-devel] more database corruption


From: Dan Egli
Subject: [Sks-devel] more database corruption
Date: Sun, 2 Nov 2003 12:52:47 -0700 (MST)

This is getting annoying. I looked over the server today and saw a lot of
messages (literally thousands) in the failed_messages dir. That made no
sense, so I moved some of them back into the messages folder. They came
right back. That was strange, so I looked at the log. The database is
corrupted AGAIN. It seems to have happened at 3:00 am this morning. Observe:

2003-11-02 02:48:14 1 keys found
2003-11-02 02:48:41 Adding list of 1 keys from file ./messages/msg-38760868.ready
2003-11-02 02:48:41 Applying 0 changes
2003-11-02 02:49:11 Adding list of 1 keys from file ./messages/msg-10140124.ready
2003-11-02 02:49:11 Applying 0 changes
2003-11-02 02:59:52 Adding list of 1 keys from file ./messages/msg-64306034.ready
2003-11-02 02:59:52 Applying 2 changes
2003-11-02 02:59:52 Adding hash 7B669B52ADB3D241956246551256B1F0
2003-11-02 02:59:52 Del'ng hash C9D6D9AC14E0AA17B32726433E2EEA32
2003-11-02 02:59:56 Sending LogResp size 2
2003-11-02 03:00:00 Calculating DB stats
2003-11-02 03:00:05 eventloop: Bdb.DBError("fatal region error detected; run recovery")
2003-11-02 03:00:05 <command handler> error in callback.: Bdb.DBError("fatal region error detected; run recovery")
2003-11-02 03:00:09 <mail transmit keys> error in callback.: Bdb.DBError("fatal region error detected; run recovery")
2003-11-02 03:00:10 <command handler> error in callback.: Bdb.DBError("fatal region error detected; run recovery")
2003-11-02 03:00:13 Error fetching key from hash 7B669B52ADB3D241956246551256B1F0: Bdb.DBError("fatal region error detected; run recovery")
2003-11-02 03:00:13 0 keys found
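
To pin down the moment things went wrong without scrolling through the whole
log, something like the following might help. It is only a sketch: the
function name first_fatal is made up for illustration, and it assumes the log
file is passed as an argument and holds one entry per line.

```shell
# Hypothetical helper: print the last log line before the first
# "run recovery" error, followed by that error line itself.
first_fatal() {
  awk '/run recovery/ { if (prev != "") print prev; print; exit }
       { prev = $0 }' "$1"
}
```

Called as first_fatal followed by the path to the SKS log, it prints at most
two lines: what the server was doing just before the failure, and the first
failure.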

I tried to think of what could be happening at 3:00 am that could corrupt
the database. The only thing I can come up with is my automatic backup and
keydump script. But I cannot see how it would affect the main database,
because the only operations it runs against the main database are
db_archive and db_archive -s, followed by some cp commands.

If it's of any help, here's my backup script. 

#!/bin/bash

function errorabort {

  echo "NON-ZERO exit status! Aborting Keydump! Deleting failed backup! Alerting SysAdmin"
  # create the flag file that the message below refers to, so that later
  # runs stop at the BAD_DB check
  touch "${HOME}/BAD_DB"
  echo "The SKS Keyserver automated database backup and keyring dump sequence encountered a fatal" > msg
  echo "error on `date`. This should be investigated immediately. Until then, no further automated" >> msg
  echo "backups or keydumps will take place. A file called BAD_DB was created in the sks home" >> msg
  echo "directory. The automated script will not run while this file exists. When the database" >> msg
  echo "problem has been corrected, remove this file to re-enable the automatic backup and dumps." >> msg
  mail -s "SKS Backup routine failure!!" dan < msg
  rm -f msg
  rm -f "${TEST}"/PTree/*
  rm -f "${TEST}"/KDB/*
  # exit non-zero so that the caller (e.g. cron) sees the failure
  exit 1

}


PATH=$PATH:/usr/local/bin
# before we do anything, check to see if BAD_DB exists. If so consider the
# database unusable. Abort.
if [ -f ${HOME}/BAD_DB ] ; then
  exit 2;
fi


# step 1 - define environment variables
TEST=${HOME}/test_backup
DB=${HOME}/backup_db
NEW=${HOME}/newdump
OLD=${HOME}/olddump
WORK=${HOME}/workdump

# step 2 - backup existing databases

cd $HOME/KDB
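# step 2.1 - remove old KDB logs
# (capture the list first so that $? reflects db_archive, not rm -f)
OLDLOGS=`db_archive`
[ $? -ne 0 ] && errorabort
[ -n "$OLDLOGS" ] && rm -f $OLDLOGS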
# step 2.2 - copy database files across
cp `db_archive -s` ${TEST}/KDB
[ $? -ne 0 ] && errorabort
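# step 2.3 - copy KDB logs across
cp log.* ${TEST}/KDB
[ $? -ne 0 ] && errorabort
# step 2.4 - remove old PTree logs
# (capture the list first so that $? reflects db_archive, not rm -f)
cd ../PTree
OLDLOGS=`db_archive`
[ $? -ne 0 ] && errorabort
[ -n "$OLDLOGS" ] && rm -f $OLDLOGS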
# step 2.5 - copy PTree databases across
cp `db_archive -s` ${TEST}/PTree
[ $? -ne 0 ] && errorabort
# step 2.6 - copy PTree logs across
cp log.* ${TEST}/PTree
[ $? -ne 0 ] && errorabort

# step 3 - validate DB files
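cd ${TEST}/KDB
# DBFILE, not DB, is the loop variable: DB holds the backup directory and
# is still needed at the end of the script
for DBFILE in `db_archive -s` ; do
  db_verify $DBFILE
  if [ $? -ne 0 ]; then
    errorabort
  fi;
done

cd ${TEST}/PTree
for DBFILE in `db_archive -s` ; do
  db_verify $DBFILE
  if [ $? -ne 0 ]; then
    errorabort
  fi;
done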



# step 4 - make keydump

rm -f ${WORK}/*
sks dump 50000 $WORK
[ $? -ne 0 ] && errorabort

cd ${WORK}
for FILE in *.pgp; do
  mv $FILE dungeon${FILE##sks-dump};
done
cd ..

rm -f ${OLD}/*
mv ${NEW}/* ${OLD}
mv ${WORK}/* ${NEW}

rm -f ${DB}/PTree/*
rm -f ${DB}/KDB/*
mv ${TEST}/KDB/* ${DB}/KDB
mv ${TEST}/PTree/* ${DB}/PTree
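
The script tests $? by hand after some steps and not at all after others. A
small wrapper along these lines could make every step report which command
failed before handing over to errorabort. This is a sketch only: the name
check is hypothetical and is not part of the script above.

```shell
# Hypothetical helper: run a command and, if it fails, say which one
# failed and with what exit status. Returns the command's own status.
check() {
  "$@"
  local status=$?
  if [ "$status" -ne 0 ]; then
    echo "step failed (exit $status): $*" >&2
  fi
  return "$status"
}

# Intended use inside the backup script (errorabort is the script's own
# function, TEST its own variable):
#   check cp log.* "${TEST}/KDB" || errorabort
```

Because check passes the exit status through unchanged, it can sit in front
of any step without altering when errorabort is triggered.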









