From: Jan-Henrik Haukeland
Subject: Re: [monit] Postgresql weirdness: ESTrootFATAL
Date: Thu, 13 Nov 2008 23:21:31 +0100

On Nov 13, 2008, at 11:11 PM, David Paper wrote:
Ah, I wish it had been that simple.

In _ALL_ instances, postgres would start up completely, running as UID postgres, as it should be. The 4 examples (included below) of manually starting it all ultimately called pg_ctl as UID postgres.

The tipoff ended up being the errors showing up in groups of 4 exactly every 22 seconds, and NOT showing up during postgres startup messages. Only after the DB was completely online did the error messages show up.

In the monit job, there are 4 tests, 2 against the socket in /tmp and 2 against localhost port 5432. These 4 tests are what is generating the errors. It doesn't matter what UID monit starts the process as: monit itself always runs as root (at least in our installations it does), and thus monit does all of its testing of processes as UID 0. This is what is generating the errors, as there is no root user in our postgres installs, nor a root database.

In doing some poking around the innerwebs, one of my coworkers found this:

http://www.asahi-net.or.jp/~aa4t-nngk/monit2_en.html#pgsqltest [0]

which basically says, "create a root user in postgres" and "create a database root, owned by root" if you want the pgsql tests to work correctly, and not spit out errors in the logs.

Hope this problem & explanation helps others out as they spend time w/ Monit and Postgres.

-dave

[0] The text of the webpage is below:

PostgreSQL connection test

I may be responsible for explaining this since this is the test I wrote. If you use MONIT 4.7 or earlier, you need to apply pgsql-patch to make use of this protocol test.

A connection test in MONIT is done by opening a connection to the socket on which the service is listening and sending some packets; MONIT then decides whether the service is alive based on the response the service returns (or the lack of one). `Socket' can be a TCP/UDP port or a UNIX socket.

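
To make the idea of a connection test concrete, here is a minimal Python sketch of a generic probe against either a TCP port or a UNIX socket. It is not MONIT's code: the function name `probe` and its parameters are made up for illustration, and it only checks that a connection is accepted. A protocol-aware test such as pgsql also exchanges packets with the service, which this sketch does not attempt.

```python
import socket


def probe(address, timeout=5.0):
    """Return True if a stream connection to `address` can be opened.

    `address` is a (host, port) tuple for TCP, or a filesystem path
    string for a UNIX socket. (Hypothetical helper, for illustration.)
    """
    try:
        if isinstance(address, str):
            # UNIX-domain socket, e.g. "/tmp/.s.PGSQL.5432"
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                sock.connect(address)
        else:
            # TCP, e.g. ("localhost", 5432)
            with socket.create_connection(address, timeout=timeout):
                pass
        return True
    except OSError:
        # Refused, timed out, missing socket file, and so on.
        return False


if __name__ == "__main__":
    for target in (("localhost", 5432), "/tmp/.s.PGSQL.5432"):
        print(target, "up" if probe(target) else "down")
```

The default of 5 seconds mirrors the default timeout mentioned in the quoted page. Any failure to connect is treated as "down".
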
Before dealing with the PGSQL test, we might need to take a look at MONIT's connection test in general.

DNS service connection test example:

    if failed host localhost port 53 type udp protocol dns with timeout 10 then restart

The host argument can be omitted. In that case, host is assumed to be localhost. type defaults to TCP, so you can omit it for TCP tests. protocol can be one of (as of MONIT 4.8) APACHE-STATUS, DNS, DWP, FTP, HTTP, IMAP, LDAP2, LDAP3, MYSQL, PGSQL, NNTP, NTP3, POP, POSTFIX-POLICY, RDATE, RSYNC, SMTP, SSH, TNS, and if you omit it a generic connection test will be used. timeout is how long MONIT will wait before it gives up; its default value is 5 (seconds).

`pgsql' is not very special in synopsis. Now, let us examine the case when we want to test PostgreSQL's activity through its UNIX socket.

PostgreSQL connection test example (via UNIX socket):

    if failed unixsocket /tmp/.s.PGSQL.5432 proto pgsql with timeout 15 then restart

Prerequisites for PGSQL test

As PostgreSQL requires authentication even merely to connect to it, certain preparations need to be done before practical use of this test. This procedure is not mandatory, because the PGSQL test treats it as a success when PostgreSQL demands authentication or reports that there is no such user, since both responses mean the postmaster is functioning. However, you had better follow the procedure below to keep Postgres' log as clean as possible, which was the original aim for which I wrote this code. We are going to create DB user `root' for convenience, because MONIT is usually run by root. The example below assumes PostgreSQL is 8.x. If yours is older, some syntax, such as the subnet format, may vary:

1. Create DB user `root'.
2. Create a database 'root' owned by root. It doesn't need to contain any data.
3. Add these descriptions to pg_hba.conf:

    host   root  root  127.0.0.1/32  trust     <= for test via TCP port
    local  root  root  ident sameuser          <= for test via UNIX socket

On Nov 13, 2008, at 4:41 PM, Dan Colish wrote:

The funky error output is concatenation of error messages and nothing else. The real error is your start command. It looks like you're actually trying to start a database named root. Check the start scripts for pgsql. Does postgres start outside of monit?

--dan

On Thu, Nov 13, 2008 at 4:18 PM, David Paper <address@hidden> wrote:

Hi monit gurus,

While I anxiously await 5.0 getting out of beta, I have run into the following problem w/ postgres: When started via monit, postgres spits out the following errors every 22 seconds in the postgres startup log:

    2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
    2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist

4 entries in each group, forever.

Environment: Monit v 4.10.1 (started out of inittab), Postgres 8.3.4, SuSE Linux 11.0 x86-64.

This is what the monit job looks like:

    check process postgresql with pidfile /opt/postgres/data/postmaster.pid
      group database
      start program = "/opt/postgres/bin/pg_ctl start"
        as uid postgres and gid postgres
      stop program = "/opt/postgres/bin/pg_ctl stop"
        as uid postgres and gid postgres
      if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql then restart
      if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql then alert
      if failed host localhost port 5432 protocol pgsql then restart
      if failed host localhost port 5432 protocol pgsql then alert
      if 5 restarts within 5 cycles then timeout

I've also tried doing it like this:

    check process postgresql with pidfile /opt/postgres/data/postmaster.pid
      group database
      start program = "/etc/init.d/postgresql start"
      stop program = "/etc/init.d/postgresql stop"
      if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql then restart
      if failed unixsocket /tmp/.s.PGSQL.5432 protocol pgsql then alert
      if failed host localhost port 5432 protocol pgsql then restart
      if failed host localhost port 5432 protocol pgsql then alert
      if 5 restarts within 5 cycles then timeout

Same result.

If I fire up postgres manually using the following methods, I don't see the error:

    su - postgres; /opt/postgres/bin/pg_ctl start
    su - postgres; /opt/postgres/bin/pg_ctl start -w -D /opt/postgres/data -l /opt/postgres/data/startup.log
    su - postgres -c "LD_LIBRARY_PATH=/opt/postgres/lib /opt/postgres/bin/pg_ctl -w start -D \"/opt/postgres/data\" -l \"/opt/postgres/data/startup.log\""
    (as root) /etc/init.d/postgresql start

It seems that this is an issue only when postgres is fired up via a monit job.

According to our DB guy, the error means that postgres is trying to find a database called "root", which of course doesn't exist.

I know that Monit doesn't set any environment variables when it starts a job's process, but I'm baffled as to where this is coming from.

Has anyone else that's running postgres seen this?

Thanks!

-dave

--
Dave Paper
"Hello, I must be going."
--Groucho

--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general
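
The tipoff in this thread was that the errors arrived in groups of 4 exactly every 22 seconds. That pattern can be checked mechanically. The Python sketch below groups the eight log lines from the original report by timestamp and measures the gap between groups. The helper name `group_by_timestamp` is made up for illustration, and the sketch assumes each line begins with a 19-character `YYYY-MM-DD HH:MM:SS` timestamp, as the quoted lines do.

```python
from collections import Counter
from datetime import datetime

# The eight log lines quoted in the original report.
LOG = """\
2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:04 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
2008-11-13 16:07:26 ESTrootFATAL: database "root" does not exist
"""


def group_by_timestamp(lines):
    """Count log lines per timestamp; return sorted (timestamp, count) pairs."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the first 19 characters are the timestamp.
        stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        counts[stamp] += 1
    return sorted(counts.items())


groups = group_by_timestamp(LOG.splitlines())
sizes = [count for _, count in groups]
# Seconds between consecutive groups of errors.
intervals = [(b[0] - a[0]).total_seconds() for a, b in zip(groups, groups[1:])]

print(sizes)      # [4, 4]
print(intervals)  # [22.0]
```

A group size equal to the number of pgsql tests in the monit job (4), repeating at a fixed interval, points at the monitoring checks rather than at the postgres start command.
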