monit-dev

Re: additional feature for monit-3.0 (for clusters)


From: Oliver Jehle
Subject: Re: additional feature for monit-3.0 (for clusters)
Date: Mon, 28 Oct 2002 13:24:30 +0100

No problem with inserting a new config statement...

but it should be an xor with autostart... you cannot have autostart=true and
only monitor manually started services:

so I think giving autostart a better name would be easier!
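
As a rough sketch, the two behaviours might look like this in a monitrc file. Note this is only an illustration: the check/start/stop syntax here is approximate, and the "started" keyword is Oliver's proposed value, not an existing one:

```
# today: monit starts the process itself and always monitors it
check apache with pidfile /var/run/apache.pid
  autostart yes
  start "/etc/init.d/apache start"
  stop  "/etc/init.d/apache stop"

# proposed: monitor only after an explicit "monit start apache",
# and stop monitoring again after "monit stop apache"
check apache with pidfile /var/run/apache.pid
  autostart started
  start "/etc/init.d/apache start"
  stop  "/etc/init.d/apache stop"
```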




Martin Pala wrote:

> One more thing - I think it shouldn't be part of the 'autostart'
> statement (that is intended for a different task). Maybe a new statement
> such as 'automonitor [yes|no]' (it sounds strange - maybe someone will
> have a better suggestion :) would be clearer, if such functionality is
> added to monit.
>
> Martin
>
> Martin Pala wrote:
>
> >Yeah, not a bad idea :)
> >
> >There are two ways to achieve a similar feature:
> >
> >1.) check a process only when it was started under monit's control, as
> >Oliver described - a very simple and effective method; every cluster node
> >needs only one 'local' monit instance.
> >
> >2.) have the monit instance fail over with the service as part of the
> >resource group - in that case monit must be installed on shared disks, and
> >when a cluster reconfiguration is initiated it starts a monit process
> >along with the resource => there would be one monit instance per resource
> >(or more accurately per shared disk group on which the SCSI reservation is
> >applied). This method doesn't require big modifications to monit - the
> >only one needed is an option to specify the location of monit's pid file
> >somewhere in the filesystem (it should be on the shared disk group).
> >Resource failover is transparent to monit - it needn't care about the
> >shared environment; it will just start/stop itself and monitor/start
> >services => cluster health and shared storage must be monitored/maintained
> >by another service (for example the mentioned heartbeat).
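
A minimal sketch of how variant 2 could look from the cluster framework's side. Everything here is an assumption: the paths, the service name, and the `-p` pid-file option, which monit would only gain through the proposed patch:

```shell
#!/bin/sh
# Hypothetical per-resource-group wrapper for variant 2: the cluster
# framework runs this when the group fails over to this node. The -p
# pid-file option does not exist yet - it is the patch discussed above.

SHARED=/shared/group1            # shared disk group mastered by this node
MONITRC=$SHARED/etc/monitrc      # monit config lives on the shared disk
PIDFILE=$SHARED/run/monit.pid    # proposed: pid file on shared storage

case "$1" in
  start) monit -c "$MONITRC" -p "$PIDFILE" ;;   # one monit per disk group
  stop)  kill "$(cat "$PIDFILE")" ;;            # stop the group's monit
esac
```

Because the pid file travels with the shared disk group, whichever node currently masters the storage also owns the monit instance for that group.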
> >
> >The first (Oliver's) method is similar to object registration (as in SUN
> >cluster's pmfadm, for example) - with this extension it would allow
> >building simple clusters. There's yet another question - storage
> >maintenance; I can think of two ways:
> >
> >a.) the described rc scripts (monit-node1, monit-node2, etc.) are
> >responsible for storage maintenance (storage reserve/release and
> >optionally forcing it). They shouldn't run 'monit -g service start' before
> >the node masters the storage => otherwise it may lead to a hard error
> >before even touching the monit subsystem (similar to variant 2 above).
> >
> >b.) the start/stop scripts invoked by monit are more sophisticated and
> >check/maintain the storage status (possibly doing the SCSI reservation in
> >case the node doesn't master it) before trying to start the service. Since
> >monit currently doesn't check the return value of these scripts, a failure
> >will lead to a service timeout at monit's level.
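
A minimal sketch of such a start script for variant b. Here 'reserve_status' and 'reserve_acquire' are placeholders for the platform's real SCSI-reservation commands (they are not real utilities), and the disk group and service names are made up:

```shell
#!/bin/sh
# Hypothetical variant-b start script: make sure this node masters the
# shared disk group before starting the service. 'reserve_status' and
# 'reserve_acquire' stand in for the platform's actual reservation tools.

DISKGROUP=dg0

# return 0 once this node holds the reservation on the given disk group
ensure_reservation() {
  reserve_status "$1" && return 0   # already mastered - nothing to do
  reserve_acquire "$1"              # otherwise try to take the reservation
}

case "$1" in
  start)
    ensure_reservation "$DISKGROUP" || exit 1   # refuse to start without storage
    /etc/init.d/myservice start
    ;;
  stop)
    /etc/init.d/myservice stop
    ;;
esac
```

A non-zero exit here is exactly the case Martin mentions: monit ignores the return value today, so the failed start would only surface as a service timeout.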
> >
> >
> >It is possible to allow one or both methods (variant 1 needs Oliver's
> >patch, variant 2 needs the optional pid file location patch).
> >
> >+1 for Oliver's way
> >
> >Maybe it would be useful for others to have a 'howto' for building simple
> >clusters with monit :)
> >
> >Greetings,
> >Martin
> >
> >
> >----- Original Message -----
> >From: "Jan-Henrik Haukeland" <address@hidden>
> >To: <address@hidden>
> >Sent: Wednesday, October 23, 2002 7:47 AM
> >Subject: Re: additional feature for monit-3.0 (for clusters)
> >
> >
> >
> >I spoke with Oliver off list and asked him to send a mail to the list
> >for discussion, so does anyone have an opinion on this?
> >
> >Oliver Jehle <address@hidden> writes:
> >
> >
> >
> >>when using heartbeat and groups in monit, I've missed the following feature:
> >>
> >>monit should only monitor manually started resources, and after stopping
> >>a resource, monit should stop monitoring it.
> >>
> >>so I've implemented a third input value for autostart: "started". now
> >>monit monitors a resource only if you start it with "monit start"...
> >>
> >>why that... see below... it's my config for heartbeat with monit
> >>
> >>on every node:
> >>
> >>/etc/inittab starts monit
> >>/etc/rc3.d/ script executes "monit start heartbeat"
> >>/etc/init.d/monit-node1 runs "monit -g node1 start"
> >>/etc/init.d/monit-node2 runs "monit -g node2 start"
> >>
> >>so heartbeat can easily control the cluster state, and if one node fails,
> >>heartbeat starts monit-xxxx for the failing node and monit is instructed
> >>to start the services of the failing node and monitor them...
> >>
> >>
> >>--
> >>Oliver Jehle
> >>Monex AG
> >>Föhrenweg 18
> >>FL-9496 Balzers
> >>
> >>Tel: +423 388 1988
> >>Fax: +423 388 1980
> >>
> >>----
> >>I've not lost my mind. It's backed up on tape somewhere.
> >>----
> >>
> >>
> >>
> >
> >--
> >Jan-Henrik Haukeland
> >
> >
> >_______________________________________________
> >monit-dev mailing list
> >address@hidden
> >http://mail.nongnu.org/mailman/listinfo/monit-dev
> >
> >
> >
>




