gomd-devel

Re: [gomd-devel] <IRC> interesting IRC chat session about gomd...


From: Matthias Rechenburg
Subject: Re: [gomd-devel] <IRC> interesting IRC chat session about gomd...
Date: Sat, 27 Sep 2003 01:37:03 +0200
User-agent: KMail/1.4.3

Hey JP,

On Thursday 25 September 2003 23:18, Gian Paolo Ghilardi wrote:
> Hi all.
>
> First of all: CHPOX stuff will be implemented after the first public beta
> phase (we're not so far... ;) ).
> Ok?

sure, no prob. You decide the time.

>
> > read the mails+irc-log about auth + checkpointing and
> > here are my comments.
> >
> > about the auth:
> > I prefer the auth.conf for gomd for now
> > (later we can see if we improve this)
> > The feature should combine acl.conf + scx.conf
>
> I prefer three separate files ({acl/scx/auth}.conf) because I dislike long
> and hard-to-read files like smb.conf.

ok, agreed

>
> > about the checkpointing:
> > It is a very good idea to add this feature to gomd :) !
> > The users will love it.
> > I just think that automatically checkpointing the processes
> > that use the most CPU will not work so well.
> >
> > Here are my thoughts about the chpox-support :
> >
> > 1) we need register, unregister and list-registered operations.
> >    register should automatically check the binary program
> >    with ldd to find out which libraries need to be added
> >    to chpox first. -> even if it is a script that gets registered, it
> >    needs some libs, e.g. for bash
>
> np.
>
> > 2) unregister should unregister the process but not the libs
> >    because they may be needed by other registered processes
>
> ok.
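
A minimal sketch of points 1) and 2), assuming the ldd output is already
available as text. The names libs_from_ldd and Registry are hypothetical and
not part of gomd or chpox; how libraries are actually handed to chpox is not
shown here.

```python
import re


def libs_from_ldd(ldd_output):
    """Return the library paths listed in the text printed by ldd."""
    libs = []
    for line in ldd_output.splitlines():
        line = line.strip()
        # "libc.so.6 => /lib/libc.so.6 (0x...)" or "/lib/ld-linux.so.2 (0x...)"
        match = re.match(r"\S+ => (/\S+)", line) or re.match(r"(/\S+) \(0x", line)
        if match:
            libs.append(match.group(1))
    return libs


class Registry:
    """Hypothetical in-memory bookkeeping for registered processes."""

    def __init__(self):
        self.processes = {}  # pid -> command name
        self.libs = set()    # every library seen so far

    def register(self, pid, command, ldd_output):
        self.processes[pid] = command
        self.libs.update(libs_from_ldd(ldd_output))

    def unregister(self, pid):
        # Libraries are kept on purpose: other registered
        # processes may still need them (point 2).
        self.processes.pop(pid, None)

    def list_registered(self):
        return dict(self.processes)
```

Lines without a resolvable path (for example "not found" entries or a
virtual object without a file path) are skipped by the two patterns.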
>
> > 3) the registration will use a dump file named after the process
> >    name + pid, to be sure to have unique names for the process dumps
>
> dump format: [CMD]_[PID]_[DATE]_[TIME].dump Ok?
> (ex for updatedb => "updatedb_2028_2003-09-25_23.07.dump")

yep, good
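
A small sketch of how a name in this format could be built. The function
name dump_name is hypothetical; only the [CMD]_[PID]_[DATE]_[TIME].dump
layout comes from the mail above.

```python
from datetime import datetime


def dump_name(cmd, pid, when):
    """Build [CMD]_[PID]_[DATE]_[TIME].dump with minute resolution."""
    return f"{cmd}_{pid}_{when:%Y-%m-%d}_{when:%H.%M}.dump"


print(dump_name("updatedb", 2028, datetime(2003, 9, 25, 23, 7)))
# prints: updatedb_2028_2003-09-25_23.07.dump
```

With only hours and minutes in the name, two checkpoints of the same pid
within one minute would get the same file name.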

>
> > 4) if there are processes registered for checkpointing, gomd
> >    should send the checkpoint signal to them at a regular interval
> >    (the timeout can be static at first; maybe later it would be nice if one
> >    can configure it per process)
>
> cmd format in future checkpoint.conf:
> COMMAND:CHECKPOINT_INTERVAL
> - COMMAND will be searched as in SCX stuff (if COMMAND =="sander" => every
> sander command will be checkpointed).
> - if (CHECKPOINT_INTERVAL == 0) => command won't be registered
> Ok?
>

ok
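
A sketch of reading such a file, assuming one COMMAND:CHECKPOINT_INTERVAL
entry per line. The function name, the "#" comment syntax and the handling
of malformed lines are assumptions; the unit of the interval is not defined
in this thread, so the value is kept as a plain integer.

```python
def parse_checkpoint_conf(text):
    """Map COMMAND -> interval; an interval of 0 means "do not register"."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # comment syntax is an assumption
            continue
        command, sep, interval = line.rpartition(":")
        if not sep or not command:
            continue  # no COMMAND:INTERVAL pair on this line
        try:
            value = int(interval)
        except ValueError:
            continue  # interval is not a number
        if value > 0:
            entries[command] = value
    return entries


print(parse_checkpoint_conf("sander:60\nupdatedb:0\n"))
# prints: {'sander': 60}
```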

> > 5) if a process is checkpointed, gomd should/must move its
> >    dump file to a different name containing a timestamp.
> >    Otherwise the next checkpoint will overwrite the current dump file
> >    even if the registered process has crashed but is still alive. This will
> >    make it "un-restorable"   -> so we need timestamps for each
> >    checkpointed process dump to be able to restore it from any
> >    point in time at which a checkpoint was written.
> >    We have to take care of the disk usage with this feature because
> >    it may create a lot of dumps which are not removed.
>
> In constants.h => CHPOX_MAX_DUMPS_PER_PID 5 => kill if new dump # is >
> CHPOX_MAX_DUMPS_PER_PID.
> Ok? :)

nope, sorry, this will kill the last "good" process dump after 5
checkpoint sequences and will cause the same problem as
explained above ;)))
Also we cannot/should not remove them when the process stops
or is killed. The user might want to restore it in a week
or in a month.

Maybe we should just document it and give the users a "crontab entry"
which periodically cleans up all dumps; then it
is the user's choice whether he/she removes them automatically
(or manually once they are not needed any more)
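
A sketch of what such a cleanup job could do when called from cron. The
function name, the age threshold and the idea of matching "*.dump" in a
single directory are assumptions for illustration, not an agreed gomd
behaviour.

```python
import time
from pathlib import Path


def remove_old_dumps(dump_dir, max_age_days, now=None):
    """Delete *.dump files older than max_age_days; return the removed names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in sorted(Path(dump_dir).glob("*.dump")):
        # Only regular files whose last modification is before the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

The user decides whether to schedule this at all and how long dumps are
kept, so a dump that is still wanted for a restore is never removed by gomd
itself.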

>
> > 6) we have to think about how and when to remove process dumps
>
> Check answer #5

checked ;)

>
> > 7) we should add a check whether the chpox module is available; if not,
> >    the register/unregister/checkpoint commands should be disabled
> >    or display a notice that the user has to install chpox first.
> >    This can simply be done by the gomd init script (insmod the chpox module
> >    and check if it returns ok), which then starts gomd with an additional
> >    commandline parameter ....... just an idea.
>
> np.
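
As an alternative to checking the insmod return code, a sketch that looks
for the module in /proc/modules. The module name "chpox" as it would appear
there is an assumption, and both function names are hypothetical.

```python
def module_loaded(name, proc_modules_text):
    """True if `name` is the first field of a line in /proc/modules content."""
    return any(
        line.split()[:1] == [name] for line in proc_modules_text.splitlines()
    )


def chpox_available(path="/proc/modules"):
    """Check the running kernel; the module name "chpox" is an assumption."""
    try:
        with open(path) as handle:
            return module_loaded("chpox", handle.read())
    except OSError:
        return False  # no /proc/modules, so treat the module as unavailable
```

If chpox_available() returns False, the register/unregister/checkpoint
commands could be disabled or answer with the "install chpox first" notice.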
>
> > If we have all this together the user can simply add checkpointing HA
> > to his/her processes.
> >
> :)))
> :
> > .... as usual just Matt's mind  ;))
>
> Great mind... ;)
>
> Byez.
>
> <rejected>

good night ;))

Matt
-- 
E-mail  :  address@hidden
www     : http://www.openmosixview.com
an openMosix-cluster management GUI

Reality is for those who can't face Science Fiction.




