Re: [gomd-devel] <DAEMON>: (updated) .plan


From: Matthias Rechenburg
Subject: Re: [gomd-devel] <DAEMON>: (updated) .plan
Date: Sat, 6 Sep 2003 15:18:43 +0200
User-agent: KMail/1.4.3

Thanks for the explanation :) Now I get it ... and I like it too.

have fun,

Matt

On Saturday 06 September 2003 14:11, Gian Paolo Ghilardi wrote:
> Hi all.
>
> > Hey JP,
> >
> > On Thursday 04 September 2003 21:48, Gian Paolo Ghilardi wrote:
> > > Hi all.
> > >
> > > First of all, I've just reworked the shell output format (more
> > > readable, IMHO).
> > >
> > :)
> > > BTW I'm studying (yes: studying) how to implement Black Hole Detection.
> > > As I don't like simple thresholds (because they are "aseptic" and prone
> > > to false positives), I'd like to build a small and quick Fuzzy Logic
> > > engine to get the _real_ status of the cluster, including detection of
> > > malfunctioning nodes (aka "black holes").
> >
> > ok, could you explain a bit more about those rules, thresholds and black
> > holes ;) Couldn't we simply not count the values we get from a
> > malfunctioning node? ..... I am not sure I fully got it.
>
> Ok. Maybe my words weren't so clear...
>
> 1) I call "black hole" (in cluster context) a malfunctioning node.
> The purpose is not only to detect and signal out-of-order nodes but
> also to understand why they are in that situation.
>
> 2) To get the status of a node (and signal whether something is about to
> fail or has already failed) there are different approaches.
> The old-school approach uses thresholds. A threshold is a "limit number"
> (for example a critical temperature for the CPU).
> If we're under this value, all is well. Above this value an
> alarm/error/exception _must_ be raised/thrown.
> As you can see, this mechanism is simplistic but easy to implement.
> Unfortunately it's (too) prone to errors: each erroneous input _implies_ a
> "false positive" or "false negative" (as each output depends only on the
> inputs).
> Moreover the threshold approach is "dichotomous": true or false, correct or
> incorrect, 0 or 1,... => the system must decide and classify a situation
> into just two "cases" => situation A or not-A.
>
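
To make the threshold approach described above concrete, here is a minimal Python sketch. The 70 °C limit and the names `CPU_TEMP_LIMIT_C` and `threshold_alarm` are assumptions chosen for illustration; the thread does not give any concrete values or code.

```python
# Hypothetical critical CPU temperature; the thread names no actual limit.
CPU_TEMP_LIMIT_C = 70.0


def threshold_alarm(temp_c: float, limit_c: float = CPU_TEMP_LIMIT_C) -> bool:
    """Classify a reading into exactly two cases: alarm (True) or no alarm (False)."""
    return temp_c > limit_c


# Two nearly identical readings land on opposite sides of the limit,
# so a small measurement error is enough to flip the answer.
readings = [69.9, 70.1]
alarms = [threshold_alarm(t) for t in readings]
print(alarms)  # [False, True]
```

The two readings differ by 0.2 °C, yet the classification flips from "not-A" to "A", which is the kind of false positive or false negative the message complains about.
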
> As I dislike thresholds (they often take a long time to tune, and the
> resulting compromise is often terribly ugly), I'd like to use a
> Fuzzy-Logic-based approach.
>
> The idea is simple. Instead of setting limit values, we define "factual
> rules", something like "if it's too cold then increase the heating power".
> Obviously these rules can be adapted to use values (as particular ranges,
> the fuzzy sets).
> As facts like "it's too cold" are subjective and "not crisp", we enter a
> "fuzzy" world, without crisp, well-defined boundaries.
> So we can't speak _only_ about "system working" or "system not working":
> now we have the complete gray scale between white and black (system
> with an impending failure, system working perfectly, system in a critical
> situation,...).
> In such a context erroneous inputs (can) have minimal impact on the system
> behaviour.
>
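
The "gray scale" idea can be sketched with fuzzy sets over a CPU temperature reading. This is only an illustration: the set names ("normal", "warm", "critical"), the breakpoints (50, 65, 80 °C) and all function names are assumptions, not anything proposed in the thread.

```python
def ramp_up(x: float, lo: float, hi: float) -> float:
    """Membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def ramp_down(x: float, lo: float, hi: float) -> float:
    """Membership falling linearly from 1 at lo to 0 at hi."""
    return 1.0 - ramp_up(x, lo, hi)


def triangle(x: float, a: float, b: float, c: float) -> float:
    """Membership rising from a to a peak at b, then falling to 0 at c."""
    return min(ramp_up(x, a, b), ramp_down(x, b, c))


def fuzzify_cpu_temp(temp_c: float) -> dict:
    """Degree (0..1) to which a reading belongs to each hypothetical fuzzy set."""
    return {
        "normal": ramp_down(temp_c, 50.0, 65.0),
        "warm": triangle(temp_c, 50.0, 65.0, 80.0),
        "critical": ramp_up(temp_c, 65.0, 80.0),
    }


degrees = fuzzify_cpu_temp(57.5)
print(degrees)  # {'normal': 0.5, 'warm': 0.5, 'critical': 0.0}
```

With these assumed sets, readings of 69.9 °C and 70.1 °C differ in their "critical" degree by only about 0.013, so a small input error shifts the result slightly instead of flipping it.
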
> For example, the Anti-lock Braking System (ABS) of many vendors now uses
> FL instead of thresholds: with an incredibly small number of rules the
> system is able to handle all situations perfectly and can easily be made
> "adaptive". Moreover, a small number of rules => only a few Kbytes of
> memory required => efficient systems.
>
> IMHO oM (openMosix) could be dramatically faster and better with an FL
> approach (to decide process migration, etc...).
> If I had time I'd like to try to implement an FL engine for oM.
> I'd like to talk with Moshe about this proposal.
>
> > > FL is not so difficult => we only have to define rules (based on a small,
> > > finite set of selected parameters) instead of using value-based
> > > thresholds. We have a finite number of rules to control the whole BHD
> > > (Black Hole Detection) mechanism.
> > >
> > > As this is my first attempt with FL, I can guarantee nothing but I'd
> > > like to walk this way... :)
> >
> > go ahead :)
>
> Ok... ;)
>
> > > Parameters that can be used:
> > > - cpu, hdd, chipset temperatures (via lm_sensors)
> > > - load status
> > > - SMART status (hdd integrity check)
> > > - ...
> > >
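
As a rough sketch of how a small, finite rule set over the parameters listed above might look, the Python below combines CPU temperature, load and SMART status using min for AND and max for OR (a common fuzzy-logic convention). The three rules, the breakpoints and the names `rising` and `node_status` are all hypothetical; the thread does not define any actual rules.

```python
def rising(x: float, lo: float, hi: float) -> float:
    """Degree (0..1) to which x counts as 'high', linear between lo and hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def node_status(cpu_temp_c: float, load: float, smart_ok: bool) -> dict:
    """Evaluate three hypothetical rules and return a degree for each outcome."""
    temp_high = rising(cpu_temp_c, 55.0, 80.0)  # assumed breakpoints
    load_high = rising(load, 1.0, 4.0)          # assumed breakpoints
    disk_bad = 0.0 if smart_ok else 1.0         # SMART is treated as crisp

    # Rule 1: IF temperature is high AND load is high THEN node is overloaded.
    overloaded = min(temp_high, load_high)
    # Rule 2: IF disk is bad OR (temperature is high AND load is NOT high)
    #         THEN node has a hardware fault (hot while idle is suspicious).
    hardware_fault = max(disk_bad, min(temp_high, 1.0 - load_high))
    # Rule 3: the node is healthy to the degree that no other rule fires.
    healthy = 1.0 - max(overloaded, hardware_fault)

    return {
        "healthy": healthy,
        "overloaded": overloaded,
        "hardware_fault": hardware_fault,
    }


print(node_status(40.0, 0.5, True))   # cool and idle
print(node_status(80.0, 0.5, True))   # hot while idle
print(node_status(67.5, 2.5, True))   # in between on every input
```

Because each outcome gets a degree instead of a yes/no answer, the result also hints at why a node looks bad (overload versus a suspected hardware fault), which matches the stated goal of understanding why a node is out of order.
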
> > > Any comment/help/suggestion is welcome.
> >
> > I'll be out today; hopefully I'll find some time for testing tomorrow.
> > More comments then  ;))
>
> ok... thanks... ,)
>
> > > Byez.
> > >
> > > <rejected>
> > >
> > >
> > > << CVS NOTES >>
> > >
> > > IN *.cpp
> > > (+) cosmetics => reworked shell output format
> > >
> > >
> > >
> > > _______________________________________________
> > > gomd-devel mailing list
> > > address@hidden
> > > http://mail.nongnu.org/mailman/listinfo/gomd-devel
> >
> > all the best,
> >
> > Matt
> > --
> > E-mail :  address@hidden
> > www : http://www.openmosixview.com
> > an openMosix-cluster management GUI
> >
> > Men are from Mars.  Women are from Venus. Idiots are Universal.

-- 
E-mail  :  address@hidden
www     : http://www.openmosixview.com
an openMosix-cluster management GUI

 
# make sense
make: don't know how to make sense. Stop
