Re: [gomd-devel] <DAEMON>: (updated) .plan

From: Gian Paolo Ghilardi
Subject: Re: [gomd-devel] <DAEMON>: (updated) .plan
Date: Sat, 6 Sep 2003 14:11:44 +0200

Hi all.

> Hey JP,
> On Thursday 04 September 2003 21:48, Gian Paolo Ghilardi wrote:
> > Hi all.
> >
> > First of all, I've just reworked the shell output format (more readable,
> > IMHO).
> :)
> >
> > BTW I'm studying (yes: studying) to implement the Black Hole Detection.
> > As I don't like simple thresholds (because they are "aseptic" and prone to
> > false positives), I'd like to build a small and quick Fuzzy Logic engine to
> > get the _real_ status of the cluster including malfunctioning nodes
> > detection (aka "black holes").
> ok, could you explain a bit more about those rulez, thresholds and black
> holes ;) Couldn't we just simply not count the values we get from a
> malfunctioning node ? ..... i am not sure i fully got it.
Ok. Maybe my words weren't so clear...

1) I call a malfunctioning node a "black hole" (in the cluster context).
The purpose is not only to detect and signal out-of-order nodes but also to
understand why they are in that state.

2) To get the status of a node (and signal that something is about to fail or
has already failed) there are different approaches.
The old-school approach uses thresholds. A threshold is a "limit number" (for
example, a critical temperature for the CPU).
If we're under this value, all is well. Above this value an
alarm/error/exception _must_ be raised/thrown.
As you can see, this mechanism is simplistic but easy to implement.
Unfortunately it's (too) prone to errors: an erroneous input that crosses the
limit directly produces a "false positive" or "false negative" (as each output
depends only on the current input value).
Moreover the threshold approach is "dichotomous": true or false, correct or
incorrect, 0 or 1,... => the system must decide and classify a situation into
just two "cases" => situation A or not-A.
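
To make the dichotomy concrete, here is a minimal Python sketch of a
threshold check. The 70 degree limit and the name cpu_alarm are made up for
illustration only; nothing here is taken from gomd.

```python
# Hypothetical critical CPU temperature in degrees Celsius (made-up value).
CPU_TEMP_LIMIT = 70.0


def cpu_alarm(temp_c: float) -> bool:
    """Crisp threshold check: the answer is only ever True or False."""
    return temp_c > CPU_TEMP_LIMIT


# Two readings that differ by a fraction of a degree get opposite verdicts,
# so a small sensor error near the limit flips the result.
readings = [69.9, 70.1]
verdicts = [cpu_alarm(t) for t in readings]
print(verdicts)
```

The point of the sketch is only that nothing exists between "alarm" and
"no alarm": a node at 69.9 and a node at 20 degrees look identical.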

As I dislike thresholds (they often take a long time to tune and the
resulting compromise is often terribly ugly) I'd like to use a
Fuzzy-Logic-based approach.

The idea is simple. Instead of setting limit values, we define "factual
rules", something like "if it's too cold then increase the heating power".
Obviously these rules can be adapted to use values (as particular ranges,
the fuzzy sets).
As facts like "it's too cold" are subjective and "not crisp", we enter a
"fuzzy" world, without crisp, well-defined boundaries.
So we can't speak _only_ about "system working" or "system not working": now
we have the complete gray scale between white and black (system with an
incoming failure, system working perfectly, system in a critical
state, ...).
In such a context erroneous inputs (can) have minimal impact on the system.
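
As a rough sketch of what a fuzzy set looks like in code, the Python below
maps a CPU temperature to degrees of membership in three sets. The set names
("normal", "warm", "critical") and the breakpoints at 50/65/80 degrees are
assumptions chosen for the example, not tuned values.

```python
def ramp_up(x, lo, hi):
    """Membership that is 0 at or below lo, 1 at or above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def ramp_down(x, lo, hi):
    """Mirror image of ramp_up: 1 at or below lo, 0 at or above hi."""
    return 1.0 - ramp_up(x, lo, hi)


def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify_cpu_temp(temp_c):
    """Return the degree (0..1) to which temp_c belongs to each fuzzy set.

    The breakpoints are hypothetical and only serve the illustration.
    """
    return {
        "normal": ramp_down(temp_c, 50.0, 65.0),
        "warm": triangle(temp_c, 50.0, 65.0, 80.0),
        "critical": ramp_up(temp_c, 65.0, 80.0),
    }


# Two nearly identical readings now produce nearly identical descriptions.
a = fuzzify_cpu_temp(69.9)
b = fuzzify_cpu_temp(70.1)
print(a)
print(b)
```

With overlapping sets like these, a reading that is off by a fraction of a
degree only shifts the membership degrees slightly instead of flipping a
yes/no verdict.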

For example, the Anti-lock Braking System (ABS) of many vendors now uses
FL instead of thresholds: with an incredibly small number of rules the
system is able to handle all situations very well and can easily be made
"adaptive". Moreover, small number of rules => only a few KBytes of memory
required => efficient systems.

IMHO oM could be dramatically faster and better with an FL approach (to
decide process migration, etc.).
If I had the time, I'd like to try implementing an FL engine for oM.
I'd like to talk with Moshe about this proposal.

> > FL is not so difficult => we only have to define rules (based on a small,
> > finite set of selected parameters) instead of using value-based
> > thresholds. We have a finite number of rules to control the whole BHD
> > mechanism.
> >
> > As this is my first attempt with FL, I can guarantee nothing but I'd like
> > to walk this way... :)
> go ahead :)
Ok... ;)

> >
> > Parameters that can be used:
> > - cpu, hdd, chipset temperatures (via lmsensors)
> > - load status
> > - SMART status (hdd integrity check)
> > - ...
> >
> > Any comment/help/suggestion is welcome.
> will be out today, hopefully find some time for testing tomorrow.
> More comments then  ;))
ok... thanks... ,)
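
To give an idea of how a few rules over those parameters could be combined,
here is a minimal Python sketch. It is not the planned BHD engine: the
function name black_hole_risk, the breakpoints, the four rules and the risk
levels they vote for are all assumptions, and the final step is a plain
weighted average of the rule outputs, picked here only because it is short.

```python
def ramp_up(x, lo, hi):
    """Membership that is 0 at or below lo, 1 at or above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def black_hole_risk(cpu_temp_c, load):
    """Return a risk score in 0..1 from CPU temperature and load.

    All breakpoints and rule outputs below are hypothetical.
    """
    # Fuzzify the two inputs.
    hot = ramp_up(cpu_temp_c, 60.0, 80.0)
    cool = 1.0 - hot
    busy = ramp_up(load, 1.0, 4.0)
    idle = 1.0 - busy

    # Each rule: (firing strength, risk level the rule votes for).
    # "and" is modelled with min().
    rules = [
        (min(cool, idle), 0.0),  # cool and idle  => healthy
        (min(cool, busy), 0.2),  # cool and busy  => fine, just working
        (min(hot, busy), 0.6),   # hot and busy   => keep an eye on it
        (min(hot, idle), 1.0),   # hot while idle => suspicious node
    ]

    # Weighted average of the votes; total is never 0 because the
    # complementary sets guarantee at least one rule fires.
    total = sum(strength for strength, _ in rules)
    return sum(strength * risk for strength, risk in rules) / total


print(black_hole_risk(40.0, 0.5))
print(black_hole_risk(70.0, 0.5))
print(black_hole_risk(90.0, 0.2))
```

A SMART status or other sensors could be added in the same way, as extra
fuzzy inputs and extra rules, without changing the overall shape.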

> >
> > Byez.
> >
> > <rejected>
> >
> >
> > << CVS NOTES >>
> >
> > IN *.cpp
> > (+) cosmetics => reworked shell output format
> >
> >
> >
> > _______________________________________________
> > gomd-devel mailing list
> > address@hidden
> > http://mail.nongnu.org/mailman/listinfo/gomd-devel
> all the best,
> Matt
> --
> E-mail :  address@hidden
> www : http://www.openmosixview.com
> an openMosix-cluster management GUI
> Men are from Mars.  Women are from Venus. Idiots are Universal.
