[Sh-X] MySQL and Shaman-X design draft proposal (long mail)

From: Dominique Chabord
Subject: [Sh-X] MySQL and Shaman-X design draft proposal (long mail)
Date: Sat, 20 Mar 2004 10:28:19 +0100


These are some comments and open proposals based on the document MySQL
Replication - Failover assumptions. Thanks to Philipp Smith at Marketgrid
for this document.

I'm coming to the conclusion that option 2.3, "custom", as described in the
document, is probably the one we should focus on.

I had some thoughts about how we could manage this topology easily and
safely with the wdx product. I think I am close to an elegant proposal.

In manual mode, the switch-to-slave, switch-back and slave-configuration
procedures are designed to minimize the system reconfiguration that has to
be done manually. I agree that if things are done manually, they should
remain as simple as possible.

Let's consider the "custom 2.3" topology (I call "Head-of-slaves" the slave
which is connected to the master).
If I understand correctly, the administrator identifies several situations,
which are:
    - all OK: I do nothing
    - Master is down: I switch to "Head-of-slaves"
    - Master is back: I switch back to it
    - Head-of-slaves is down: I reconfigure slaves (I guess with a new

Therefore, the suggested operational procedure is:
1- identify the situation
2- process the right sequence
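
As an illustration only, the two-step manual procedure can be sketched as a
lookup from an identified situation to a sequence. The names Situation,
PROCEDURES and identify, and the three boolean inputs, are hypothetical and
not part of the original document. The double-failure branch shows why the
list of situations grows quickly in this approach.

```python
from enum import Enum, auto


class Situation(Enum):
    ALL_OK = auto()
    MASTER_DOWN = auto()
    MASTER_BACK = auto()
    HEAD_OF_SLAVES_DOWN = auto()


# Step 2: one manual sequence per situation (wording taken from the list above).
PROCEDURES = {
    Situation.ALL_OK: "do nothing",
    Situation.MASTER_DOWN: "switch to Head-of-slaves",
    Situation.MASTER_BACK: "switch back to master",
    Situation.HEAD_OF_SLAVES_DOWN: "reconfigure slaves",
}


def identify(master_up, head_up, serving_from_head):
    """Step 1: map observed state to one of the listed situations.

    Hypothetical inputs: whether the master and the Head-of-slaves answer,
    and whether clients are currently served by the Head-of-slaves.
    """
    if not master_up and not head_up:
        # Multiple failures are not covered by the four listed situations.
        raise LookupError("double failure: no procedure listed")
    if not master_up:
        return Situation.MASTER_DOWN
    if serving_from_head:
        return Situation.MASTER_BACK
    if not head_up:
        return Situation.HEAD_OF_SLAVES_DOWN
    return Situation.ALL_OK


if __name__ == "__main__":
    situation = identify(master_up=False, head_up=True, serving_from_head=False)
    print(situation.name, "->", PROCEDURES[situation])
```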

If we consider multiple failures and site splitting, other combinations
should probably be listed. This is where this approach becomes complex,
because the situations can be many if we intend to cover all potential
risks.

So now, what about automating this in Shaman-X's way?
Shaman-X is based on wdx which has no centralized referee.
A wdx is a kind of collaborative "team" of computers ;-)
We program directives (scripts) that are repeated on all computers, so they
"understand" what the global stake is and contribute if they can. This
is the framework I use below.

We must distinguish two cases:
case 1- the operator wants to start, stop, re-arrange his servers
case 2- a failure happens and the system rearranges itself (high
availability)

In wdx "philosophy", we tend to consider that the same mechanisms should be
used for both cases.
Basically we do this:

1- the system is in operational situation
2- an event happens: all computers of the system react individually to this
event
3- the system is back to operational situation

To do this, I propose that we "educate" every computer with the following:

1- objectives (per design):
O1- we always need an up to date master : wdx does this by assigning a
mastertoken to this role in a mission called master_db ("mastertoken" is a
keyword of wdx)
O2- slave computers which have up-to-date data can replace the master in
case it disappears. Therefore, slaves join master_db mission when they are
up to date. They leave it if they are not.
O3- we always need a designated head-of-slaves: therefore, wdx must assign
a second mastertoken to this role in a second mission called slave_db
O4- any slave can replace the head-of-slaves, but the master cannot.
Therefore, all computers join slave_db mission when they start, they leave
it if they become master, and they join back if they are no longer master.
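
A minimal sketch of objectives O1 to O4, assuming a node can be described by
two flags. The Node class and the missions_for function are hypothetical
names used for illustration; they are not wdx constructs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    up_to_date: bool = False
    is_master: bool = False


def missions_for(node):
    """Return the missions a node should belong to under O1-O4."""
    missions = set()
    # O1/O2: the master and every up-to-date slave are members of master_db.
    if node.is_master or node.up_to_date:
        missions.add("master_db")
    # O4: every computer except the master is a member of slave_db.
    if not node.is_master:
        missions.add("slave_db")
    return missions


if __name__ == "__main__":
    for node in (Node("db1", True, True), Node("db2", True), Node("db3")):
        print(node.name, sorted(missions_for(node)))
```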

2- behavior of a good citizen computer (scripts):
P1- if the master fails, "I" propose myself to do the job if I am an
up-to-date slave.
P2- if a new master is identified, I use it to update my data if I am
P3- if the head-of-slaves fails, I propose myself to become the new
head-of-slaves if I am a slave (I don't need to be up to date)
P4- if a new head-of-slaves is identified, I use it to update my data if I
am a slave
P5- when my data catch-up is completed, I am an up-to-date slave and I join
master_db mission
P6- if for any reason my data are no longer updated, I leave master_db
mission: my data are not up-to-date
P7- if I am told to become a master, I reconfigure myself if I was up to
date and I leave slave_db mission
P6- if I am told to resign from master, I reconfigure myself and join back
slave_db mission
P7- if the head-of-slaves is not up-to-date, I am not up-to-date either if I
am a slave
P8- if I am back from boot, I consider that my data are not up-to-date, I
join slave_db mission, and I identify the head-of-slaves (if any) to catch
up from
3- operator's actions (graphic interface or command line):
A1- resign from master role: "wdx master_db -nomaster" frees the mastertoken
and substitutes of master_db will compete to get it. wdx tells the winner to
become masterholder
A2- replace the master: "wdx master_db -getmaster" grabs mastertoken and
creates a double master situation. The previous masterholder cancels its
A3- resign from head-of-slaves: "wdx slave_db -nomaster" frees the
mastertoken and a substitute of slave_db will get it
A4- replace the head-of-slaves: "wdx slave_db -getmaster"; the former
masterholder must resign and this member is now masterholder.

4- failures:
F1- master failure: substitutes of master_db mission compete for the role
(equivalent to -nomaster command)
F2- head-of-slaves failure: substitutes of slave_db mission compete for the
role (equivalent to -nomaster command)
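
To make the equivalence between operator actions and failures concrete, here
is a toy model of one mission and its mastertoken. It is a sketch only: the
Mission class and its methods are hypothetical, and the rule used to pick
the winner (first name in sorted order) is an assumption, since how wdx
substitutes actually compete is not described here.

```python
class Mission:
    """Toy model of a mission with a single mastertoken."""

    def __init__(self, name):
        self.name = name
        self.members = set()
        self.holder = None

    def _compete(self, exclude=None):
        # Assumed rule: the first remaining member in sorted order wins.
        candidates = sorted(self.members - {exclude})
        self.holder = candidates[0] if candidates else None
        return self.holder

    def release(self):
        """Holder resigns but stays a member (what -nomaster is said to do)."""
        previous, self.holder = self.holder, None
        return self._compete(exclude=previous)

    def fail(self, node):
        """A member disappears; if it held the token, substitutes compete."""
        self.members.discard(node)
        if node == self.holder:
            self.holder = None
            return self._compete()
        return self.holder

    def grab(self, node):
        """A member takes the token (what -getmaster is said to do)."""
        self.members.add(node)
        self.holder = node
        return node


if __name__ == "__main__":
    master_db = Mission("master_db")
    master_db.members.update({"db1", "db2", "db3"})
    master_db.grab("db1")
    print("after failure of db1:", master_db.fail("db1"))
    print("after resignation:", master_db.release())
```

In this model a failure and a resignation go through the same competition,
which is the point made by F1 and F2.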

5- system start and stop
I1- stop database: several possibilities, example: "wdx
master_db -exception" triggers a stop_script on all computers of master_db
I2- start database: a consequence of P7 + P8 is that at general power-on, no
computer will join master_db mission and become master. Either we calculate
which computer is the most up-to-date, or we wait for the operator to pick
the right one. "wdx slave_db -reserve preferred_master" can make the
computer join master_db mission. It will then become master because of
P1; P2 will result in at least one up-to-date slave
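
One possible way to "calculate which computer is the most up-to-date" is to
compare replication coordinates. The sketch below assumes every computer
reports a (binary log file, position) pair expressed in the same master's
log, for instance taken from SHOW SLAVE STATUS. That assumption may not hold
for slaves fed by the head-of-slaves, whose coordinates refer to a different
log. The function names and the sample data are hypothetical.

```python
def log_coordinates(file_name, position):
    """Turn ('mysql-bin.000012', 4501) into a comparable (12, 4501) tuple."""
    index = int(file_name.rsplit(".", 1)[1])
    return (index, position)


def most_up_to_date(reports):
    """Return the computer whose executed coordinates are the furthest ahead.

    reports maps a computer name to a (log file name, position) pair.
    """
    return max(reports, key=lambda name: log_coordinates(*reports[name]))


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    reports = {
        "db1": ("mysql-bin.000012", 4501),
        "db2": ("mysql-bin.000013", 98),
        "db3": ("mysql-bin.000012", 9000),
    }
    print(most_up_to_date(reports))
```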

6- disaster situations if computers are split between two sites
D1- A site is declared down: all its computers resign from their respective
roles (master and head-of-slaves) and leave both master_db and slave_db
If a slave is still valid it may continue to update itself (I guess)
D2- The site is back: Computers join slave_db mission
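
D1 and D2 can be sketched as two operations over a list of computers. The
dictionary layout (keys "site", "role", "missions") and the function names
are hypothetical, chosen only to illustrate the two rules.

```python
def declare_site_down(nodes, site):
    """D1: computers of the site resign their roles and leave both missions."""
    for node in nodes:
        if node["site"] == site:
            node["role"] = None
            node["missions"] = set()


def declare_site_back(nodes, site):
    """D2: computers of the site join slave_db mission again."""
    for node in nodes:
        if node["site"] == site:
            node["missions"] = {"slave_db"}


if __name__ == "__main__":
    nodes = [
        {"name": "db1", "site": "A", "role": "master", "missions": {"master_db"}},
        {"name": "db2", "site": "B", "role": "head-of-slaves",
         "missions": {"master_db", "slave_db"}},
    ]
    declare_site_down(nodes, "A")
    declare_site_back(nodes, "A")
    print(nodes[0])
```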


Are there any cases which remain uncovered by this design?

Any comment welcome
Best regards
