Re: [Freeipmi-users] problems with bmc-watchdog
From: Dave Love
Subject: Re: [Freeipmi-users] problems with bmc-watchdog
Date: Thu, 06 May 2010 00:00:31 +0100
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/21.4 (gnu/linux)
Al Chu <address@hidden> writes:
> Let's try some tests. Could you run bmc-watchdog "by hand" to make sure
> things look like it's working right? "by hand", I mean something like
> run:
>
> bmc-watchdog --get (see what the current watchdog settings are)
> bmc-watchdog --set ... (with the same options as the daemon, except not the
> reset interval '-e 60')
> bmc-watchdog --get (see that things are set)
> bmc-watchdog --start
> bmc-watchdog --get (make sure things changed, timer is running)
> bmc-watchdog --get (make sure timer is counting down)
> bmc-watchdog --reset
> bmc-watchdog --get (make sure timer has reset)
>
> (and you probably want to do bmc-watchdog --stop at the end)
I should have said that what puzzles me is when it reports the timer as
Stopped. This is an RH5, Sun ILOM 2 system (not ELOM, as I thinko'd before).
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Set
Timer Use BIOS POST Flag: Set
Timer Use BIOS OS Load Flag: Set
Timer Use BIOS SMS/OS Flag: Set
Timer Use BIOS OEM Flag: Set
Initial Countdown: 900 seconds
Current Countdown: 900 seconds
# bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Set
Timer Use BIOS POST Flag: Set
Timer Use BIOS OS Load Flag: Set
Timer Use BIOS SMS/OS Flag: Set
Timer Use BIOS OEM Flag: Set
Initial Countdown: 900 seconds
Current Countdown: 900 seconds
# bmc-watchdog --start
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds
# sleep 2
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 898 seconds
# bmc-watchdog --reset
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds
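The oddity in the transcript above is that the timer is still reported as Stopped after --start, yet Current Countdown drops from 900 to 898 across the sleep. A minimal Python sketch of that check follows. It is not part of FreeIPMI; the function names are hypothetical, and it assumes only the "Key: Value" layout that --get prints above.

```python
def parse_watchdog(text):
    """Parse 'Key: Value' lines from 'bmc-watchdog --get' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon (prompts, blanks)
            fields[key.strip()] = value.strip()
    return fields


def countdown_seconds(fields, key="Current Countdown"):
    """Values look like '898 seconds'; return the leading integer."""
    return int(fields[key].split()[0])


def stopped_but_counting(before, after):
    """True if both snapshots say Stopped yet the countdown decreased."""
    return (
        before.get("Timer") == "Stopped"
        and after.get("Timer") == "Stopped"
        and countdown_seconds(after) < countdown_seconds(before)
    )


# Abbreviated snapshots taken from the ILOM transcript (after --start, after sleep 2).
after_start = "Timer Use: SMS/OS\nTimer: Stopped\nCurrent Countdown: 900 seconds"
after_sleep = "Timer Use: SMS/OS\nTimer: Stopped\nCurrent Countdown: 898 seconds"

print(stopped_but_counting(parse_watchdog(after_start),
                           parse_watchdog(after_sleep)))  # prints True
```

The same check on the ELOM output further down would return False, because there the Timer field reads Running once the timer has been started.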
> This can help us isolate things. If the above works, then maybe there
> is a timing issue within your BMC that we need to get around. I'm a
> little perplexed as to why it would work with the openipmi driver. It's
> possible it's more generous on some timeouts of packets and such. Or
> maybe the openipmi driver's own watchdog implementation/code has done
> something to massage the BMC that I'm unaware of.
I probably wasn't clear. What I meant was:
# bmc-watchdog -g --config-file /dev/null
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
bmc-watchdog: Get Watchdog Timer Error: BMC Busy
in contrast to:
# bmc-watchdog -g --config-file /dev/null -D OPENIPMI|head -1
Timer Use: SMS/OS
...
and
# bmc-info --config-file /dev/null
Device ID : 32
...
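For anyone wanting to script the contrast above, here is a rough Python sketch that runs the same -g command and retries with -D OPENIPMI when the default driver reports the BMC as busy. The command-line options are the ones shown above; everything else is an assumption: the function names are hypothetical, and I am guessing that the "BMC busy" text can be found in the captured stdout or stderr.

```python
import subprocess


def run_get(driver=None, runner=subprocess.run):
    """Run 'bmc-watchdog -g --config-file /dev/null', optionally with '-D <driver>'."""
    cmd = ["bmc-watchdog", "-g", "--config-file", "/dev/null"]
    if driver:
        cmd += ["-D", driver]
    proc = runner(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr


def get_with_fallback(runner=subprocess.run):
    """Return (driver_used, stdout); driver_used is None for the default driver.

    Assumption: a busy BMC shows up as 'BMC busy' (any case) in the output.
    """
    out, err = run_get(runner=runner)
    if "bmc busy" in (out + err).lower():
        out, err = run_get(driver="OPENIPMI", runner=runner)
        return "OPENIPMI", out
    return None, out
```

The runner parameter exists only so the logic can be exercised without a BMC; in normal use it defaults to subprocess.run and the call is simply get_with_fallback().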
Actually now it's obvious there's something wrong with the ILOM, thanks.
I've now tried on an x2200M2 with ELOM with the results below (and I
don't have to specify the openipmi driver). I guess I won't get
anywhere with a service request on this -- especially as I'm only doing
it because Sun couldn't fix the hangups on the Thumper -- but perhaps
you have a simple idea for a fix?
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 0 seconds
# bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 0 seconds
# bmc-watchdog --start
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds
# sleep 2
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 898 seconds
# bmc-watchdog --reset
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 899 seconds