Changes between Version 28 and Version 29 of GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-1


Timestamp:
05/23/12 12:32:55 (12 years ago)
Author:
chaos@bbn.com
Comment:

--

  • GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-1

    v28 v29  
    3636|| 5B         || [[Color(orange,Blocked)]]      ||                      || exoticket:28       || Tim is working with ExoGENI to get the vlan 1750 interface setup (exoticket:28) ||
    3737|| 5C         || [[Color(orange,Blocked)]]      ||                      || exoticket:29       || blocked on inquiry about how to determine what command runs a particular check; blocked on resolution of inconsistent site admin nagios access ||
    38 || 5D         ||                               ||                      ||                    || ready to test                                                                                                 ||
     38|| 5D         || [[Color(#63B8FF,In progress)]] ||                      ||                    || ready to test                                                                                                 ||
    3939|| 6A         ||                                ||                      ||                    || ready to test                                                                                                 ||
    4040|| 6B         ||                                ||                      ||                    || ready to test                                                                                                 ||
     
    679679   * The duration of the outage
    680680
     681==== Results of testing: 2012-05-23 ====
     682
     683E-mail notification for the chaos user on BBN rack nagios was configured on 2012-04-26.  As of 2012-05-23T12:00, I have received 5472 notification messages.
     684
     685Investigation of representative messages:
     686
     687Item 1: messages related to the `bbn-hn.exogeni.net` service "Multipath 360080e50002d03ac000002cc4f69a431":
     688 * On 2012-04-26, I received a problem report:
     689{{{
     690From: rack_bbn@bbn-hn.exogeni.net (OMD site rack_bbn)
     691Date: Thu, 26 Apr 2012 03:22:08 +0000
     692To: chaos@bbn.com
     693Subject: *** PROBLEM *** bbn-hn.exogeni.net / Multipath
     694 360080e50002d03ac000002cc4f69a431 is CRITICAL
     695
     696--SERVICE-ALERT-------------------
     697-
     698- Hostaddress: 192.1.242.3
     699- Hostname:    bbn-hn.exogeni.net
     700- Service:     Multipath 360080e50002d03ac000002cc4f69a431
     701- - - - - - - - - - - - - - - - -
     702- State:       CRITICAL
     703- Date:        2012-04-26 03:22:08
     704- Output:      CRIT - (mpathb) paths expected: 4, paths active: 2               
     705-
     706----------------------------------
     707}}}
     708 * On 2012-05-23, I received the recovery report:
     709{{{
     710From: rack_bbn@bbn-hn.exogeni.net (OMD site rack_bbn)
     711Date: Wed, 23 May 2012 14:38:46 +0000
     712To: chaos@bbn.com
     713Subject: *** RECOVERY *** bbn-hn.exogeni.net / Multipath
     714 360080e50002d03ac000002cc4f69a431 is OK
     715
     716--SERVICE-ALERT-------------------
     717-
     718- Hostaddress: 192.1.242.3
     719- Hostname:    bbn-hn.exogeni.net
     720- Service:     Multipath 360080e50002d03ac000002cc4f69a431
     721- - - - - - - - - - - - - - - - -
     722- State:       OK
     723- Date:        2012-05-23 14:38:46
     724- Output:      OK - (mpathb) paths expected: 2, paths active: 2
     725-
     726----------------------------------
     727}}}
     728 * The service history shows no other entries for this service: [https://bbn-hn.exogeni.net/rack_bbn/check_mk/view.py?host=bbn-hn.exogeni.net&site=&service=Multipath%20360080e50002d03ac000002cc4f69a431&view_name=svcevents]
     729 * The service recovered because Jonathan fixed an inaccurate check which was looking for 4 paths when it should have been looking for 2 paths.
     730 * These notices imply that nagios, as configured on this rack, sends a notification only when a service's state changes, and does not repeat the notification while the service remains in an unhealthy state.
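
As a rough illustration of the gap between the two notices quoted above, the following Python sketch pulls the `Hostname`, `Service`, `State` and `Date` fields out of each message body and subtracts the dates. The helper names `parse_alert` and `outage_duration` are hypothetical and are not part of nagios or check_mk. The field layout is assumed to match the two messages shown here, and both `Date` values are assumed to be in the same time zone.

```python
import re
from datetime import datetime

# Matches the "- Field:   value" lines of a notification body.
# Assumption: the layout is the one seen in the two messages quoted above.
FIELD_RE = re.compile(
    r"^- (Hostname|Service|State|Date):[ \t]+(.*?)[ \t]*$", re.MULTILINE
)


def parse_alert(body):
    """Return the Hostname, Service, State and Date fields of one notice."""
    fields = dict(FIELD_RE.findall(body))
    fields["Date"] = datetime.strptime(fields["Date"], "%Y-%m-%d %H:%M:%S")
    return fields


def outage_duration(problem_body, recovery_body):
    """Time between a PROBLEM notice and the matching RECOVERY notice."""
    problem = parse_alert(problem_body)
    recovery = parse_alert(recovery_body)
    same_service = (problem["Hostname"], problem["Service"]) == (
        recovery["Hostname"],
        recovery["Service"],
    )
    if not same_service:
        raise ValueError("notices refer to different services")
    return recovery["Date"] - problem["Date"]


# Relevant lines copied from the two messages quoted above.
PROBLEM = """\
- Hostname:    bbn-hn.exogeni.net
- Service:     Multipath 360080e50002d03ac000002cc4f69a431
- State:       CRITICAL
- Date:        2012-04-26 03:22:08
"""
RECOVERY = """\
- Hostname:    bbn-hn.exogeni.net
- Service:     Multipath 360080e50002d03ac000002cc4f69a431
- State:       OK
- Date:        2012-05-23 14:38:46
"""

print(outage_duration(PROBLEM, RECOVERY))  # 27 days, 11:16:38
```

A gap of this length with no intermediate notices for the service is consistent with notification on state change only.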
     731
     732Item 2: messages related to the `8052.bbn.xo` services "Interface Ethernet30" and "Interface Ethernet40":
     733
    681734== Step 6: Setup contact info and change control procedures ==
    682735