Changes between Initial Version and Version 1 of GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-1


Timestamp:
05/03/12 16:54:12 (12 years ago)
Author:
chaos@bbn.com

[[PageOutline]]

= Detailed test plan for EG-ADM-1: Rack Receipt and Inventory Test =

''This page is GPO's working page for performing EG-ADM-1.  It is public for informational purposes, but it is not an official status report.  See [wiki:GENIRacksHome/ExogeniRacks/AcceptanceTestStatus] for the current status of ExoGENI acceptance tests.''

== Page format ==

 * The status chart summarizes the state of this test
 * The high-level description from the test plan contains text copied exactly from the public test plan and acceptance criteria pages.
 * The steps contain things I will actually do/verify:
   * Steps may be composed of related substeps where I find this useful for clarity
   * Each step is identified as either "(prep)" or "(verify)":
     * Prep steps are just things we have to do.  They're not tests of the rack, but are prerequisites for subsequent verification steps
     * Verify steps are steps in which we will actually look at rack output and make sure it is as expected.  They contain a '''Using:''' block, which lists the steps to run the verification, and an '''Expect:''' block, which lists what outcome is expected for the test to pass.

== Status of test ==

Meaning of states:
 * [[Color(green,okay)]]: Step is completed and passed (for a verification step), or is completed (for a prep step)
 * [[Color(red,failed)]]: Step is completed and failed, and is not being revisited
 * in progress: We are currently testing or iterating on this step
 * [[Color(orange,waiting)]]: Step is blocked by some other step or activity

|| '''Step''' || '''State'''           || '''Date completed'''    || '''Comments''' ||
|| 1          || [[Color(green,okay)]] || 2012-02-23 - 2012-02-24 ||                ||
|| 2A         || [[Color(orange,waiting)]] ||                     || blocked on a full IP-to-hostname mapping for the subnet (vjo) ||
|| 2B         || [[Color(orange,waiting)]] ||                         || blocked on 2A ||
|| 2C         || [[Color(orange,waiting)]] ||                         || blocked on 2B ||
|| 3A         ||                       ||                         || I think this works and just need to check it ||
|| 3B         || [[Color(orange,waiting)]] ||                         || blocked on resolving LDAP/RADIUS/serial issues in the rack (ckh) ||
|| 3C         ||                       ||                         || I think this works and just need to check it ||
|| 3D         ||                       ||                         || I think this works and just need to check it ||
|| 3E         || [[Color(orange,waiting)]] ||                         || blocked on a working bare metal node implementation (vjo) ||
|| 4A         ||                       ||                         || not blocked, needs to be done ||
|| 4B         ||                       ||                         || not blocked, needs to be done ||
|| 4C         ||                       ||                         || not blocked, needs to be done ||
|| 4D         || [[Color(orange,waiting)]] ||                         || blocked on RENCI submitting the govtprop form ||
|| 5A         ||                       ||                         || I think this works and just need to check it ||
|| 5B         ||                       ||                         || Tim needs to negotiate getting the 1750 interface set up (GST [gst:3667]) ||
|| 5C         ||                       ||                         || I think this works and just need to check it ||
|| 5D         ||                       ||                         || not blocked, needs to be done ||
|| 6A         ||                       ||                         || I think this works and just need to check it ||
|| 6B         || [[Color(orange,waiting)]] ||                         || blocked on RENCI submitting the information to GMOC ||
|| 6C         ||                       ||                         || I think this is done ||

== High-level description from test plan ==

This "test" uses BBN as an example site by verifying that we can do all the things we need to do to integrate the rack into our standard local procedures for systems we host.

=== Procedure ===

 * ExoGENI and GPO power and wire the BBN rack
 * GPO configures the exogeni.gpolab.bbn.com DNS namespace and 192.1.242.0/25 IP space, and enters all public IP addresses for the BBN rack into DNS.
 * GPO requests and receives administrator accounts on the rack and read access to ExoGENI Nagios for GPO sysadmins.
 * GPO inventories the physical rack contents, network connections and VLAN configuration, and power connectivity, using our standard operational inventories.
 * GPO, ExoGENI, and GMOC share information about contact information and change control procedures, and ExoGENI operators subscribe to GENI operations mailing lists and submit their contact information to GMOC.

=== Criteria to verify as part of this test ===

 * VI.02. A public document contains a parts list for each rack. (F.1)
 * VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
 * VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
 * VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
 * VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
 * VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with Legal, Law Enforcement and Regulatory(LLR) Plan, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
 * VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
 * VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
 * VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)

== Step 1 (prep): ExoGENI and GPO power and wire the BBN rack ==

This was done on 2012-02-23 and 2012-02-24, and Chaos took rough notes at [wiki:ChaosSandbox/ExogeniRackNotes].

== Step 2: Configure and verify DNS ==

''(This is GST [gst:3354] item 5.)''

=== Step 2A (verify): Find out what IP-to-hostname mapping to use ===

'''Using:'''
 * If the rack IP requirements documentation for the rack exists:
   * Review that documentation and determine what IP-to-hostname mappings should exist for `192.1.242.0/25`
 * Otherwise:
   * Iterate with `exogeni-ops` to determine the IP-to-hostname mappings to use for `192.1.242.0/25`

'''Expect:'''
 * Reasonable IP-to-hostname mappings for 126 valid IPs allocated for ExoGENI use in `192.1.242.0/25`
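
The figure of 126 follows from the prefix length: a /25 leaves 7 host bits, so the subnet holds 128 addresses, of which the network address (`192.1.242.0`) and the broadcast address (`192.1.242.127`) are conventionally not assigned to hosts. A minimal sketch of that arithmetic follows; the function name `usable_hosts` is made up for illustration and is not part of any rack tooling.

```shell
# Number of assignable host addresses in an IPv4 subnet with the given
# prefix length (meaningful for prefixes /1 through /30).
usable_hosts() {
    prefix=$1
    host_bits=$((32 - prefix))
    # Total addresses in the subnet, minus network and broadcast addresses.
    echo $(( (1 << host_bits) - 2 ))
}
```

Running `usable_hosts 25` prints `126`, matching the number of mappings expected above.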

=== Step 2B (prep): Insert IP-to-hostname mapping in DNS ===

 * Fully populate `192.1.242.0/25` PTR entries in GPO lab DNS
 * Fully populate `exogeni.gpolab.bbn.com` A entries in GPO lab DNS
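
For reference, a PTR entry's owner name is the IP address with its octets reversed under `in-addr.arpa`. The sketch below prints one zone-file style line per address. The helper name `ptr_record` is hypothetical, the exact record layout used in GPO lab DNS is an assumption, and the pairing of `192.1.242.2` with `bbn-hn` in the usage note is only an illustration, since the real mapping comes out of Step 2A.

```shell
# Print a reverse-DNS (PTR) zone-file line for one IPv4 address.
# Usage: ptr_record IP FQDN
# The body runs in a subshell so the IFS change does not leak out.
ptr_record() (
    fqdn=$2
    # Split the dotted quad into its four octets ($1..$4).
    IFS=.
    set -- $1
    # PTR owner names list the octets in reverse order.
    printf '%s.%s.%s.%s.in-addr.arpa. IN PTR %s.\n' "$4" "$3" "$2" "$1" "$fqdn"
)
```

For example, `ptr_record 192.1.242.2 bbn-hn.exogeni.gpolab.bbn.com` prints `2.242.1.192.in-addr.arpa. IN PTR bbn-hn.exogeni.gpolab.bbn.com.`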

=== Step 2C (verify): Test all PTR records ===

'''Using:'''
 * From a BBN desktop host:
{{{
# Query the PTR record of each of the 126 usable addresses in 192.1.242.0/25
# (.0 is the network address and .127 is the broadcast address)
for lastoct in {1..126}; do
    host "192.1.242.$lastoct"
done
}}}

'''Expect:'''
 * All results look like:
{{{
$lastoct.242.1.192.in-addr.arpa domain name pointer <something reasonable>
}}}
 and none look like:
{{{
Host $lastoct.242.1.192.in-addr.arpa. not found: 3(NXDOMAIN)
}}}
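
Reading 126 results by eye is error-prone, so the check can be reduced to a list of addresses that failed. The following is a sketch only: it assumes the `host` output formats shown above, the function names are made up, and `HOST_CMD` is a hypothetical override hook that exists just so the lookup can be stubbed out when trying the script without DNS access.

```shell
# Succeed if the given `host` output contains a PTR answer,
# fail otherwise (for example on NXDOMAIN).
is_ptr_answer() {
    case $1 in
        *"domain name pointer"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query every usable address in 192.1.242.0/25 and print only the
# addresses that have no PTR record. Prints nothing if all 126 resolve.
missing_ptrs() {
    lastoct=1
    while [ "$lastoct" -le 126 ]; do
        # `host` exits non-zero on NXDOMAIN; keep going regardless.
        answer=$(${HOST_CMD:-host} "192.1.242.$lastoct" 2>&1) || true
        is_ptr_answer "$answer" || echo "192.1.242.$lastoct"
        lastoct=$((lastoct + 1))
    done
}
```

Running `missing_ptrs` from a BBN desktop host should then print nothing once Step 2B is complete. It does not judge whether each returned name is "something reasonable", which still needs a human look.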

== Step 3: GPO requests and receives administrator accounts ==

=== Step 3A: GPO requests access to head node ===

''(This is GST [gst:3354] item 2a.)''

'''Using:'''
 * Request accounts for GPO ops staffers on bbn-hn.exogeni.gpolab.bbn.com
 * Chaos tries to SSH to chaos@bbn-hn.exogeni.gpolab.bbn.com
 * Josh tries to SSH to jbs@bbn-hn.exogeni.gpolab.bbn.com
 * Tim tries to SSH to tupty@bbn-hn.exogeni.gpolab.bbn.com
 * Chaos tries to run a minimal command as sudo:
{{{
sudo whoami
}}}

'''Verify:'''
 * Logins succeed for Chaos, Josh, and Tim
 * The command works:
{{{
$ sudo whoami
root
}}}

=== Step 3B: GPO requests access to network devices ===

''(This is GST [gst:3354] item 2f.)''

'''Using:'''
 * Request accounts for GPO ops staffers on network devices 8052.bbn.xo (management) and 8264.bbn.xo (dataplane) from exogeni-ops

'''Verify:'''
 * I know which hostname or IP address to use to reach each of the 8052 and 8264 switches
 * I know where to log in to each of 8052 and 8264
 * I can successfully perform those logins at least once
 * I can successfully run a few test commands to verify enable mode:
{{{
show running-config
show mac-address-table
}}}

=== Step 3C: GPO requests access to worker nodes running under OpenStack ===

''(This is GST [gst:3354] item 2c.)''

'''Using:'''
 * From bbn-hn, try to SSH to bbn-w1
 * From bbn-hn, try to SSH to bbn-w2
 * From bbn-hn, try to SSH to bbn-w3
 * From bbn-hn, try to SSH to bbn-w4

'''Verify:'''
 * For each connection, either the connection succeeds or we can verify that the node is not an OpenStack worker.
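
The four login attempts can be run as one loop on bbn-hn. This is a rough sketch, not the procedure actually used: it assumes key-based (non-interactive) authentication to the workers, since `BatchMode=yes` makes `ssh` fail rather than prompt for a password, and `SSH_CMD` is a hypothetical override hook included only so the loop can be dry-run without network access.

```shell
# Try a non-interactive SSH login to each worker node and report the result.
# Returns the number of workers for which the login failed.
check_workers() {
    failures=0
    for node in bbn-w1 bbn-w2 bbn-w3 bbn-w4; do
        # The remote command `true` only proves that login and command
        # execution work; it changes nothing on the worker.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 "$node" true; then
            echo "$node: login ok"
        else
            echo "$node: login FAILED (check whether it is an OpenStack worker)"
            failures=$((failures + 1))
        fi
    done
    return "$failures"
}
```

A failed login here is not automatically a failed test: per the verification criterion above, the node may simply not be an OpenStack worker, and that has to be confirmed separately.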

=== Step 3D: GPO requests access to IPMI management interfaces for workers ===

''(This is GST [gst:3354] item 2b.)''

'''Using:'''
 * GPO requests VPN access to the worker node IPMI management interfaces from exogeni-ops
 * From my laptop, connect a VPN using the exogeni-vpn bundle configuration and credentials
 * Once connected, browse to each of:
   * [http://bbn-w1.bbn.xo/]
   * [http://bbn-w2.bbn.xo/]
   * [http://bbn-w3.bbn.xo/]
   * [http://bbn-w4.bbn.xo/]
   * [http://bbn-hn.bbn.xo/]

'''Verify:'''
 * VPN connection succeeds
 * Login to each IMM succeeds
 * Launching the remote console at each IMM succeeds

=== Step 3E: GPO gets access to allocated bare metal worker nodes by default ===

''(This is GST [gst:3354] item 2d.)''

'''Prerequisites:'''
 * A bare metal node is available for allocation by xCAT
 * Someone has successfully allocated the node for a bare metal experiment

'''Using:'''
 * From bbn-hn, try to SSH into root on the allocated worker node

'''Verify:'''
 * We find out the IP address/hostname at which to reach the allocated worker node
 * We find out the location of the SSH private key on bbn-hn
 * Login using this SSH key succeeds.

== Step 4: GPO inventories the rack based on our own processes ==

=== Step 4A: Inventory and label physical rack contents ===

''(This covers GST [gst:3354] items 3 and 7.)''

'''Using:'''
 * Enumerate all physical objects in the rack
 * Use [https://wiki.exogeni.net/doku.php?id=public:hardware:rack_layout] to determine the name of each object
 * If any objects can't be found there, compare to [wiki:ChaosSandbox/ExogeniRackNotes], and iterate with RENCI
 * Physically label each device in the rack with its name on front and back
 * Inventory all hardware details for rack contents on OpsHardwareInventory
 * Add an ASCII rack diagram to OpsHardwareInventory

'''Verify:'''
 * [https://wiki.exogeni.net/doku.php?id=public:hardware:rack_layout] contains all devices in the rack
 * There is a public parts list which matches the parts we received
 * We succeed in labelling the devices and adding their hardware details and locations to our inventory

=== Step 4B: Inventory rack power requirements ===

'''Using:'''
 * Add rack circuit information to OpsPowerConnectionInventory

'''Verify:'''
 * We succeed in locating and documenting information about rack power circuits in use

=== Step 4C: Inventory rack network connections ===

'''Using:'''
 * Add all rack Ethernet and fiber connections and their VLAN configurations to OpsConnectionInventory
 * Add static rack OpenFlow datapath information to OpsDpidInventory

'''Verify:'''
 * We are able to identify and determine all rack network connections and VLAN configurations
 * We are able to determine the OpenFlow configuration of the rack dataplane switch

=== Step 4D: Verify government property accounting for the rack ===

''(This is GST [gst:3354] item 11.)''

'''Using:'''
 * Receive a completed DD1149 form from RENCI
 * Receive and inventory a property tag number for the BBN ExoGENI rack

'''Verify:'''
 * The DD1149 paperwork is complete to BBN government property standards
 * We receive a single property tag for the rack, as expected

== Step 5: Configure operational alerting for the rack ==

=== Step 5A: GPO installs active control network monitoring ===

''(This is GST [gst:3354] item 8.)''

'''Using:'''
 * Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.2
 * Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.3
 * Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.4

'''Verify:'''
 * Active monitoring of the control network is successful
 * Each monitored IP is successfully available at least once

=== Step 5B: GPO installs active shared dataplane monitoring ===

''(This is GST [gst:3354] item 9.)''

'''Using:'''
 * Add a monitored dataplane network ping from a lab dataplane test host on VLAN 1750 to the rack dataplane
 * If necessary, add an OpenFlow controller to handle traffic for the monitoring subnet

'''Verify:'''
 * Active monitoring of the dataplane network is successful
 * The monitored IP is successfully available at least once

=== Step 5C: GPO gets access to Nagios information about the BBN rack ===

''(This is part of GST [gst:3354] item 10.)''

'''Using:'''
 * Browse to [https://bbn-hn.exogeni.net/rack_bbn/]
 * Log in using LDAP credentials

'''Verify:'''
 * Login succeeds
 * I can see a number of types of devices
 * I can click on a problem report and verify its details

=== Step 5D: GPO receives e-mail about BBN rack Nagios alerts ===

''(This is part of GST [gst:3354] item 10.)''

'''Using:'''
 * Request e-mail notifications for BBN rack Nagios to be sent to GPO ops
 * Collect a number of notifications
 * Inspect three representative messages

'''Verify:'''
 * E-mail messages about rack Nagios are received
 * For each inspected message, I can determine:
   * The affected device
   * The affected service
   * The type of problem being reported
   * The duration of the outage

== Step 6: Set up contact info and change control procedures ==

=== Step 6A: ExoGENI operations staff should subscribe to response-team ===

''(This is part of GST [gst:3354] item 12.)''

'''Using:'''
 * Ask ExoGENI operators to subscribe `exogeni-ops@renci.org` (or individual operators) to `response-team@geni.net`

'''Verify:'''
 * This subscription has happened.  On daulis:
{{{
sudo -u mailman /usr/lib/mailman/bin/find_member -l response-team exogeni-ops
}}}

=== Step 6B: ExoGENI operations staff should provide contact info to GMOC ===

''(This is part of GST [gst:3354] item 12.)''

'''Using:'''
 * Ask ExoGENI operators to submit primary and secondary e-mail and phone contact information to GMOC

'''Verify:'''
 * Browse to [https://gmoc-db.grnoc.iu.edu/protected/], log in, and look at the "organizations" table.  Make sure either:
   * The RENCI contact information is up-to-date and includes exogeni-ops and some reasonable phone numbers
   * A new ExoGENI contact has been added

=== Step 6C: Negotiate an interim change control notification procedure ===

''(This is GST [gst:3354] item 6.)''

'''Using:'''
 * Ask ExoGENI operators to notify either exogeni-design@geni.net or gpo-infra@geni.net about planned outages and changes.

'''Verify:'''
 * ExoGENI agrees to send notifications about planned outages and changes.
