Changes between Initial Version and Version 1 of GENIRacksHome/InstageniRacks/AcceptanceTestStatus/IG-ADM-1


Timestamp: 05/07/12 19:15:01
Author: chaos@bbn.com
     1[[PageOutline]]
     2
     3= Detailed test plan for IG-ADM-1: Rack Receipt and Inventory Test =
     4
     5''This page is GPO's working page for performing IG-ADM-1.  It is public for informational purposes, but it is not an official status report.  See [wiki:GENIRacksHome/InstageniRacks/AcceptanceTestStatus] for the current status of InstaGENI acceptance tests.''
     6
     7''Last substantive edit of this page: 2012-05-07''
     8
     9== Page format ==
     10
     11 * The status chart summarizes the state of this test
     12 * The high-level description from test plan contains text copied exactly from the public test plan and acceptance criteria pages.
 * The steps contain things I will actually do or verify:
   * Steps may be composed of related substeps where I find this useful for clarity
     15   * Each step is identified as either "(prep)" or "(verify)":
     16     * Prep steps are just things we have to do.  They're not tests of the rack, but are prerequisites for subsequent verification steps
     17     * Verify steps are steps in which we will actually look at rack output and make sure it is as expected.  They contain a '''Using:''' block, which lists the steps to run the verification, and an '''Expect:''' block which lists what outcome is expected for the test to pass.
     18
     19== Status of test ==
     20
     21See [wiki:GENIRacksHome/InstageniRacks/AcceptanceTestStatus] for the meanings of test states.
     22
     23''Note: all steps of this test are blocked on the arrival of the BBN rack at BBN.  However, we plan to do preliminary testing of some steps using the rack at Utah.  For the time being, we need to differentiate between steps which are blocked until the BBN rack arrives, and steps which may be blocked from testing at Utah by some shorter-term requirement, as follows:''
     24 * [[Color(yellow,Blocked-site)]]: A step which will not be tested on the Utah rack, and is blocked until the BBN site rack arrives.
     25 * [[Color(orange,Blocked-Utah)]]: A step which will be tested on the Utah rack, and is blocked on a requirement for access or configuration of the Utah rack.
     26
     27|| '''Step''' || '''State'''                    || '''Date completed''' || '''Tickets''' || '''Comments'''                                                                 ||
     28|| 1          || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on purchase and shipping of BBN rack                                   ||
     29|| 2A         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 1                                                                   ||
     30|| 2B         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 2A                                                                  ||
     31|| 2C         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 2B                                                                  ||
     32|| 3A         ||                                ||                      ||               || ready for preliminary testing using the Utah rack                              ||
     33|| 3B         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to the FOAM VM                                           ||
     34|| 3C         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to the infrastructure VM server node                     ||
     35|| 3D         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to the control and dataplane switches                    ||
     36|| 3E         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to OpenVZ (maybe just on sudo access to boss?)           ||
     37|| 3F         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to rack iLO                                              ||
     38|| 3G         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO sudo access on rack boss                                        ||
     39|| 4A         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 1                                                                   ||
     40|| 4B         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 1                                                                   ||
     41|| 4C         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on GPO access to control and dataplane switches                        ||
     42|| 4D         || [[Color(yellow,Blocked-site)]] ||                      ||               || blocked on 1                                                                   ||
     43|| 5A         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on public IPs/hostnames for Utah infrastructure devices                ||
     44|| 5B         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on a rack dataplane vlan being connected to the GENI mesoscale network ||
     45|| 5C         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on information about rack monitoring solution                          ||
     46|| 5D         || [[Color(orange,Blocked-Utah)]] ||                      ||               || blocked on information about rack monitoring solution                          ||
     47|| 6A         ||                                ||                      ||               ||                                                                                ||
     48|| 6B         ||                                ||                      ||               ||                                                                                ||
     49|| 6C         ||                                ||                      ||               ||                                                                                ||
     50
     51== High-level description from test plan ==
     52
     53This "test" uses BBN as an example site by verifying that we can do all the things we need to do to integrate the rack into our standard local procedures for systems we host.
     54
     55=== Procedure ===
     56
     57 * InstaGENI and GPO power and wire the BBN rack
     58 * GPO configures the instageni.gpolab.bbn.com DNS namespace and 192.1.242.128/25 IP space, and enters all public IP addresses used by the rack into DNS.
     59 * GPO requests and receives administrator accounts on the rack and GPO sysadmins receive read access to all InstaGENI monitoring of the rack.
     60 * GPO inventories the physical rack contents, network connections and VLAN configuration, and power connectivity, using our standard operational inventories.
     61 * GPO, InstaGENI, and GMOC share information about contact information and change control procedures, and InstaGENI operators subscribe to GENI operations mailing lists and submit their contact information to GMOC.
     62 * GPO reviews the documented parts list, power requirements, physical and logical network connectivity requirements, and site administrator community requirements, verifying that these documents should be sufficient for a new site to use when setting up a rack.
     63
     64=== Criteria to verify as part of this test ===
     65
     66 * VI.02. A public document contains a parts list for each rack. (F.1)
     67 * VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
     68 * VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
     69 * VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
     70 * VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
     71 * VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with Legal, Law Enforcement and Regulatory(LLR) Plan, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
     72 * VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
     73 * VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
     74 * VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)
     75
     76== Step 1 (prep): InstaGENI and GPO power and wire the BBN rack ==
     77
     78This step covers the physical delivery of the rack to BBN, the transport of the rack inside BBN to the GPO lab, and the cabling, powering, and initial configuration of the rack.
     79
     80== Step 2: Configure and verify DNS ==
     81
     82=== Step 2A (verify): Find out what IP-to-hostname mapping to use ===
     83
     84'''Using:'''
     85 * If the rack IP requirements documentation for the rack exists:
     86   * Review that documentation and determine what IP to hostname mappings should exist for `192.1.242.128/25`
     87 * Otherwise:
     88   * Iterate with `instageni-ops` to determine the IP to hostname mappings to use for `192.1.242.128/25`
     89
     90'''Expect:'''
     91 * Reasonable IP-to-hostname mappings for 126 valid IPs allocated for InstaGENI use in `192.1.242.128/25`
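
The figure of 126 follows from the prefix length: a /25 leaves 7 host bits, giving 128 addresses, of which the network address (.128) and the broadcast address (.255) are not assignable to hosts. A minimal sketch of that arithmetic, assuming a POSIX shell; the function name `usable_hosts` is illustrative and not part of any rack tooling:

```shell
# Number of usable host addresses in an IPv4 subnet with the given prefix
# length. Meaningful for /1 through /30: the network and broadcast
# addresses are subtracted from the total.
usable_hosts() {
    prefix=$1
    echo $(( (1 << (32 - $prefix)) - 2 ))
}

# 192.1.242.128/25 covers last octets 128-255; hosts are 129 through 254.
usable_hosts 25
```

For the rack subnet this prints 126, matching the count in the '''Expect:''' block above.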
     92
     93=== Step 2B (prep): Insert IP-to-hostname mapping in DNS ===
     94
     95 * Fully populate `192.1.242.128/25` PTR entries in GPO lab DNS
     96 * Fully populate `instageni.gpolab.bbn.com` A entries in GPO lab DNS
     97
     98=== Step 2C (verify): Test all PTR records ===
     99
     100'''Using:'''
     101 * From a BBN desktop host:
{{{
# .255 is the broadcast address of 192.1.242.128/25, so usable hosts end at .254
for lastoct in {129..254}; do
  host 192.1.242.$lastoct
done
}}}
     107
     108'''Expect:'''
 * All results look like:
{{{
$lastoct.242.1.192.in-addr.arpa domain name pointer <something reasonable>
}}}
 and none look like:
{{{
Host $lastoct.242.1.192.in-addr.arpa. not found: 3(NXDOMAIN)
}}}
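
Scanning 126 lines of output by eye is error-prone, so the check can be reduced to listing only the failures. The following is a sketch, not part of the test plan: the function name `missing_ptrs` is hypothetical, it assumes `host` writes both kinds of line to standard output in the formats shown above, and the sample input uses a placeholder hostname under `example.invalid`.

```shell
# Read `host` output on stdin and print the reverse names that did not
# resolve. Exit status is 0 if every lookup resolved, 1 otherwise.
missing_ptrs() {
    awk 'BEGIN { bad = 0 } /not found/ { print $2; bad = 1 } END { exit bad }'
}

# Demonstration on two sample lines (placeholder data, not real records):
printf '%s\n' \
    '129.242.1.192.in-addr.arpa domain name pointer host-129.example.invalid.' \
    'Host 130.242.1.192.in-addr.arpa. not found: 3(NXDOMAIN)' \
    | missing_ptrs || echo "some PTR records are missing"
```

In real use, the output of the lookup loop from the '''Using:''' block would be piped into `missing_ptrs`; an empty result with exit status 0 means the step passes.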
     117
     118== Step 3: GPO requests and receives administrator accounts ==
     119
     120=== Step 3A: GPO requests access to boss and ops nodes ===
     121
     122'''Using:'''
     123 * Request accounts for GPO ops staffers on boss.instageni.gpolab.bbn.com and ops.instageni.gpolab.bbn.com
     124 * Chaos tries to SSH to chaos@boss.instageni.gpolab.bbn.com
     125 * Josh tries to SSH to jbs@boss.instageni.gpolab.bbn.com
     126 * Tim tries to SSH to tupty@boss.instageni.gpolab.bbn.com
     127 * Chaos tries to SSH to chaos@ops.instageni.gpolab.bbn.com
     128 * Josh tries to SSH to jbs@ops.instageni.gpolab.bbn.com
     129 * Tim tries to SSH to tupty@ops.instageni.gpolab.bbn.com
 * Chaos tries to run a minimal command via sudo on boss:
{{{
sudo whoami
}}}
 * Chaos tries to run a minimal command via sudo on ops:
{{{
sudo whoami
}}}
     138
     139'''Verify:'''
     140 * Logins succeed for Chaos, Josh, and Tim on both nodes
     141 * The commands work:
{{{
$ sudo whoami
root
}}}
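
The six login attempts listed under '''Using:''' form a simple user-by-node matrix. The sketch below only prints the corresponding `ssh` commands; it does not run them. It assumes key-based authentication (`BatchMode=yes` makes `ssh` fail rather than prompt for a password), and the function name `login_checks` is illustrative. In practice each staffer would run only the lines for their own account.

```shell
DOMAIN=instageni.gpolab.bbn.com

# Print one ssh command per (node, user) pair from Step 3A.
# `true` is run remotely so a successful login exits immediately.
login_checks() {
    for node in boss ops; do
        for user in chaos jbs tupty; do
            echo "ssh -o BatchMode=yes -o ConnectTimeout=10 $user@$node.$DOMAIN true"
        done
    done
}

login_checks
```

An exit status of 0 from one of the printed commands corresponds to "login succeeds" for that user and node.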
     146
     147=== Step 3B: GPO requests access to FOAM VM ===
     148
'''Using:'''
 * Request accounts for GPO ops staffers on foam.instageni.gpolab.bbn.com
 * Chaos tries to SSH to chaos@foam.instageni.gpolab.bbn.com
 * Josh tries to SSH to jbs@foam.instageni.gpolab.bbn.com
 * Tim tries to SSH to tupty@foam.instageni.gpolab.bbn.com
 * Chaos tries to run a minimal command via sudo on foam:
{{{
sudo whoami
}}}
     157
     158'''Verify:'''
     159 * Logins succeed for Chaos, Josh, and Tim on the FOAM VM
     160 * The command works:
{{{
$ sudo whoami
root
}}}
     165
     166=== Step 3C: GPO requests access to infrastructure server ===
     167
'''Using:'''
 * Request accounts for GPO ops staffers on the VM server node which runs boss, ops, and foam
 * Chaos tries to SSH to the VM server node
 * Josh tries to SSH to the VM server node
 * Tim tries to SSH to the VM server node
 * Chaos tries to run a minimal command via sudo on the VM server node:
{{{
sudo whoami
}}}
     176
     177'''Verify:'''
     178 * Logins succeed for Chaos, Josh, and Tim on the host
     179 * The command works:
{{{
$ sudo whoami
root
}}}
     184
     185=== Step 3D: GPO requests access to network devices ===
     186
     187'''Using:'''
     188 * Request accounts for GPO ops staffers on the InstaGENI rack control and dataplane network devices from instageni-ops
     189
     190'''Verify:'''
 * I know what hostname or IP address to log in to in order to reach each of the control and dataplane switches
 * I know what source IPs are allowed to remotely access the control and dataplane switches via SSH
 * I can successfully perform those logins at least once
 * I can successfully run a few test commands to verify enable mode:
{{{
show running-config
show mac-address-table
}}}
     199
     200=== Step 3E: GPO requests access to shared OpenVZ nodes ===
     201
     202'''Using:'''
     203 * Determine an experimental host which is currently configured as a shared OpenVZ node
     204 * From boss.instageni.gpolab.bbn.com, try to SSH to the node
 * On the node, try to run a minimal command via sudo:
{{{
sudo whoami
}}}
     209
     210'''Verify:'''
     211 * Login to the OpenVZ host is successful
     212 * Access to the node is as root and/or it is possible to run a command via sudo
     213
     214=== Step 3F: GPO requests access to iLO remote management interfaces for experimental nodes ===
     215
     216'''Using:'''
     217 * GPO requests access to the experimental node iLO management interfaces from instageni-ops
     218 * Determine how to use these interfaces to access remote control and remote console interfaces for experimental nodes
     219 * For each experimental node in the BBN rack:
     220   * Access the iLO interface and view status information
     221   * View the interface for remotely power-cycling the node
     222   * Launch the remote console for the node
     223
     224'''Verify:'''
     225 * GPO is able to determine the procedure for accessing the iLO interfaces
     226 * Login to each iLO succeeds
     227 * The remote power-cycle interface exists and appears to be usable (don't try power-cycling any nodes during this test)
     228 * Launching the remote console at each iLO succeeds
     229
     230=== Step 3G: GPO gets access to allocated bare metal nodes by default ===
     231
     232'''Prerequisites:'''
     233 * A bare metal node is available for allocation by InstaGENI
     234 * Someone has successfully allocated the node for a bare metal experiment
     235
     236'''Using:'''
     237 * From boss, try to SSH into root on the allocated worker node
     238
     239'''Verify:'''
     240 * We find out the IP address/hostname at which to reach the allocated worker node
     241 * Login to the node using root's SSH key succeeds
     242
     243== Step 4: GPO inventories the rack based on our own processes ==
     244
     245=== Step 4A: Inventory and label physical rack contents ===
     246
     247'''Using:'''
     248 * Enumerate all physical objects in the rack
     249 * Use available rack documentation to determine the correct name of each object
     250 * If any objects can't be found in public documentation, compare to internal notes, and iterate with InstaGENI
     251 * Physically label each device in the rack with its name on front and back
     252 * Inventory all hardware details for rack contents on OpsHardwareInventory
 * Add an ASCII rack diagram to OpsHardwareInventory
     254
     255'''Verify:'''
     256 * Public documentation and/or rack diagrams identify all rack objects
     257 * There is a public parts list which matches the parts we received
     258 * We succeed in labelling the devices and adding their hardware details and locations to our inventory
     259
     260=== Step 4B: Inventory rack power requirements ===
     261
     262'''Using:'''
     263 * Add rack circuit information to OpsPowerConnectionInventory
     264
     265'''Verify:'''
     266 * We succeed in locating and documenting information about rack power circuits in use
     267
     268=== Step 4C: Inventory rack network connections ===
     269
     270'''Using:'''
 * Add all rack Ethernet and fiber connections and their VLAN configurations to OpsConnectionInventory
     272 * Add static rack OpenFlow datapath information to OpsDpidInventory
     273
     274'''Verify:'''
     275 * We are able to identify and determine all rack network connections and VLAN configurations
     276 * We are able to determine the OpenFlow configuration of the rack dataplane switch
     277
     278=== Step 4D: Verify government property accounting for the rack ===
     279
     280'''Using:'''
     281 * Receive a completed DD1149 form from InstaGENI
     282 * Receive and inventory a property tag number for the BBN InstaGENI rack
     283
     284'''Verify:'''
     285 * The DD1149 paperwork is complete to BBN government property standards
     286 * We receive a single property tag for the rack, as expected
     287
     288== Step 5: Configure operational alerting for the rack ==
     289
     290=== Step 5A: GPO installs active control network monitoring ===
     291
     292'''Using:'''
     293 * Add a monitored control network ping from ilian.gpolab.bbn.com to the boss VM
     294 * Add a monitored control network ping from ilian.gpolab.bbn.com to the ops VM
     295 * Add a monitored control network ping from ilian.gpolab.bbn.com to the foam VM
     296 * Add a monitored control network ping from ilian.gpolab.bbn.com to the infrastructure VM host
     297 * Add a monitored control network ping from ilian.gpolab.bbn.com to the control switch's management IP
     298 * Add a monitored control network ping from ilian.gpolab.bbn.com to the dataplane switch's management IP
     299
     300'''Verify:'''
     301 * Active monitoring of the control network is successful
 * Each monitored IP is successfully reached at least once
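
Before the monitored pings are configured, a one-shot manual check can confirm that each target answers at all. This is a sketch only and is not the monitoring system itself: the function name `check_reachable` and the `PING` override variable are hypothetical, and the switch and VM host addresses are not listed here because they are not yet known.

```shell
# PING may be overridden (for example with a stub) to test the loop itself.
PING=${PING:-ping}

# Send one echo request to each host given as an argument, print one line
# per host, and return non-zero if any host did not answer.
check_reachable() {
    failed=0
    for target in "$@"; do
        if "$PING" -c 1 "$target" >/dev/null 2>&1; then
            echo "OK   $target"
        else
            echo "FAIL $target"
            failed=1
        fi
    done
    return "$failed"
}

# Example (needs control network access, so it is left commented out):
# check_reachable boss.instageni.gpolab.bbn.com \
#                 ops.instageni.gpolab.bbn.com \
#                 foam.instageni.gpolab.bbn.com
```

A target that prints `OK` at least once satisfies the second '''Verify:''' item for that IP.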
     303
     304=== Step 5B: GPO installs active shared dataplane monitoring ===
     305
     306'''Using:'''
 * Add a monitored dataplane network ping from a lab dataplane test host on VLAN 1750 to the rack dataplane
 * If necessary, add an OpenFlow controller to handle traffic for the monitoring subnet
     309
     310'''Verify:'''
     311 * Active monitoring of the dataplane network is successful
     312 * The monitored IP is successfully available at least once
     313
     314=== Step 5C: GPO gets access to monitoring information about the BBN rack ===
     315
     316'''Using:'''
     317 * GPO determines what monitoring tool InstaGENI will make available for site administrators
     318 * GPO successfully accesses and views status data about the BBN rack
     319
     320'''Verify:'''
     321 * I can see general data about all devices in the BBN rack
     322 * I can see detailed information about any services checked
     323
     324=== Step 5D: GPO receives e-mail about BBN rack alerts ===
     325
     326'''Using:'''
     327 * Request e-mail notifications for BBN rack problems to be sent to GPO ops
     328 * Collect a number of notifications
     329 * Inspect three representative messages
     330
     331'''Verify:'''
     332 * E-mail messages about rack problems are received
 * For each inspected message, I can determine:
     334   * The affected device
     335   * The affected service
     336   * The type of problem being reported
     337   * The duration of the outage
     338
== Step 6: Set up contact info and change control procedures ==
     340
     341=== Step 6A: InstaGENI operations staff should subscribe to response-team ===
     342
     343'''Using:'''
     344 * Ask InstaGENI operators to subscribe `instageni-ops@flux.utah.edu` (or individual operators) to `response-team@geni.net`
     345
     346'''Verify:'''
 * This subscription has happened.  On daulis:
{{{
sudo -u mailman /usr/lib/mailman/bin/find_member -l response-team utah.edu
}}}
     351
     352=== Step 6B: InstaGENI operations staff should provide contact info to GMOC ===
     353
     354'''Using:'''
     355 * Ask InstaGENI operators to submit primary and secondary e-mail and phone contact information to GMOC
     356
     357'''Verify:'''
     358 * Browse to [https://gmoc-db.grnoc.iu.edu/protected/], login, and look at the "organizations" table.  Make sure either:
     359   * The Utah contact information is up-to-date and includes instageni-ops and some reasonable phone numbers
     360   * A new InstaGENI contact has been added
     361
     362=== Step 6C: Negotiate an interim change control notification procedure ===
     363
     364'''Using:'''
     365 * Ask InstaGENI operators to notify either instageni-design@geni.net or gpo-infra@geni.net about planned outages and changes.
     366
     367'''Verify:'''
     368 * InstaGENI agrees to send notifications about planned outages and changes.
     369