Detailed test plan for EG-ADM-1: Rack Receipt and Inventory Test
- Page format
- Status of test
- High-level description from test plan
- Step 1 (prep): ExoGENI and GPO power and wire the BBN rack
- Step 2: Configure and verify DNS
- Step 3: GPO requests and receives administrator accounts
- Step 3A: GPO requests access to head node
- Step 3B: GPO requests access to network devices
- Step 3C: GPO requests access to worker nodes running under OpenStack
- Step 3D: GPO requests access to IPMI management interfaces for workers
- Step 3E: GPO gets access to allocated bare metal worker nodes by default
- Step 4: GPO inventories the rack based on our own processes
- Step 5: Configure operational alerting for the rack
- Step 6: Set up contact info and change control procedures
Detailed test plan for EG-ADM-1: Rack Receipt and Inventory Test
This page is GPO's working page for performing EG-ADM-1. It is public for informational purposes, but it is not an official status report. See GENIRacksHome/ExogeniRacks/AcceptanceTestStatus for the current status of ExoGENI acceptance tests.
Last substantive edit of this page: 2012-05-04
Page format
- The status chart summarizes the state of this test
- The high-level description from test plan contains text copied exactly from the public test plan and acceptance criteria pages.
- The steps contain things I will actually do/verify:
- Steps may be composed of related substeps where I find this useful for clarity
- Each step is identified as either "(prep)" or "(verify)":
- Prep steps are just things we have to do. They're not tests of the rack, but are prerequisites for subsequent verification steps
- Verify steps are steps in which we will actually look at rack output and make sure it is as expected. They contain a Using: block, which lists the steps to run the verification, and an Expect: block which lists what outcome is expected for the test to pass.
Status of test
Meaning of states:
- okay (green): Step is completed and passed (for a verification step), or is completed (for a prep step)
- failed (red): Step is completed and failed, and is not being revisited
- in progress: We are currently testing or iterating on this step
- waiting (orange): Step is blocked by some other step or activity
Step | State | Date completed | Tickets | Comments |
1 | Pass (green) | 2012-02-24 | | |
2A | Blocked (orange) | | exoticket:11 | blocked on a full IP-to-hostname mapping for the subnet |
2B | Blocked (orange) | | | blocked on 2A |
2C | Blocked (orange) | | | blocked on 2B |
3A | ready to test | | | |
3B | Blocked (orange) | | exoticket:10 | blocked on access to rack switches |
3C | ready to test | | | |
3D | ready to test | | | |
3E | Blocked (orange) | | | blocked on a working bare metal node implementation (vjo) |
4A | ready to test | | | |
4B | ready to test | | | |
4C | ready to test | | | |
4D | Blocked (orange) | | exoticket:12 | blocked on government property form |
5A | ready to test | | | |
5B | | | | Tim is working with ExoGENI to get the VLAN 1750 interface set up |
5C | ready to test | | | |
5D | ready to test | | | |
6A | ready to test | | | |
6B | ready to test | | | |
6C | ready to test | | | |
High-level description from test plan
This "test" uses BBN as an example site by verifying that we can do all the things we need to do to integrate the rack into our standard local procedures for systems we host.
Procedure
- ExoGENI and GPO power and wire the BBN rack
- GPO configures the exogeni.gpolab.bbn.com DNS namespace and 192.1.242.0/25 IP space, and enters all public IP addresses for the BBN rack into DNS.
- GPO requests and receives administrator accounts on the rack and read access to ExoGENI Nagios for GPO sysadmins.
- GPO inventories the physical rack contents, network connections and VLAN configuration, and power connectivity, using our standard operational inventories.
- GPO, ExoGENI, and GMOC share information about contact information and change control procedures, and ExoGENI operators subscribe to GENI operations mailing lists and submit their contact information to GMOC.
Criteria to verify as part of this test
- VI.02. A public document contains a parts list for each rack. (F.1)
- VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
- VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
- VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
- VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
- VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with Legal, Law Enforcement and Regulatory(LLR) Plan, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
- VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
- VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
- VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)
Step 1 (prep): ExoGENI and GPO power and wire the BBN rack
This was done on 2012-02-23 and 2012-02-24, and Chaos took rough notes at ChaosSandbox/ExogeniRackNotes.
Step 2: Configure and verify DNS
(This is GST 3354 item 5.)
Step 2A (verify): Find out what IP-to-hostname mapping to use
Using:
- If the rack IP requirements documentation for the rack exists:
  - Review that documentation and determine what IP-to-hostname mappings should exist for 192.1.242.0/25
- Otherwise:
  - Iterate with exogeni-ops to determine the IP-to-hostname mappings to use for 192.1.242.0/25
Expect:
- Reasonable IP-to-hostname mappings for the 126 valid IPs allocated for ExoGENI use in 192.1.242.0/25
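The figure of 126 valid IPs follows from the /25 prefix: 128 addresses, minus the network and broadcast addresses. A minimal sketch of that arithmetic (the function name `usable_hosts` is made up for illustration, and the formula only applies to prefixes of /30 or shorter):

```shell
# usable_hosts PREFIX_LEN: print the number of assignable IPv4 host
# addresses in a subnet, i.e. all addresses minus network and broadcast.
# Only meaningful for prefix lengths of 30 or less.
usable_hosts() {
    prefix_len=$1
    echo $(( (1 << (32 - prefix_len)) - 2 ))
}

usable_hosts 25   # prints 126
```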
Step 2B (prep): Insert IP-to-hostname mapping in DNS
- Fully populate 192.1.242.0/25 PTR entries in GPO lab DNS
- Fully populate exogeni.gpolab.bbn.com A entries in GPO lab DNS
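Once a mapping exists, the reverse entries could be generated rather than typed by hand. The sketch below assumes the mapping is kept as plain "IP hostname" lines and that BIND-style zone syntax is wanted; both are assumptions, as are the function name `ptr_records` and the hostname in the example, since this page does not say how GPO lab DNS is actually maintained.

```shell
# ptr_records: read "IP hostname" pairs on stdin and print one
# BIND-style PTR record per pair (octets reversed under in-addr.arpa).
# Lines with fewer than two fields are skipped.
ptr_records() {
    awk 'NF >= 2 {
        split($1, o, ".")
        printf "%s.%s.%s.%s.in-addr.arpa. IN PTR %s.\n", o[4], o[3], o[2], o[1], $2
    }'
}

# Example (hypothetical hostname):
#   printf '192.1.242.2 example-host.exogeni.gpolab.bbn.com\n' | ptr_records
```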
Step 2C (verify): Test all PTR records
Using:
- From a BBN desktop host:
for lastoct in {1..126}; do
  host 192.1.242.$lastoct
done
Expect:
- All results look like:
$lastoct.242.1.192.in-addr.arpa domain name pointer <something reasonable>
and none look like:
Host $lastoct.242.1.192.in-addr.arpa. not found: 3(NXDOMAIN)
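Reading 126 lines of output by eye is error-prone, so the same check can be sketched as a function that prints only the addresses lacking a PTR record. It keys on the "domain name pointer" text shown above; the function name `check_ptr_range` and the "MISSING:" output format are made up for illustration.

```shell
# check_ptr_range PREFIX FIRST LAST: look up PREFIX.N for each N in
# FIRST..LAST and print every address whose lookup does not return a
# "domain name pointer" line. Returns non-zero if any are missing.
check_ptr_range() {
    prefix=$1
    n=$2
    last=$3
    missing=0
    while [ "$n" -le "$last" ]; do
        if ! host "$prefix.$n" 2>/dev/null | grep -q 'domain name pointer'; then
            echo "MISSING: $prefix.$n"
            missing=$((missing + 1))
        fi
        n=$((n + 1))
    done
    [ "$missing" -eq 0 ]
}

# Example: check_ptr_range 192.1.242 1 126
```

This only confirms that a PTR record exists; whether each name is "something reasonable" still needs a human look.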
Step 3: GPO requests and receives administrator accounts
Step 3A: GPO requests access to head node
(This is GST 3354 item 2a.)
Using:
- Request accounts for GPO ops staffers on bbn-hn.exogeni.gpolab.bbn.com
- Chaos tries to SSH to chaos@bbn-hn.exogeni.gpolab.bbn.com
- Josh tries to SSH to jbs@bbn-hn.exogeni.gpolab.bbn.com
- Tim tries to SSH to tupty@bbn-hn.exogeni.gpolab.bbn.com
- Chaos tries to run a minimal command as sudo:
sudo whoami
Verify:
- Logins succeed for Chaos, Josh, and Tim
- The command works:
$ sudo whoami
root
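If key-based login is already set up, the login and sudo checks can be combined into one non-interactive test. This is a sketch under assumptions: the function name `has_root_sudo` is hypothetical, and `sudo -n` fails rather than prompting, so an account whose sudo requires a password would be reported as failing even though an interactive `sudo whoami` works.

```shell
# has_root_sudo TARGET: succeed only if "sudo whoami" run over SSH on
# TARGET (user@host) prints "root". BatchMode disables password prompts,
# so the check fails fast instead of hanging.
has_root_sudo() {
    target=$1
    result=$(ssh -o BatchMode=yes -o ConnectTimeout=10 "$target" 'sudo -n whoami' 2>/dev/null) || return 1
    [ "$result" = "root" ]
}

# Example: has_root_sudo chaos@bbn-hn.exogeni.gpolab.bbn.com && echo "sudo OK"
```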
Step 3B: GPO requests access to network devices
(This is GST 3354 item 2f.)
Using:
- Request accounts for GPO ops staffers on network devices 8052.bbn.xo (management) and 8264.bbn.xo (dataplane) from exogeni-ops
Verify:
- I know which hostname or IP address to use to reach each of the 8052 and 8264 switches
- I know where to log in to each of 8052 and 8264
- I can successfully perform those logins at least once
- I can successfully run a few test commands to verify enable mode:
show running-config
show mac-address-table
Step 3C: GPO requests access to worker nodes running under OpenStack
(This is GST 3354 item 2c.)
Using:
- From bbn-hn, try to SSH to bbn-w1
- From bbn-hn, try to SSH to bbn-w2
- From bbn-hn, try to SSH to bbn-w3
- From bbn-hn, try to SSH to bbn-w4
Verify:
- For each connection, either the connection succeeds or we can verify that the node is not an OpenStack worker.
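The four login attempts can be sketched as a single loop run on bbn-hn. The function name `check_workers` and the OK/FAIL output format are made up for illustration, and the sketch assumes key-based SSH from the head node to the workers. A FAIL line does not by itself fail the step; it marks a node to follow up on and confirm whether it is an OpenStack worker.

```shell
# check_workers HOST...: try a non-interactive SSH login to each host and
# print "OK" or "FAIL" next to its name. Returns non-zero if any failed.
check_workers() {
    failed=0
    for w in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=10 "$w" true </dev/null >/dev/null 2>&1; then
            echo "OK   $w"
        else
            echo "FAIL $w"
            failed=1
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example (run on bbn-hn): check_workers bbn-w1 bbn-w2 bbn-w3 bbn-w4
```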
Step 3D: GPO requests access to IPMI management interfaces for workers
(This is GST 3354 item 2b.)
Using:
- GPO requests VPN access to the worker node IPMI management interfaces from exogeni-ops
- From my laptop, connect a VPN using the exogeni-vpn bundle configuration and credentials
- Once connected, browse to each of:
Verify:
- VPN connection succeeds
- Login to each IMM succeeds
- Launching the remote console at each IMM succeeds
Step 3E: GPO gets access to allocated bare metal worker nodes by default
(This is GST 3354 item 2d.)
Prerequisites:
- A bare metal node is available for allocation by xCAT
- Someone has successfully allocated the node for a bare metal experiment
Using:
- From bbn-hn, try to SSH into root on the allocated worker node
Verify:
- We find out the IP address/hostname at which to reach the allocated worker node
- We find out the location of the SSH private key on bbn-hn
- Login using this SSH key succeeds.
Step 4: GPO inventories the rack based on our own processes
Step 4A: Inventory and label physical rack contents
(This covers GST 3354 items 3 and 7.)
Using:
- Enumerate all physical objects in the rack
- Use https://wiki.exogeni.net/doku.php?id=public:hardware:rack_layout to determine the name of each object
- If any objects can't be found there, compare to ChaosSandbox/ExogeniRackNotes, and iterate with RENCI
- Physically label each device in the rack with its name on front and back
- Inventory all hardware details for rack contents on OpsHardwareInventory
- Add an ASCII rack diagram to OpsHardwareInventory
Verify:
- https://wiki.exogeni.net/doku.php?id=public:hardware:rack_layout contains all devices in the rack
- There is a public parts list which matches the parts we received
- We succeed in labelling the devices and adding their hardware details and locations to our inventory
Step 4B: Inventory rack power requirements
Using:
- Add rack circuit information to OpsPowerConnectionInventory
Verify:
- We succeed in locating and documenting information about rack power circuits in use
Step 4C: Inventory rack network connections
Using:
- Add all rack ethernet and fiber connections and their VLAN configurations to OpsConnectionInventory
- Add static rack OpenFlow datapath information to OpsDpidInventory
Verify:
- We are able to identify and determine all rack network connections and VLAN configurations
- We are able to determine the OpenFlow configuration of the rack dataplane switch
Step 4D: Verify government property accounting for the rack
(This is GST 3354 item 11.)
Using:
- Receive a completed DD1149 form from RENCI
- Receive and inventory a property tag number for the BBN ExoGENI rack
Verify:
- The DD1149 paperwork is complete to BBN government property standards
- We receive a single property tag for the rack, as expected
Step 5: Configure operational alerting for the rack
Step 5A: GPO installs active control network monitoring
(This is GST 3354 item 8.)
Using:
- Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.2
- Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.3
- Add a monitored control network ping from ilian.gpolab.bbn.com to 192.1.242.4
Verify:
- Active monitoring of the control network is successful
- Each monitored IP is successfully available at least once
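Before wiring the addresses into monitoring, a one-shot reachability check from the monitoring host can confirm that each IP answers at all. This is a spot check, not a substitute for the monitored pings, and the function name `ping_all` and its output format are made up for illustration.

```shell
# ping_all IP...: send one echo request to each address and print the
# ones that do not answer. Returns non-zero if any address is unreachable.
ping_all() {
    down=0
    for ip in "$@"; do
        if ! ping -c 1 "$ip" >/dev/null 2>&1; then
            echo "UNREACHABLE: $ip"
            down=$((down + 1))
        fi
    done
    [ "$down" -eq 0 ]
}

# Example: ping_all 192.1.242.2 192.1.242.3 192.1.242.4
```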
Step 5B: GPO installs active shared dataplane monitoring
(This is GST 3354 item 9.)
Using:
- Add a monitored dataplane network ping from a lab dataplane test host on VLAN 1750 to the rack dataplane
- If necessary, add an OpenFlow controller to handle traffic for the monitoring subnet
Verify:
- Active monitoring of the dataplane network is successful
- The monitored IP is successfully available at least once
Step 5C: GPO gets access to Nagios information about the BBN rack
(This is part of GST 3354 item 10.)
Using:
- Browse to https://bbn-hn.exogeni.net/rack_bbn/
- Log in using LDAP credentials
Verify:
- Login succeeds
- I can see a number of types of devices
- I can click on a problem report and verify its details
Step 5D: GPO receives e-mail about BBN rack Nagios alerts
(This is part of GST 3354 item 10.)
Using:
- Request e-mail notifications for BBN rack Nagios to be sent to GPO ops
- Collect a number of notifications
- Inspect three representative messages
Verify:
- E-mail messages about rack Nagios are received
- For each inspected message, I can determine:
- The affected device
- The affected service
- The type of problem being reported
- The duration of the outage
Step 6: Set up contact info and change control procedures
Step 6A: ExoGENI operations staff should subscribe to response-team
(This is part of GST 3354 item 12.)
Using:
- Ask ExoGENI operators to subscribe exogeni-ops@renci.org (or individual operators) to response-team@geni.net
Verify:
- This subscription has happened. On daulis:
sudo -u mailman /usr/lib/mailman/bin/find_member -l response-team exogeni-ops
Step 6B: ExoGENI operations staff should provide contact info to GMOC
(This is part of GST 3354 item 12.)
Using:
- Ask ExoGENI operators to submit primary and secondary e-mail and phone contact information to GMOC
Verify:
- Browse to https://gmoc-db.grnoc.iu.edu/protected/, log in, and look at the "organizations" table. Make sure either:
- The RENCI contact information is up-to-date and includes exogeni-ops and some reasonable phone numbers
- A new ExoGENI contact has been added
Step 6C: Negotiate an interim change control notification procedure
(This is GST 3354 item 6.)
Using:
- Ask ExoGENI operators to notify either exogeni-design@geni.net or gpo-infra@geni.net about planned outages and changes.
Verify:
- ExoGENI agrees to send notifications about planned outages and changes.