Detailed test plan for EG-MON-5: GMOC Data Collection Test
- Status of test
- High-level description from test plan
- Step 1: download all recent GMOC data for examination
- Step 2: verify AM reachability data
- Step 3: verify rack compute resource reporting
- Step 4: verify ORCA aggregate sliver reporting
- Step 5: verify interface counter reporting
- Step 6: verify end-to-end health check
- Step 7: verify distinct user summarization
Detailed test plan for EG-MON-5: GMOC Data Collection Test
This page is GPO's working page for performing EG-MON-5. It is public for informational purposes, but it is not an official status report. See GENIRacksHome/ExogeniRacks/AcceptanceTestStatus for the current status of ExoGENI acceptance tests.
Last substantive edit of this page: 2013-02-26
Status of test
Step | State | Date completed | Open Tickets | Closed Tickets/Comments |
1A | Blocked | | 164 | blocked on slow download of GMOC data because of many interfaces; GMOC is working on this |
1B | | | | |
2A | | | | |
2B | | | | |
3 | | | | |
4 | | | | |
5 | | | | |
6 | | | | |
7 | | | | |
High-level description from test plan
This test verifies the rack's submission of monitoring data to GMOC.
Procedure
View the dataset collected at GMOC for the BBN and RENCI ExoGENI racks. For each piece of required data, attempt to verify that:
- The data is being collected and accepted by GMOC and can be viewed at gmoc-db.grnoc.iu.edu
- The data's "site" tag indicates that it is being reported for the rack located at the gpolab or RENCI site (as appropriate for that rack).
- The data has been reported within the past 10 minutes.
- For each piece of data, either verify that it is being collected at least once a minute, or verify that it requires more complicated processing than a simple file read to collect, and thus can be collected less often.
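The recency and collection-frequency criteria above can be expressed as a small check. This is a minimal sketch, not part of the GMOC tooling: it assumes the sample times for one metric have already been extracted as Unix epoch seconds, and the function name check_samples and its parameters are hypothetical.

```python
import time


def check_samples(timestamps, now=None, max_age=600, max_gap=60):
    """Check one metric's sample times (Unix epoch seconds; hypothetical input).

    Returns (is_recent, is_frequent):
      is_recent   -- the newest sample is at most max_age seconds old
      is_frequent -- no two consecutive samples are more than max_gap apart
                     (vacuously True when there is only one sample)
    """
    if now is None:
        now = time.time()
    if not timestamps:
        return (False, False)
    ordered = sorted(timestamps)
    is_recent = (now - ordered[-1]) <= max_age
    # Gaps between each sample and the one that follows it
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    is_frequent = all(gap <= max_gap for gap in gaps)
    return (is_recent, is_frequent)
```

A metric for which is_frequent is False does not necessarily fail the test: per the last criterion, it may simply require more complicated processing than a file read and so be collected less often.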
Verify that the following pieces of data are being reported:
- Is each of the rack's ExoGENI and FOAM AMs reachable via the GENI AM API right now?
- Is each compute or unbound VLAN resource at each rack AM online? Is it available or in use?
- Sliver count and percentage of compute and unbound VLAN resources in use for the rack SM.
- Identities of current slivers on each rack AM, including creation time for each.
- Per-sliver interface counters for compute and VLAN resources (where these values can be easily collected).
- Is the rack data plane switch online?
- Interface counters and VLAN memberships for each rack data plane switch interface
- MAC address table contents for shared VLANs which appear on rack data plane switches
- Is each rack worker node online?
- For each rack worker node configured as an OpenStack VM server, overall CPU, disk, and memory utilization for the host, current VM count and total VM capacity of the host.
- For each rack worker node configured as an OpenStack VM server, interface counters for each data plane interface.
- Results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack.
Verify that per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on the rack, either by providing raw sliver data containing sliver users to GMOC, or by collecting data locally and producing trending summaries on demand.
Step 1: download all recent GMOC data for examination
Step 1A: download GMOC relational data objects for BBN and RCI racks
Overview of Step 1A
Within a python shell on a node which has gmoc.py installed and configured with a valid submission username:
import sys
sys.path.append('/usr/local/lib')
import gmoc
import gmoc_config

config = gmoc_config.read_config_file()

# Strip the obsolete web-service suffix from the configured URL, if present
obsolete_url_suffix = 'xchange/webservice.pl'
serviceurl = config['GMOC_REL_URL']
if serviceurl.endswith(obsolete_url_suffix):
    serviceurl = serviceurl[:-len(obsolete_url_suffix)]

# Read the submission password from the first line of the password file
f = open(config['PASSWD_FILE'])
gmoc_passwd = f.readline().strip()
f.close()

client = gmoc.GMOCClient(
    serviceURL = serviceurl,
    username = config['SITENAME'],
    password = gmoc_passwd,
)

# Load the ExoGENI and FOAM aggregate objects for the BBN and RCI racks
bbnxo = client.load(gmoc.Aggregate('bbn-hn.exogeni.net:11443'))
rcixo = client.load(gmoc.Aggregate('rci-hn.exogeni.net:11443'))
bbnfoam = client.load(gmoc.Aggregate('bbn-hn.exogeni.gpolab.bbn.com:3626'))
rcifoam = client.load(gmoc.Aggregate('rci-hn.exogeni.net:3626'))
This should yield four gmoc.Aggregate objects.
Step 1B: download recent GMOC measurement data as XML
Overview of Step 1B
Pretty-print the most recent XML measurement data download from GMOC:
import urllib
from xml.dom import minidom

usock = urllib.urlopen('http://gmoc-db.grnoc.iu.edu/web-services/gen_api.pl')
dom = minidom.parse(usock)
f = open('recent-gmoc.xml', 'w')
f.write('%s\n' % dom.toprettyxml())
f.close()
This should yield an XML file, recent-gmoc.xml.
Step 2: verify AM reachability data
Step 2A: verify GPO remote testing of AMs
Overview of Step 2A
- Look in recent-gmoc.xml for data reported by ashur.gpolab.bbn.com containing these tests:
- geni_am_getversion
- geni_am_listresources for these hosts:
- rci-hn.exogeni.net ORCA
- rci-hn.exogeni.net FOAM
- bbn-hn.exogeni.gpolab.bbn.com ORCA
- bbn-hn.exogeni.gpolab.bbn.com FOAM
- Make sure each test has recent data (some non-null values)
Step 2B: verify aggregate self-reporting of any data
Overview of Step 2B
In the python shell:
print rcixo.last_updated
print rcifoam.last_updated
print bbnxo.last_updated
print bbnfoam.last_updated
Each of those dates should be a recent timestamp.
Step 3: verify rack compute resource reporting
Fill in details of how to determine:
- Is each compute or unbound VLAN resource at each rack AM online? Is it available or in use?
- Is each rack worker node online?
- For each rack worker node configured as an OpenStack VM server, overall CPU, disk, and memory utilization for the host, current VM count and total VM capacity of the host.
- Is the rack data plane switch online?
Step 4: verify ORCA aggregate sliver reporting
Fill in details of how to determine:
- Sliver count and percentage of compute and unbound VLAN resources in use for the rack SM.
- Identities of current slivers on each rack AM, including creation time for each.
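Until the details for this step are filled in, the first item can be illustrated with a small calculation. This is only a sketch: the record layout (a list of dicts with 'type', 'state', and 'sliver' keys) and the function name summarize_usage are hypothetical, and they do not reflect how ORCA or GMOC actually represent resources.

```python
def summarize_usage(resources):
    """Summarize hypothetical resource records.

    Each record is a dict with:
      'type'   -- for example 'compute' or 'vlan'
      'state'  -- 'in_use' or 'available'
      'sliver' -- sliver identifier when in use, else None
    Returns (sliver_count, {type: percent_in_use}).
    """
    slivers = set()
    totals = {}
    in_use = {}
    for res in resources:
        rtype = res['type']
        totals[rtype] = totals.get(rtype, 0) + 1
        if res['state'] == 'in_use':
            in_use[rtype] = in_use.get(rtype, 0) + 1
            if res.get('sliver') is not None:
                slivers.add(res['sliver'])
    percent = {}
    for rtype, total in totals.items():
        percent[rtype] = 100.0 * in_use.get(rtype, 0) / total
    return (len(slivers), percent)
```

Comparing a locally computed summary like this against the values shown at GMOC is one way to confirm that the reported sliver count and utilization are plausible.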
Step 5: verify interface counter reporting
Fill in details of how to determine:
- Per-sliver interface counters for compute and VLAN resources (where these values can be easily collected).
- Interface counters and VLAN memberships for each rack data plane switch interface
- MAC address table contents for shared VLANs which appear on rack data plane switches
- For each rack worker node configured as an OpenStack VM server, interface counters for each data plane interface.
Step 6: verify end-to-end health check
- Results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack.
Step 7: verify distinct user summarization
Verify that per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on the rack, either by providing raw sliver data containing sliver users to GMOC, or by collecting data locally and producing trending summaries on demand.
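If raw sliver data is available locally, a distinct-user summary can be produced with a simple aggregation. The following is a sketch under stated assumptions: the input format (tuples of aggregate name, user identifier, and creation date), the function name, and the sample names are all hypothetical, and nothing here reflects the actual GMOC or ORCA data model.

```python
from collections import defaultdict
from datetime import date


def distinct_users_by_month(slivers):
    """Count distinct users per (aggregate, 'YYYY-MM').

    slivers is an iterable of hypothetical (aggregate, user, created)
    tuples, where created is a datetime.date.
    """
    users = defaultdict(set)
    for aggregate, user, created in slivers:
        users[(aggregate, created.strftime('%Y-%m'))].add(user)
    return {key: len(names) for key, names in users.items()}


# Invented sample data, for illustration only
sample = [
    ('bbn-rack', 'alice', date(2013, 1, 5)),
    ('bbn-rack', 'bob', date(2013, 1, 20)),
    ('bbn-rack', 'alice', date(2013, 1, 28)),
    ('bbn-rack', 'alice', date(2013, 2, 2)),
    ('rci-rack', 'bob', date(2013, 2, 10)),
]
summary = distinct_users_by_month(sample)
```

Grouping by month is an arbitrary choice for the sketch; any trending interval would work the same way by changing the key.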