Changes between Initial Version and Version 1 of GENIRacksHome/InstageniRacks/AcceptanceTestStatus/IG-MON-5


Timestamp:
02/28/13 07:47:07 (11 years ago)
Author:
chaos@bbn.com

[[PageOutline]]

= Detailed test plan for IG-MON-5: GMOC Data Collection Test =

''This page is GPO's working page for performing IG-MON-5.  It is public for informational purposes, but it is not an official status report.  See [wiki:GENIRacksHome/InstageniRacks/AcceptanceTestStatus] for the current status of InstaGENI acceptance tests.''

''Last substantive edit of this page: 2013-02-28''

== Status of test ==

|| '''Step''' || '''State'''                    || '''Date completed''' || '''Open Tickets''' || '''Closed Tickets/Comments'''                                                               ||
|| 1A         || [[Color(#63B8FF,In Progress)]] ||                      || [gmocticket:164]   || hindered by slow download of GMOC data because of many interfaces; a workaround is possible ||
|| 1B         ||                                ||                      ||                    ||                                                                                             ||
|| 2A         ||                                ||                      ||                    ||                                                                                             ||
|| 2B         ||                                ||                      ||                    ||                                                                                             ||
|| 3          ||                                ||                      ||                    ||                                                                                             ||
|| 4          ||                                ||                      ||                    ||                                                                                             ||
|| 5          ||                                ||                      ||                    ||                                                                                             ||
|| 6          ||                                ||                      ||                    ||                                                                                             ||
|| 7          ||                                ||                      ||                    ||                                                                                             ||

== High-level description from test plan ==

This test verifies the rack's submission of monitoring data to GMOC.

=== Procedure ===

View the dataset collected at GMOC for the BBN and Utah InstaGENI racks. For each piece of required data, attempt to verify that:

 * The data is being collected and accepted by GMOC and can be viewed at gmoc-db.grnoc.iu.edu.
 * The data's "site" tag indicates that it is being reported for the rack located at the gpolab or Utah site (as appropriate for that rack).
 * The data has been reported within the past 10 minutes.
 * The data is being collected at least once a minute, or it requires more complicated processing than a simple file read to collect and thus can be collected less often.

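The once-a-minute criterion above can be checked mechanically once the sample times for a piece of data are in hand. The following is a minimal sketch, not part of the test plan: the helper name `max_gap_seconds`, the sample timestamps, and the 5-second slack are all hypothetical, and it assumes the timestamps have already been extracted as epoch seconds.
{{{
def max_gap_seconds(timestamps):
    """Return the largest gap, in seconds, between consecutive samples."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError('need at least two samples to measure a gap')
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


# Hypothetical epoch-second sample times for one measurement.
samples = [1362037620, 1362037680, 1362037741, 1362037800]
gap = max_gap_seconds(samples)
print('largest gap: %d s' % gap)
# Allow a few seconds of slack (an assumed tolerance) around the 60 s target.
print('collected at least once a minute: %s' % (gap <= 65))
}}}
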
Verify that the following pieces of data are being reported:

 * Is each of the rack InstaGENI and FOAM AMs reachable via the GENI AM API right now?
 * Is each compute or unbound VLAN resource at each rack AM online? Is it available or in use?
 * Sliver count and percentage of rack compute and unbound VLAN resources in use.
 * Identities of current slivers on each rack AM, including creation time for each.
 * Per-sliver interface counters for compute and VLAN resources (where these values can be easily collected).
 * Is the rack data plane switch online?
 * Interface counters and VLAN memberships for each rack data plane switch interface
 * MAC address table contents for shared VLANs which appear on rack data plane switches
 * Is each rack experimental node online?
 * For each rack experimental node configured as an OpenVZ VM server, overall CPU, disk, and memory utilization for the host, current VM count and total VM capacity of the host.
 * For each rack experimental node configured as an OpenVZ VM server, interface counters for each data plane interface.
 * Results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack.

Verify that per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on the rack, either by providing raw sliver data containing sliver users to GMOC, or by collecting data locally and producing trending summaries on demand.

== Step 1: download all recent GMOC data for examination ==

=== Step 1A: download GMOC relational data objects for BBN and Utah racks ===

==== Overview of Step 1A ====

In a Python shell on a node that has gmoc.py installed and configured with a valid submission username:
{{{
import sys
sys.path.append('/usr/local/lib')  # make gmoc and gmoc_config importable
import gmoc
import gmoc_config

config = gmoc_config.read_config_file()

# Strip the obsolete web-service suffix from the configured URL, if present.
obsolete_url_suffix = 'xchange/webservice.pl'
serviceurl = config['GMOC_REL_URL']
if serviceurl.endswith(obsolete_url_suffix):
    serviceurl = serviceurl[:-len(obsolete_url_suffix)]

# The submission password is the first line of the password file.
with open(config['PASSWD_FILE']) as f:
    gmoc_passwd = f.readline().strip()

client = gmoc.GMOCClient(
    serviceURL=serviceurl,
    username=config['SITENAME'],
    password=gmoc_passwd,
)

# Load the InstaGENI and FOAM aggregates for the BBN and Utah racks.
bbnig = client.load(gmoc.Aggregate('instageni.gpolab.bbn.com:12369'))
uthig = client.load(gmoc.Aggregate('utah.geniracks.net:12369'))
bbnfoam = client.load(gmoc.Aggregate('foam.instageni.gpolab.bbn.com:3626'))
uthfoam = client.load(gmoc.Aggregate('foam.utah.geniracks.net:3626'))
}}}
This should yield four gmoc.Aggregate objects.  Record the collection time of each object, in case the rest of the testing is done later.

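One way to record the collection time is to capture a UTC timestamp at the moment each object is loaded. This is a minimal sketch: `load_with_timestamp` is a hypothetical helper, not part of gmoc.py, and the stand-in loader exists only so the example runs without GMOC access. In the shell from Step 1A, the loader would be something like `lambda name: client.load(gmoc.Aggregate(name))`.
{{{
import time


def load_with_timestamp(loader, name):
    """Call loader(name); return the result and the epoch time of the call."""
    collected_at = time.time()
    return loader(name), collected_at


# Stand-in loader so the sketch runs anywhere; replace it with the real call.
def fake_loader(name):
    return {'name': name}


aggregate_names = [
    'instageni.gpolab.bbn.com:12369',
    'utah.geniracks.net:12369',
    'foam.instageni.gpolab.bbn.com:3626',
    'foam.utah.geniracks.net:3626',
]
collected = dict((n, load_with_timestamp(fake_loader, n)) for n in aggregate_names)
for name in aggregate_names:
    obj, when = collected[name]
    stamp = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(when))
    print('%s collected at %s UTC' % (name, stamp))
}}}
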
=== Step 1B: download recent GMOC measurement data as XML ===

==== Overview of Step 1B ====

Download the most recent XML measurement data from GMOC and pretty-print it to a file:
{{{
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2
from xml.dom import minidom

# Fetch and parse the current measurement data.
usock = urlopen('http://gmoc-db.grnoc.iu.edu/web-services/gen_api.pl')
dom = minidom.parse(usock)
usock.close()

# Save an indented copy for manual inspection.
with open('recent-gmoc.xml', 'w') as f:
    f.write('%s\n' % dom.toprettyxml())
}}}
This should yield an XML file, `recent-gmoc.xml`.

== Step 2: verify AM reachability data ==

=== Step 2A: verify GPO remote testing of AMs ===

==== Overview of Step 2A ====

 * Look in recent-gmoc.xml for data reported by ashur.gpolab.bbn.com containing these tests:
   * geni_am_getversion
   * geni_am_listresources
   for these hosts:
   * instageni.gpolab.bbn.com ProtoGENI
   * foam.instageni.gpolab.bbn.com FOAM
   * utah.geniracks.net ProtoGENI
   * foam.utah.geniracks.net FOAM
 * Make sure each test has recent data (some non-null values)

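As a first pass before reading the file by hand, a plain-text scan can confirm that each test name and host name appears at all. This sketch deliberately makes no assumption about the XML schema, which this page does not describe. It only searches for strings, so a hit still needs manual inspection to confirm that the data is recent and non-null, and a FOAM host name also satisfies the search for the corresponding ProtoGENI host name. The function name `missing_strings` is hypothetical, and the inline sample text is a placeholder rather than real GMOC output.
{{{
def missing_strings(text, required):
    """Return the required strings that do not occur anywhere in text."""
    return [s for s in required if s not in text]


required = [
    'ashur.gpolab.bbn.com',
    'geni_am_getversion',
    'geni_am_listresources',
    'instageni.gpolab.bbn.com',
    'foam.instageni.gpolab.bbn.com',
    'utah.geniracks.net',
    'foam.utah.geniracks.net',
]

# Placeholder text; in practice use open('recent-gmoc.xml').read() instead.
sample_text = 'ashur.gpolab.bbn.com geni_am_getversion utah.geniracks.net'
print(missing_strings(sample_text, required))
}}}
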
=== Step 2B: verify aggregate self-reporting of any data ===

==== Overview of Step 2B ====

In the Python shell from Step 1A:
{{{
# print(...) with a single argument behaves the same in Python 2 and 3.
print(uthig.last_updated)
print(uthfoam.last_updated)
print(bbnig.last_updated)
print(bbnfoam.last_updated)
}}}
Each value should be a timestamp that is recent relative to when the data was collected.

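The recency check can be made explicit by comparing each timestamp with the recorded collection time. The sketch below assumes both values have been converted to epoch seconds; the format of `last_updated` is not documented on this page, so that conversion is left out. The ten-minute threshold is borrowed from the procedure section, and `age_report` and the sample values are hypothetical.
{{{
MAX_AGE_SECONDS = 10 * 60  # "reported within the past 10 minutes"


def age_report(last_updated, collected_at, max_age=MAX_AGE_SECONDS):
    """Map each name to (age in seconds, True if the age is within max_age)."""
    report = {}
    for name, updated in last_updated.items():
        age = collected_at - updated
        report[name] = (age, 0 <= age <= max_age)
    return report


# Hypothetical epoch-second values standing in for the four last_updated fields.
collected_at = 1362037800
last_updated = {
    'uthig': 1362037740,
    'uthfoam': 1362037500,
    'bbnig': 1362037790,
    'bbnfoam': 1362030000,
}
for name, (age, ok) in sorted(age_report(last_updated, collected_at).items()):
    print('%-8s age %5d s  %s' % (name, age, 'recent' if ok else 'STALE'))
}}}
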
== Step 3: verify rack compute resource reporting ==

Fill in details of how to determine:
 * Is each compute or unbound VLAN resource at each rack AM online? Is it available or in use?
 * Is each rack experimental node online?
 * For each rack experimental node configured as an OpenVZ VM server, overall CPU, disk, and memory utilization for the host, current VM count and total VM capacity of the host.
 * Is the rack data plane switch online?

== Step 4: verify ProtoGENI aggregate sliver reporting ==

Fill in details of how to determine:
 * Sliver count and percentage of compute and unbound VLAN resources in use for the rack SM.
 * Identities of current slivers on each rack AM, including creation time for each.

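Whatever the eventual source of the data, the utilization figure itself is simple arithmetic: the number of resources in use divided by the total. The sketch below assumes a hypothetical mapping from resource name to an in-use flag; it does not reflect how gmoc.py actually exposes resources.
{{{
def percent_in_use(resources):
    """Return the percentage of resources whose in-use flag is true."""
    if not resources:
        return 0.0
    in_use = sum(1 for used in resources.values() if used)
    return 100.0 * in_use / len(resources)


# Hypothetical compute resources: name -> currently in use?
compute = {'pc1': True, 'pc2': False, 'pc3': True, 'pc4': False, 'pc5': False}
print('%.1f%% of compute resources in use' % percent_in_use(compute))
}}}
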
== Step 5: verify interface counter reporting ==

Fill in details of how to determine:
 * Per-sliver interface counters for compute and VLAN resources (where these values can be easily collected).
 * Interface counters and VLAN memberships for each rack data plane switch interface
 * MAC address table contents for shared VLANs which appear on rack data plane switches
 * For each rack experimental node configured as an OpenVZ VM server, interface counters for each data plane interface.

== Step 6: verify end-to-end health check ==

 * Results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack.

== Step 7: verify distinct user summarization ==

Verify that per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on the rack, either by providing raw sliver data containing sliver users to GMOC, or by collecting data locally and producing trending summaries on demand.

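If raw sliver data containing sliver users is available, the distinct-user count reduces to collecting users into a set per aggregate. This is a minimal sketch over hypothetical (aggregate, sliver, user) records; the record layout and every name in it are invented for illustration.
{{{
def distinct_users_by_aggregate(records):
    """Return {aggregate: distinct user count} from (aggregate, sliver, user) tuples."""
    users = {}
    for aggregate, _sliver, user in records:
        users.setdefault(aggregate, set()).add(user)
    return dict((aggregate, len(names)) for aggregate, names in users.items())


# Hypothetical sliver records; one user may own several slivers.
records = [
    ('bbn-ig', 'sliver-1', 'alice'),
    ('bbn-ig', 'sliver-2', 'alice'),
    ('bbn-ig', 'sliver-3', 'bob'),
    ('utah-ig', 'sliver-4', 'carol'),
]
print(distinct_users_by_aggregate(records))
}}}
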