Changes between Initial Version and Version 1 of GENIRacksHome/ExogeniRacks/PreliminaryAcceptanceTestReport


Timestamp:
08/15/12 11:12:02 (12 years ago)
Author:
lnevers@bbn.com

     1[[PageOutline]]
     2
     3= ExoGENI Preliminary Acceptance Test Report =
     4
     5This page captures the Acceptance Test findings as of August 15, 2012 for the ExoGENI rack Project as described in the 
     6[ggw:GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan ExoGENI Acceptance Test Plan] page. For individual test status see the [http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus ExoGENI Acceptance Test Status] page.
     7
     8
     9= Experimenter Test Case Status =
     10
     11This section provides a summary of major features that have been validated as well as the current list of outstanding issues for the Experimenter test cases.
     12
     13== Functionality verified ==
     14
     15Significant progress has been made on the Experimenter test cases. The following features have been validated:
     16 * Support for at least one bare metal compute resource in each rack.
     17 * Support for VM and bare metal compute resources simultaneously in a single rack.
     18 * Support for !OpenFlow intra-rack and inter-rack.
     19 * Support for External OF and non-OF VLAN connection through the rack.
     20 * Shared and Dynamic VLAN support (intra-rack and between racks).
     21 * Meso-scale interoperability.
     22 * Federation with GPO PG.
     23 * GENI V3 RSpec Support.
     24 
     25== Outstanding issues ==
     26
     27Various issues were found while running the experimenter test cases. Most have been addressed; those that remain are categorized below:
     28
     29=== High impact ===
     30
     31These issues have a high impact on the ability to support experimenters in the GENI environment.
     32
     33 * Neither sliver counts nor resource counts for the ExoGENI slices are available to experimenters. (All EG-EXP)
     34
     35=== Medium impact ===
     36
     37These issues significantly affect functions requested by GENI experimenters, but are not needed for Spiral 4-scale deployment.
     38
     39 * Microsoft Windows support on bare metal nodes is not available. Microsoft Windows support on a VM has been discussed, but is also not available. (EG-EXP-1)
     40 * There is no custom OS image support for Experimenters. (EG-EXP-2)
     41 * Bare metal nodes only support one Operating System version at this time, CentOS release 6.3, which is provided by the ExoGENI Team. (EG-EXP-1)
     42 * Provisioning problems may occur on bare metal nodes, leaving the node not fully configured. (EG-EXP-1)
     43 * A documented procedure for modifying the compute resource allocation in a rack to add bare metal nodes is not available and not planned at this time. (EG-EXP-1)
     44 * Large numbers of VMs cannot be requested successfully. Various issues have come up, with the most common failures occurring during the resource provisioning phase for both compute resources and VLANs. (EG-EXP-3)
     45
     46=== Low impact ===
     47
     48These issues should be resolved and documented, but racks can begin operations without them.
     49
     50 * OS image availability is not part of the advertisement RSpec. (All EG-EXP)
     51 * The default expectation that a client_id in the Manifest RSpec matches the client_id in the Request RSpec is not met. The AM API Acceptance test must be modified to expect unbound resources; with this change in place, there is compliance.
     52 * There is no monitoring support for bare metal nodes available to experimenters. This should be documented for both operational and I&M monitoring impact. (All EG-EXP)
     53 * The Image Playpen system is planned but not yet available.
     54
     55= Administration and Monitoring Test Cases =
     56
     57This section summarizes status and outstanding issues for the ExoGENI rack administration and monitoring tests.  In particular, this report covers the EG-ADM and EG-MON tests, as outlined in [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan the acceptance test plan].
     58
     59== Functionality verified ==
     60
     61 * LDAP credentials allow site admin access to head node, worker nodes, switches, nagios, and wiki (EG-ADM-1)
     62 * Nagios provides rack status information and alerting (EG-ADM-1)
     63 * Interim change/outage notification via mailing lists works (EG-ADM-1)
     64 * Site inventory of parts and connectivity using our standard site processes was possible (though laborious) (EG-ADM-1)
     65 * Rack control network remote access security seems reasonable provided the head node and SSG5 implement network separation well (EG-ADM-2)
     66 * Site admins can get !OpenFlow state information from the switch, FlowVisor, and FOAM (EG-MON-2)
     67 * Site admins can inspect ORCA aggregate state and resources on the rack (EG-MON-2)
     68
     69== Outstanding issues ==
     70
     71This section describes problems found in testing, for which a fix is still in progress.
     72
     73=== High impact ===
     74
     75These issues risk security and stability problems for host sites.
     76
     77 * The VLAN-based separation of control networks, enforced by bbn-hn and bbn-ssg, still needs to be fully documented, implemented, and verified.  This is additionally concerning because of the complexity of the control network's topology. (EG-ADM-2)
     78
     79 * Both public and private documentation have been reviewed. Some documentation was found, but most is still unavailable or incomplete.  For details, see the [http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-7 EG-ADM-7 status] page.
     80
     81=== Moderate impact ===
     82
     83These issues cause substantial annoyance for site administrators and make day-to-day troubleshooting more difficult.
     84
     85 * DNS is not fully defined for all public and private IPs.  Private DNS in particular has been brittle: it breaks frequently, and includes only a fraction of addresses in use. (EG-ADM-1)
     86 * ExoGENI offered SVN/rancid switch configuration polling as an alternative to privileged site admin access to switches.  This functionality would be acceptable if it worked, but has been partially broken for some time. (EG-ADM-1)
     87 * Site admin access policies (e.g. what level of access site admins should expect to a particular service or host) are not well-defined, and bitrot of site admin access is common (i.e. when we need to access something which worked at one time, we often find that it no longer works). (EG-ADM-1)
     88
     89=== Infrequent impact ===
     90
     91These issues should be resolved so that site administrators can get information they need, especially since information may be needed in a hurry to investigate a security or stability problem.  However, asking RENCI is an acceptable short-term alternative to self-service capability, provided RENCI is responsive.
     92
     93 * We have not yet been able to locate sources for all software running on bbn-hn.  More iteration is needed on both sides on this. (EG-MON-1)
     94
     95= Test Case Status =
     96
     97== Test cases in progress ==
     98As of 2012-08-14, the following are in progress:
     99
     100 * Verification that VLANs and MAC addresses on the control switch are as expected (EG-MON-1)
     101 * Verification that site admins can get information about active and recently-terminated experiments on the rack (EG-MON-3)
     102 * Verification of a multi-site, multi-experiment, multi-experimenter, multi-subnet, multi-controller, mesoscale-interoperability !OpenFlow scenario (EG-EXP-6)
     103
     104== Tests which have not been run ==
     105
     106=== Tests not run due to stability concerns: moderate impact ===
     107
     108GPO did not run some tests because of concerns that they would put the rack into a bad state and be time-consuming to debug.  These are tests of the rack's response to likely outage and failure modes, and we would like to verify that the rack is stable enough to handle these conditions:
     109 * Rack reboot test (EG-ADM-3)
     110 * Control network disconnection test (EG-ADM-6)
     111
     112=== Tests not run due to higher-priority testing: low impact ===
     113
     114GPO did not run some tests because the benefit during this spiral is low, or because prerequisites are not met.  We will revisit this functionality in Spiral 5.
     115
     116 * Software update test (EG-ADM-5)
     117 * Infrastructure device performance test (EG-MON-4)
     118
     119=== Tests to be run before GEC15 ===
     120
     121These tests have not yet been run because prerequisites were not met, but they have been discussed and will be run in the GEC14-GEC15 period.
     122
     123 * Emergency Stop test (EG-ADM-4)
     124 * GMOC data collection test (EG-MON-5)
     125
     126
     127
     128
     129= Test Case Descriptions and Status =
     130
     131The full descriptions, and status where available, of the tests covered by this report, are linked here for convenience:
     132 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-1:RackReceiptandInventoryTest EG-ADM-1]: Rack Receipt and Inventory Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-1 status])
     133 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-2:RackAdministratorAccessTest EG-ADM-2]: Rack Administrator Access Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-2 status])
     134 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-3:FullRackRebootTest EG-ADM-3]: Full Rack Reboot Test
     135 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-4:EmergencyStopTest EG-ADM-4]: Emergency Stop Test
     136 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-5:SoftwareUpdateTest EG-ADM-5]: Software Update Test
     137 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-6:ControlNetworkDisconnectionTest EG-ADM-6]: Control Network Disconnection Test
     138 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-ADM-7:DocumentationReviewTest EG-ADM-7]: Documentation Review Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-ADM-7 status])
     139
     140 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-MON-1:ControlNetworkSoftwareandVLANInspectionTest EG-MON-1]: Control Network Software and VLAN Inspection Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-MON-1 status])
     141 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-MON-2:GENISoftwareConfigurationInspectionTest EG-MON-2]: GENI Software Configuration Inspection Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-MON-2 status])
     142 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-MON-3:GENIActiveExperimentInspectionTest EG-MON-3]: GENI Active Experiment Inspection Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-MON-3 status])
     143 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-MON-4:InfrastructureDevicePerformanceTest EG-MON-4]: Infrastructure Device Performance Test
     144 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-MON-5:GMOCDataCollectionTest EG-MON-5]: GMOC Data Collection Test
     145
     146 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-1:BareMetalSupportAcceptanceTest EG-EXP-1]: Bare Metal Support Acceptance Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-1 status])
     147 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-2:ExoGENISingleSiteAcceptanceTest EG-EXP-2]: ExoGENI Single Site Acceptance Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-2 status])
     148 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-3:ExoGENISingleSite100VMTest EG-EXP-3]: ExoGENI Single Site 100 VM Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-3 status])
     149 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-4:ExoGENIMulti-siteAcceptanceTest EG-EXP-4]: ExoGENI Multi-site Acceptance Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-4 status])
     150 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-5:ExoGENIOpenFlowNetworkResourcesAcceptanceTest EG-EXP-5]: ExoGENI !OpenFlow Network Resources Acceptance Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-5 status])
     151 * [http://groups.geni.net/geni/wiki/GENIRacksHome/AcceptanceTests/ExogeniAcceptanceTestsPlan#EG-EXP-6:ExoGENIandMeso-scaleMulti-siteOpenFlowAcceptanceTest EG-EXP-6]: ExoGENI and Meso-scale Multi-site !OpenFlow Acceptance Test ([http://groups.geni.net/geni/wiki/GENIRacksHome/ExogeniRacks/AcceptanceTestStatus/EG-EXP-6 status])
     152
     153