
OpenGENI Acceptance Test Plan

This page captures the GENI Racks Acceptance Test Plan to be executed for the BBN GENI Rack Aggregate Manager (OpenGENI). This test plan is based on the GENI Racks Requirements and outlines all features that are normally validated for GENI Racks. The goal of this effort is to capture the state of the current features and to generate a list of missing features that are required to meet all GENI Racks requirements.

The BBN OpenGENI Acceptance Test effort will generate the following:

Assumptions and Dependencies

The following assumptions are made for all tests described in this plan:

  • OpenGENI Clearinghouse credentials will be used for all tests.
  • OpenGENI is the slice authority for all tests in this plan.
  • No tests using the GENI Clearinghouse and GPO ProtoGENI credentials are planned for the initial tests.
  • Resources for each test will be requested from the OpenGENI Aggregate Manager.
  • Compute resources are VMs unless otherwise stated; there are no dedicated devices available for this initial evaluation.
  • All Aggregate Manager requests are made via the Omni command line tool which uses the GENI AM API.
  • In all scenarios, one experiment is always equal to one slice.
  • Currently there is only one OpenGENI rack, which has the following test implications: all scenarios that are meant to be run on 1 rack will be run within 1 VM, and all scenarios that are meant to have multiple racks will use VMs on multiple servers.
  • OpenGENI will be used as the interface to the rack OpenFlow resources in the OpenFlow test cases.

It is expected that the OpenGENI Aggregate Manager will provide an interface into the VLAN-based Multiplexed OpenFlow Controller (VMOC) aggregate. If the OpenGENI interface to VMOC is not available, tests will be executed by submitting requests directly to VMOC.

If the OpenGENI solution does not provide support for experimenters uploading custom VM images to the rack, any test case using custom images will be modified to use available images for the rack. The ability to upload a custom VM image to the OpenGENI rack will be tested when it becomes available.

Test Traffic Profile:

  • Experiment traffic includes UDP and TCP data streams at low rates to ensure end-to-end delivery.
  • Traffic exchange is used to verify that the appropriate data paths are used and that traffic is delivered successfully for each test described.
  • Performance measurements are not a goal of these acceptance tests, but some samples will be collected with iperf to characterize the default performance in some of the scenarios described in this plan.
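As an illustration of the low-rate delivery check described above, the following is a minimal sketch that sends a few sequence-numbered UDP datagrams and records which ones arrive. It is not the tooling used in this plan: the function names, packet count, interval, and payload size are all assumptions made for the example, and it runs over the loopback interface rather than a rack data plane.

```python
import socket
import time


def send_low_rate_udp(dest, count=5, interval_s=0.01, payload_size=64):
    """Send `count` sequence-numbered datagrams to `dest` at a fixed low rate."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        for seq in range(count):
            # The first 4 bytes carry the sequence number; the rest is padding.
            payload = seq.to_bytes(4, "big") + b"\x00" * (payload_size - 4)
            tx.sendto(payload, dest)
            time.sleep(interval_s)


def receive_udp(rx, expected, timeout_s=1.0):
    """Collect sequence numbers until `expected` arrive or the socket times out."""
    rx.settimeout(timeout_s)
    seen = set()
    try:
        while len(seen) < expected:
            data, _ = rx.recvfrom(2048)
            seen.add(int.from_bytes(data[:4], "big"))
    except socket.timeout:
        pass  # Return whatever arrived before the timeout.
    return seen


def loopback_delivery_check(count=5):
    """Hypothetical self-test: send to a receiver bound on 127.0.0.1."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx:
        rx.bind(("127.0.0.1", 0))  # Port 0 lets the OS pick a free port.
        send_low_rate_udp(rx.getsockname(), count=count)
        return receive_udp(rx, expected=count)
```

In an actual test run the sender and receiver would be on different VMs, and any missing sequence numbers would indicate loss on the data path.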

Acceptance Tests Descriptions

This section describes each acceptance test by defining its goals, topology, and outline test procedure. Test cases are listed by priority in sections below. The cases that verify the largest number of requirement criteria are typically listed at a higher priority. The prerequisite tests are usually executed first to verify that baseline monitoring and administrative functions are available. This allows the execution of the experimenter test cases. Additional monitoring and administrative tests described in later sections are also run before the completion of the acceptance test effort.

For the OpenGENI Acceptance Test evaluation, some of these administrative and monitoring features may not be available, but the tests are still planned in order to capture the availability of expected features.

Administration Prerequisite Tests

Administrative Acceptance tests verify support of administrative management tasks and focus on verifying priority functions for each of the rack components. The set of administrative features described in this section are verified initially. Additional administrative tests are described in a later section and are executed before the acceptance test completion.

OG-ADM-1: Rack Receipt and Inventory Test

This acceptance test uses BBN as an example site because it requires physical access to the rack. The goal of this test is to verify that administrators can integrate the rack into standard local procedures for systems hosted by the site.

Procedure

Outline:

  • Power and wire the BBN rack
  • Administrator configures the gramm.gpolab.bbn.com DNS namespace and 192.1.242.128/25 IP space, and enters all public IP addresses used by the rack into DNS.
  • Administrator requests and receives administrator accounts on the rack and receives read access to all OpenGENI monitoring of the rack.
  • Administrator inventories the physical rack contents, network connections and VLAN configuration, and power connectivity, using standard operational inventories.
  • Administrator, OpenGENI team, and GMOC share information about contact information and change control procedures, and OpenGENI operators subscribe to GENI operations mailing lists and submit their contact information to GMOC.
  • Administrator reviews the documented parts list, power requirements, physical and logical network connectivity requirements, and site administrator community requirements, verifying that these documents should be sufficient for a new site to use when setting up a rack.
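The DNS and IP space step above can be cross-checked with a short script. The sketch below uses Python's standard `ipaddress` module to test whether inventoried addresses fall inside the 192.1.242.128/25 block named in the outline. The host names and addresses in the example inventory are hypothetical placeholders, not actual rack assignments.

```python
import ipaddress

# The IP space named in the procedure outline.
RACK_NETWORK = ipaddress.ip_network("192.1.242.128/25")


def outside_rack_space(inventory, network=RACK_NETWORK):
    """Return the names whose address is not inside the rack's IP space."""
    return sorted(
        name
        for name, addr in inventory.items()
        if ipaddress.ip_address(addr) not in network
    )


# Hypothetical inventory entries, for illustration only.
example_inventory = {
    "host-a.example": "192.1.242.130",
    "host-b.example": "192.1.242.100",  # Below the /25 block.
}

stray = outside_rack_space(example_inventory)
```

A /25 block holds 128 addresses, so the space runs from 192.1.242.128 through 192.1.242.255, and any inventoried public address outside that range would need to be explained.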

OG-ADM-2: Rack Administrator Access Test

This test verifies local and remote administrative access to rack devices.

Procedure

Outline:

  1. For each type of rack infrastructure node, including VM server hosts and any VMs running infrastructure support services, use a site administrator account to test:
    • Login to the node using public-key SSH.
    • Verify that you cannot login to the node using password-based SSH, nor via any unencrypted login protocol.
    • When logged in, run a command via sudo to verify root privileges.
  2. For each rack infrastructure device (switches, remote PDUs if any), use a site administrator account to test:
    • Login via SSH.
    • Login via a serial console (if the device has one).
    • Verify that you cannot login to the device via an unencrypted login protocol.
    • Use the "enable" command or equivalent to verify privileged access.
  3. Verify that the OpenGENI remote console solution for rack hosts can be used to access the consoles of all server hosts and experimental hosts:
    • Login via SSH or other encrypted protocol.
    • Verify that you cannot login via an unencrypted login protocol.
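One possible way to script the "no password-based SSH" check is sketched below. It assumes an OpenSSH client, which on a failed login typically prints a message of the form "Permission denied (publickey,password)." listing the methods the server still offers. The helper names are hypothetical, the host and user passed in would be site-specific, and the exact message format should be confirmed against the client version in use.

```python
import re


def auth_probe_command(host, user):
    """Build an ssh command that attempts no authentication method at all.

    BatchMode suppresses prompts, so the command fails fast and the
    error text lists the methods the server is willing to accept.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "PreferredAuthentications=none",
        "-o", "ConnectTimeout=5",
        f"{user}@{host}",
        "true",
    ]


def offered_auth_methods(stderr_text):
    """Parse the method list out of an OpenSSH 'Permission denied (...)' line."""
    match = re.search(r"Permission denied \(([^)]*)\)", stderr_text)
    return set(match.group(1).split(",")) if match else set()


def password_login_disabled(stderr_text):
    """True when a method list was found and it offers no password-style login."""
    methods = offered_auth_methods(stderr_text)
    return bool(methods) and not ({"password", "keyboard-interactive"} & methods)
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stderr` fed to `password_login_disabled`. Unencrypted protocols such as telnet still need a separate check.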

Monitoring Rack Inspection Prerequisite Tests

These tests verify the availability of information needed to determine rack state, and needed to debug problems during experimental testing. Also verified is the ability to determine the rack components' test-readiness. Additional monitoring tests are defined in a later section to complete the validation in this section.

OG-MON-1: Control Network Software and VLAN Inspection Test

This test inspects the state of the rack control network, infrastructure nodes, and system software.

Procedure

  • A site administrator enumerates the processes that listen for network connections from other nodes on each server host, the Control VM, the VMOC VM, etc., and on an experimental node configured for OpenStack; identifies what version of what software package is in use for each; and verifies that we know the source of each piece of software and could get access to its source code.
  • A site administrator reviews the configuration of the rack control plane switch and verifies that each experimental node's control and console access interfaces are on the expected VLANs.
  • A site administrator reviews the MAC address table on the control plane switch, and verifies that all entries are identifiable and expected.
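The MAC address table review amounts to a set comparison between what the switch reports and what the inventory expects. A minimal sketch follows; the function names and sample addresses are illustrative assumptions, and how the table is exported from the switch is vendor-specific and not shown here.

```python
HEX_DIGITS = "0123456789abcdef"


def normalize_mac(mac):
    """Convert common MAC notations (aa:bb.., aa-bb.., aabb.ccdd..) to aa:bb:cc:dd:ee:ff."""
    digits = "".join(c for c in mac.lower() if c in HEX_DIGITS)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def unidentified_macs(observed, inventory):
    """Return observed MACs that do not appear in the expected inventory."""
    known = {normalize_mac(m) for m in inventory}
    return sorted(
        {normalize_mac(m) for m in observed if normalize_mac(m) not in known}
    )


# Hypothetical sample data, for illustration only.
observed_table = ["00:1A:2B:3C:4D:5E", "001a.2b3c.4d5f"]
expected_inventory = {"00-1a-2b-3c-4d-5e": "control interface of a server host"}

unknown = unidentified_macs(observed_table, expected_inventory)
```

Any address left in `unknown` is an entry the administrator would need to identify before the test passes.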

OG-MON-2: GENI Software Configuration Inspection Test

This test inspects the state of the GENI AM software in use on the rack.

Procedure

  • A site administrator uses available system data sources (process listings, monitoring output, system logs, etc) and/or AM administrative interfaces to determine the configuration of OpenGENI resources:
    • How many experimental nodes are available for bare metal use, how many are configured as OpenStack containers, and how many are configured as PlanetLab containers.
    • What operating system each OpenStack container makes available for experimental VMs.
    • How many unbound VLANs are in the rack's available pool.
    • Whether the OpenGENI and OpenFlow AMs trust the pgeni.gpolab.bbn.com slice authority, which will be used for testing. Note that the pgeni.gpolab.bbn.com slice authority is not used in this test; a local slice authority is used for the initial evaluation.
  • A site administrator uses available system data sources to determine the configuration of OpenFlow resources according to VMOC and OpenGENI.

OG-MON-3: GENI Active Experiment Inspection Test

This test inspects the state of the rack data plane and control networks when experiments are running, and verifies that a site administrator can find information about running experiments.

Procedure

  • An experimenter starts up experiments to ensure there is data to look at:
    • An experimenter runs an experiment containing at least one rack OpenStack VM, and terminates it.
    • An experimenter runs an experiment containing at least one rack OpenStack VM, and leaves it running.
  • A site administrator uses available system and experiment data sources to determine current experimental state, including:
    • How many VMs are running and which experimenters own them
    • How many physical hosts are in use by experiments, and which experimenters own them
    • How many VMs were terminated within the past day, and which experimenters owned them
    • What OpenFlow controllers the data plane switch and the rack VMOC are communicating with
  • A site administrator examines the switches and other rack data sources, and determines:
    • What MAC addresses are currently visible on the data plane switch and what experiments do they belong to?
    • For some experiment which was terminated within the past day, what data plane and control MAC and IP addresses did the experiment use?
    • For some experimental data path which is actively sending traffic on the data plane switch, do changes in interface counters show approximately the expected amount of traffic into and out of the switch?
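The interface counter check in the last item is a small piece of arithmetic: a stream at a known rate for a known duration should move roughly rate × duration / 8 bytes. The sketch below shows that comparison. The 10% tolerance is an assumption chosen for the example, since counters also include framing overhead and unrelated traffic.

```python
def expected_bytes(rate_bps, duration_s):
    """Bytes a stream at `rate_bps` bits per second carries in `duration_s` seconds."""
    return rate_bps * duration_s / 8


def counters_match(before, after, rate_bps, duration_s, tolerance=0.10):
    """True when the counter delta is within `tolerance` of the expected volume.

    `tolerance` is an assumed fraction, not a value taken from the test plan.
    """
    observed = after - before
    target = expected_bytes(rate_bps, duration_s)
    return abs(observed - target) <= tolerance * target


# Example: a 1 Mbit/s stream for 60 seconds should move about 7,500,000 bytes.
volume = expected_bytes(1_000_000, 60)
```

The same comparison would be made once for the ingress port counter and once for the egress port counter on the data plane switch.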

Experimenter Acceptance Tests

For the OpenGENI Acceptance Test evaluation, some of the topologies normally validated in GENI Racks are not possible because this effort has only one rack available. Each test case is described as originally intended, with additional details showing how the test case is modified for the initial OpenGENI evaluation. Some topologies may not be available, but the tests are still planned as intended in order to capture the availability of expected features.

OG-EXP-1: Bare Metal Support Acceptance Test

Bare metal nodes are exclusive dedicated physical nodes that are used throughout the experimenter test cases. This section outlines features to be verified which are not explicitly validated in other scenarios:

  1. Determine which nodes can be used as exclusive nodes.
  2. Obtain 2 licensed recent Microsoft OS images for physical nodes from the site (BBN).
  3. Reserve and boot 2 physical nodes using Microsoft image.
  4. Obtain a recent Linux OS image for physical nodes from the OpenGENI list.
  5. Reserve and boot a physical node using this Linux OS image.
  6. Release physical node resource.
  7. Modify Aggregate resource allocation for the rack to add 1 additional physical node.

Evaluation Note: 1) No Bare Metal Nodes are available. 2) There is MS Windows custom Linux support for the admin. 3) There is no custom image support for experimenters. 4) There is no way to modify a resource from VM to Bare Metal.

OG-EXP-2: OpenGENI Single Site Acceptance Test

This one-site test is run on the BBN OpenGENI rack and includes two experiments. Each experiment requests local compute resources, which generate bidirectional traffic over a Layer 2 data plane network connection. The goals of this test are to verify basic operations of VMs and data flows within one rack; verify the ability to request a publicly routable IP address and public TCP/UDP port mapping for a control interface on a compute resource; and verify the ability to add a customized image for the rack.

Test Topology

This test uses this topology:

Note: The diagram shows the logical end-points for each experiment traffic exchange. The VMs may or may not be on different experiment nodes.

For the initial evaluation there are no bare metal nodes, so the test case is modified to have only VMs. Here is the actual topology run:

Evaluation Note: The test case is described as originally intended; the actual procedure will be captured as part of the test details available from the Acceptance Test Status page.

Prerequisites

This test has these prerequisites:

  • OpenGENI makes available at least two Linux distributions and a FreeBSD image. If these are not available, the test will be run with available images.
  • Two GPO customized Ubuntu image snapshots are available and have been manually uploaded by the rack administrator using available OpenGENI documentation. One Ubuntu image is for the VM and one Ubuntu image is for the physical node in this test. Physical nodes are not available, so VMs will be used.
  • Traffic generation tools may be part of image or may be installed at experiment runtime.
  • Administrative accounts have been created for GPO staff on the BBN OpenGENI rack.
  • GENI Experimenter1 and Experimenter2 accounts exist at the GPO PG Clearinghouse.
  • If available, use baseline Monitoring to ensure that any problems are quickly identified.

Procedure

Do the following:

  1. As Experimenter1, request ListResources from BBN OpenGENI.
  2. Review advertisement RSpec for a list of OS images which can be loaded, and identify available resources.
  3. Verify that the GPO Ubuntu customized image is available in the advertisement RSpec.
  4. Define a request RSpec for two VMs, each with a GPO Ubuntu image. Request a publicly routable IP address and public TCP/UDP port mapping for the control interface on each node.
  5. Create the first slice.
  6. Create a sliver in the first slice, using the RSpec defined in step 4.
  7. Log in to each of the systems, and send traffic to the other system sharing a VLAN.
  8. Using root privileges on one of the VMs load a Kernel module. If not supported on OpenStack nodes, testing will proceed past this step.
  9. Run a netcat listener and bind to port XYZ on each of the VMs in the BBN OpenGENI rack.
  10. Send traffic to port XYZ on each of the VMs in the OpenGENI rack over the control network from any commodity Internet host.
  11. As Experimenter2, request ListResources from BBN OpenGENI.
  12. Define a request RSpec for two physical nodes, both using the uploaded GPO Ubuntu images. If not available, VMs and other images will be used.
  13. Create the second slice.
  14. Create a sliver in the second slice, using the RSpec defined in step 12.
  15. Log in to each of the systems, and send traffic to the other system.
  16. Verify that experimenters 1 and 2 cannot use the control plane to access each other's resources (e.g. via unauthenticated SSH, shared writable file system mount)
  17. Review system statistics and VM isolation and network isolation on data plane.
  18. Verify that each VM has a distinct MAC address for that interface.
  19. Verify that VMs' MAC addresses are learned on the data plane switch.
  20. Stop traffic and delete slivers.
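To make the request RSpec step for two VMs more concrete, the sketch below generates a minimal GENI RSpec v3 request with two nodes joined by one link. The `sliver_type` and `disk_image` names are hypothetical placeholders, since the real values come from the advertisement RSpec returned by ListResources. The request for a publicly routable IP address and port mapping is not modeled, because this plan does not state how OpenGENI expresses it.

```python
import xml.etree.ElementTree as ET

# Namespace of the GENI RSpec version 3 schema.
RSPEC_NS = "http://www.geni.net/resources/rspec/3"


def tag(name):
    """Qualify an element name with the RSpec namespace."""
    return f"{{{RSPEC_NS}}}{name}"


def two_vm_request(sliver_type="example-vm-type", image="example-gpo-ubuntu"):
    """Build a request RSpec for two VMs sharing one Layer 2 link.

    `sliver_type` and `image` are placeholders (assumptions), not names
    taken from an OpenGENI advertisement.
    """
    ET.register_namespace("", RSPEC_NS)
    rspec = ET.Element(tag("rspec"), {"type": "request"})
    link = ET.Element(tag("link"), {"client_id": "lan0"})
    for name in ("VM-1", "VM-2"):
        node = ET.SubElement(
            rspec, tag("node"), {"client_id": name, "exclusive": "false"}
        )
        sliver = ET.SubElement(node, tag("sliver_type"), {"name": sliver_type})
        ET.SubElement(sliver, tag("disk_image"), {"name": image})
        ET.SubElement(node, tag("interface"), {"client_id": f"{name}:if0"})
        # Each node interface is attached to the shared link.
        ET.SubElement(link, tag("interface_ref"), {"client_id": f"{name}:if0"})
    rspec.append(link)
    return ET.tostring(rspec, encoding="unicode")
```

The resulting string can be written to a file and passed to the sliver creation step once the placeholder names are replaced with advertised ones.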

OG-EXP-3: OpenGENI Single Site 100 VM Test

This one site test runs on the BBN OpenGENI rack and includes various scenarios to validate compute resource requirements for VMs. The goal of this test is not to validate the OpenGENI limits, but simply to verify that the OpenGENI rack can provide 100 VMs with its experiment nodes under various scenarios, including:

  • Scenario 5: 100 Slices with 1 VM each
  • Scenario 4: 50 Slices with 2 VMs each
  • Scenario 3: 4 Slices with 25 VMs each
  • Scenario 2: 2 Slices with 50 VMs each
  • Scenario 1: 1 Slice with 100 VMs

Scenarios will be executed in the order above. It is expected that one slice cannot support 100 VMs; tests will be run to determine the maximum number of VMs allowed in one slice and in multiple slices.
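Each scenario splits the same 100 VMs across a different number of slices. The sketch below records the five scenarios as (slices, VMs per slice) pairs and generates one slice name per slice. The naming scheme and the `og-exp3` prefix are hypothetical and only serve the example.

```python
# Scenario number -> (number of slices, VMs per slice), as listed in the plan.
SCENARIOS = {
    1: (1, 100),
    2: (2, 50),
    3: (4, 25),
    4: (50, 2),
    5: (100, 1),
}


def total_vms(scenario):
    """Total VM count requested by a scenario."""
    slices, vms_per_slice = SCENARIOS[scenario]
    return slices * vms_per_slice


def slice_names(scenario, prefix="og-exp3"):
    """Generate one slice name per slice (the naming scheme is an assumption)."""
    slices, _ = SCENARIOS[scenario]
    return [f"{prefix}-s{scenario}-{i:03d}" for i in range(1, slices + 1)]
```

Every entry multiplies out to 100 VMs, which is the property that makes the scenarios comparable.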

Test Topology

This test uses this topology:

Prerequisites

This test has these prerequisites:

  • Traffic generation tools may be part of image or installed at experiment runtime.
  • Administrative accounts exist for GPO staff on the BBN OpenGENI rack.
  • GENI Experimenter1 account exists at GPO PG Clearinghouse.
  • If available, baseline Monitoring is used to ensure that any problems are quickly identified.

Procedure

Do the following:

  1. As Experimenter1, request ListResources from BBN OpenGENI.
  2. Review ListResources output, and identify available resources.
  3. Write the Scenario 1 RSpec that requests 100 VMs evenly distributed across the experiment nodes using the default image.
  4. Create a slice.
  5. Create a sliver in the slice, using the RSpec defined in step 3.
  6. Log into several of the VMs, and send traffic to several other systems.
  7. Step up traffic rates to verify VMs continue to operate with realistic traffic loads.
  8. Review system statistics and VM isolation (does not include network isolation)
  9. Verify that several VMs running on the same experiment node have a distinct MAC address for their interface.
  10. Verify for several VMs running on the same experiment node, that their MAC addresses are learned on the data plane switch.
  11. Review monitoring statistics and check for resource status for CPU, disk, memory utilization, interface counters, uptime, process counts, and active user counts.
  12. Stop traffic and delete sliver.
  13. Re-execute the procedure described in steps 1-12 with changes required for Scenario 2 (2 Slices with 50 VMs each).
  14. Re-execute the procedure described in steps 1-12 with changes required for Scenario 3 (4 Slices with 25 VMs each).
  15. Re-execute the procedure described in steps 1-12 with changes required for Scenario 4 (50 Slices with 2 VMs each).
  16. Re-execute the procedure described in steps 1-12 with changes required for Scenario 5 (100 Slices with 1 VM each).
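Distributing VMs evenly across the experiment nodes, as the Scenario 1 RSpec requires, can be worked out with integer division. This plan does not state how many experiment nodes the rack has, so the node names and count below are hypothetical.

```python
def distribute_evenly(vm_count, nodes):
    """Assign VMs to nodes so that per-node counts differ by at most one."""
    base, extra = divmod(vm_count, len(nodes))
    # The first `extra` nodes take one additional VM each.
    return {
        node: base + (1 if index < extra else 0)
        for index, node in enumerate(nodes)
    }


# Hypothetical node names; the actual experiment nodes come from ListResources.
placement = distribute_evenly(100, ["node-1", "node-2", "node-3"])
```

With three nodes the split is 34, 33, and 33. The same function covers the other scenarios by passing the per-slice VM count.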

OG-EXP-4: OpenGENI Multi-site Acceptance Test

This test normally includes two sites and two experiments. Only one rack is available, so the test case will be modified to run within one rack, but VMs will be requested on separate servers. Each of the compute resources will exchange traffic. In addition, the VMs in Experiment2 will use multiple data interfaces. Normally, all site-to-site experiments take place over a wide-area Layer 2 data plane network connection via Internet2 or NLR using VLANs allocated by the AM. That is not the case for the initial test evaluation, where all connections will be within the rack. The goal of this test is to verify basic operations of VMs and data flows between rack resources.

Test Topology

This test uses this topology:

For the initial evaluation there is only one rack, so the test case is modified to have VMs on different servers rather than different racks. Here is the actual topology run:

Evaluation Note: The test case is described as originally intended; the actual procedure will be captured as part of the test details available from the Acceptance Test Status page.

Prerequisites

This test has these prerequisites:

  • If available, BBN OpenGENI connectivity statistics will be monitored.
  • Administrative accounts have been created for GPO staff at the BBN OpenGENI rack.
  • The VLANs used will be allocated by the rack AM.
  • If available, baseline Monitoring is used to ensure that any problems are quickly identified.
  • OpenGENI manages private address allocation for the endpoints in this test.
  • The normal network aggregate requirement for the availability of the ION AM does not apply to the current evaluation.

Procedure

Do the following:

  1. As Experimenter1, request ListResources from BBN OpenGENI.
  2. Request ListResources for second OpenGENI AM (does not exist, thus skipping step).
  3. Review ListResources output from both AMs. (only one AM used in this initial evaluation).
  4. Define a request RSpec for VMs at BBN OpenGENI to be on separate VM servers.
  5. Define a request RSpec for a VM at remote OpenGENI for an unbound exclusive non-OpenFlow VLAN to connect the 2 endpoints.
  6. Create the first slice.
  7. Create a sliver at each OpenGENI aggregate using the RSpecs defined above.
  8. Log in to each of the systems, and send traffic to the other system, leave traffic running.
  9. As Experimenter2, request ListResources from BBN OpenGENI (skipping second remote OpenGENI).
  10. Define a request RSpec for one VM and one bare metal node in the BBN OpenGENI rack. Each resource should have two logical interfaces and a 3rd VLAN for the local connection.
  11. Define a request RSpec to add two VMs at Site2 and two VLANs to connect the BBN OpenGENI to the Site2 OpenGENI. (Modified for one aggregate)
  12. Create a second slice.
  13. In the second slice, create a sliver at each OpenGENI aggregate using the RSpecs defined above. (Modified for one aggregate)
  14. Log in to each of the end-point systems, and send traffic to the other end-point system which shares the same VLAN.
  15. Verify traffic handling per experiment, VM isolation, and MAC address assignment.
  16. Construct and send a non-IP ethernet packet over the data plane interface. (pingplus tool will be used).
  17. Review baseline monitoring statistics.
  18. Run test for at least 1 hour.
  19. Review baseline monitoring statistics.
  20. Stop traffic and delete slivers.
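The non-IP Ethernet packet step uses the pingplus tool. Independently of that tool, the sketch below shows what constructing such a frame involves: a destination MAC, a source MAC, an EtherType that is not IPv4 or IPv6, and a payload. The MAC addresses are placeholders, the choice of EtherType 0x88B5 (reserved by IEEE 802 for local experimental use) is an assumption for illustration, and this is not a description of how pingplus builds its packets.

```python
import struct

# IEEE 802 Local Experimental EtherType 1; chosen here only as an example value.
LOCAL_EXPERIMENTAL_ETHERTYPE = 0x88B5
MIN_PAYLOAD = 46  # Minimum Ethernet payload, excluding the frame check sequence.


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_frame(dst, src, payload, ethertype=LOCAL_EXPERIMENTAL_ETHERTYPE):
    """Assemble an Ethernet II frame (without FCS), padding short payloads."""
    header = mac_bytes(dst) + mac_bytes(src) + struct.pack("!H", ethertype)
    return header + payload.ljust(MIN_PAYLOAD, b"\x00")


# Placeholder addresses: broadcast destination, locally administered source.
frame = build_frame("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", b"og-exp-4")
```

Actually transmitting the frame requires a raw socket on the data plane interface and root privileges, which this sketch leaves out.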

OG-EXP-5: OpenGENI Network Resources Acceptance Test

A three site experiment where the only OpenGENI resources used are OpenFlow network resources. All compute resources are outside the OpenGENI rack. The experiment will use the OpenGENI Aggregate Manager to request the rack data plane resources. The OpenGENI AM configures the OpenGENI site OpenFlow switch. The goal of this test is to verify OpenFlow operations and integration with meso-scale compute resources and other compute resources external to the OpenGENI rack.

Test Topology

Note: The NLR and Internet2 OpenFlow VLANs are the GENI Network Core static VLANs.

For the initial evaluation there is only one rack, so the test case is modified to have VMs on different servers rather than different racks. Here is the actual topology run:

Prerequisites

  • A GPO site network is connected to the OpenGENI OpenFlow switch.
  • OpenGENI VMOC is running and can manage the OpenGENI OpenFlow switch
  • An OpenFlow controller is run by the experimenter and is accessible via DNS hostname (or IP address) and TCP port.
  • Two meso-scale remote sites make compute resources and OpenFlow meso-scale resources available for this test.
  • GMOC data collection for the meso-scale and OpenGENI rack resources is functioning for the OpenFlow and traffic measurements required in this test.

Evaluation Note: GMOC data collection is not available for the initial evaluation. Remote meso-scale sites are not possible for the initial evaluation and will be replaced by local rack nodes.

Procedure

The following operations are to be executed:

  1. As Experimenter1, determine BBN compute resources and define RSpec.
  2. Determine remote meso-scale compute resources and define RSpec. (Modified for one aggregate and no meso-scale)
  3. Define a request RSpec for OpenFlow network resources at the BBN OpenGENI AM.
  4. Define a request RSpec for OpenFlow network resources at the remote I2 Meso-scale site. (Rack nodes will replace remote meso-scale.)
  5. Define a request RSpec for the OpenFlow Core resources
  6. Create the first slice
  7. Create a sliver for the BBN compute resources.
  8. Create a sliver at the I2 meso-scale site using VMOC at site. (Modified for one aggregate and no meso-scale)
  9. Create a sliver at the BBN OpenGENI AM.
  10. Create a sliver for the OpenFlow resources in the core. (Modified for one aggregate and no meso-scale)
  11. Create a sliver for the meso-scale compute resources. (Modified for one aggregate and no meso-scale)
  12. Log in to each of the compute resources and send traffic to the other end-point.
  13. Verify that traffic is delivered to target.
  14. Review baseline, GMOC, and meso-scale monitoring statistics. (Not possible in current version.)
  15. As Experimenter2, determine BBN compute resources and define RSpec.
  16. Determine remote meso-scale compute resources and define RSpec.
  17. Define a request RSpec for OpenFlow network resources at the BBN OpenGENI AM.
  18. Define a request RSpec for OpenFlow network resources at the remote NLR Meso-scale site. (Rack nodes will replace remote meso-scale.)
  19. Define a request RSpec for the OpenFlow Core resources (No core resources will be used in initial evaluation)
  20. Create the second slice
  21. Create a sliver for the BBN compute resources.
  22. Create a sliver at the meso-scale site using FOAM at site.
  23. Create a sliver at the BBN OpenGENI AM.
  24. Create a sliver for the OpenFlow resources in the core.
  25. Create a sliver for the meso-scale compute resources.
  26. Log in to each of the compute resources and send traffic to the other endpoint.
  27. As Experimenter2, insert flowmods and send packet-outs only for traffic assigned to the slivers.
  28. Verify that traffic is delivered to target according to the flowmods settings.
  29. Review baseline, GMOC, and monitoring statistics. (Not possible in current version.)
  30. Stop traffic and delete slivers.

OG-EXP-6: OpenGENI and Meso-scale Multi-site OpenFlow Acceptance Test

This test case normally includes three sites and three experiments, using resources in the BBN and Site2 OpenGENI racks as well as meso-scale resources, where the network resources are the core OpenFlow-controlled VLANs. Each of the compute resources will exchange traffic with the others in its slice, over a wide-area Layer 2 data plane network connection, using Internet2 and NLR VLANs. In particular, the following slices will be set up for this test:

  • Slice 1: One OpenGENI VM at each of BBN and Site2.
  • Slice 2: Two OpenGENI VMs at Site2 and one VM and one bare metal node at BBN.
  • Slice 3: An OpenGENI VM at BBN, a PG node at BBN, and a meso-scale Wide-Area ProtoGENI (WAPG) node.

The above topology will be requested within one rack.

Test Topology

This test uses this topology:

Note: The two Site2 VMs in Experiment2 must be on the same experiment node. This is not the case for other experiments.

For the initial evaluation there is only one rack, so the test case is modified to have VMs on different servers rather than different racks. Here is the actual topology run:

Evaluation Note: The test case is described as originally intended; the actual procedure will be captured as part of the test details available from the Acceptance Test Status page.

Prerequisites

This test has these prerequisites:

  • Meso-scale sites are available for testing
  • BBN OpenGENI connectivity statistics are monitored at the GPO OpenGENI Monitoring site.
  • GENI Experimenter1, Experimenter2 and Experimenter3 accounts exist.
  • This test will be scheduled at a time when site contacts are available to address any problems.
  • Both OpenGENI aggregates can link to static VLANs. (Modified for one aggregate)
  • Site's OpenFlow VLAN is implemented and is known for this test. (Use VMOC allocated OF VLANs)
  • If available, baseline Monitoring is in place at each site, to ensure that any problems are quickly identified.
  • GMOC data collection for the meso-scale and OpenGENI rack resources is functioning for the OpenFlow and traffic measurements required in this test.
  • An OpenFlow controller is run by the experimenter and is accessible via DNS hostname (or IP address) and TCP port.
  • A PG OpenFlow site is also added to the setup described in the diagram.

Evaluation Note: There is no GMOC data collection or PG site for the initial OpenGENI evaluation.

Procedure

Do the following:

  1. As Experimenter1, request ListResources from BBN OpenGENI, Site2 OpenGENI, and from VMOC at I2 and NLR Site.
  2. Review ListResources output from all AMs.
  3. Define a request RSpec for a VM at the BBN OpenGENI.
  4. Define a request RSpec for a VM at the Site2 OpenGENI. (only one site used)
  5. Define request RSpecs for OpenFlow resources from BBN FOAM to access GENI OpenFlow core resources. (only one site used)
  6. Define request RSpecs for OpenFlow core resources at I2 FOAM (only one site used)
  7. Define request RSpecs for OpenFlow core resources at NLR FOAM. (only one site used)
  8. Create the first slice.
  9. Create a sliver in the first slice at each AM, using the RSpecs defined above.
  10. Log in to each of the systems, verify IP address assignment. Send traffic to the other system, leave traffic running.
  11. As Experimenter2, define a request RSpec for one VM and one physical node at BBN OpenGENI.
  12. Define a request RSpec for two VMs on the same experiment node at Site2 OpenGENI. (only one site used)
  13. Define request RSpecs for OpenFlow resources from BBN FOAM to access GENI OpenFlow core resources. (only one site used)
  14. Define request RSpecs for OpenFlow core resources at I2 FOAM. (only one site used)
  15. Define request RSpecs for OpenFlow core resources at NLR FOAM. (only one site used)
  16. Create a second slice.
  17. Create a sliver in the second slice at each AM, using the RSpecs defined above.
  18. Log in to each of the systems in the slice, and send traffic to each of the other systems; leave traffic running.
  19. As Experimenter3, request ListResources from BBN OpenGENI, BBN meso-scale FOAM, and FOAM at Meso-scale Site (Internet2 Site BBN and NLR site). (only one site used)
  20. Review ListResources output from all AMs.
  21. Define a request RSpec for a VM at the BBN OpenGENI.
  22. Define a request RSpec for a compute resource at the BBN meso-scale site. (only one site used)
  23. Define a request RSpec for a compute resource at a meso-scale site. (only one site used)
  24. Define request RSpecs for OpenFlow resources to allow connection from OpenFlow BBN OpenGENI to Meso-scale OpenFlow sites (BBN and second site TBD) (I2 and NLR). (only one site used)
  25. If PG access to OpenFlow is available, define a request RSpec for the PG OpenFlow resource. (only one site used)
  26. Create a third slice.
  27. Create slivers that connect the Internet2 Meso-scale OpenFlow site to the BBN OpenGENI Site, and the BBN Meso-scale site; and if available, to PG node.
  28. Log in to each of the compute resources in the slice, configure data plane network interfaces on any non-OpenGENI resources as necessary, and send traffic to each of the other systems; leave traffic running.
  29. Verify that all three experiments continue to run without impacting each other's traffic, and that data is exchanged over the path along which data is supposed to flow.
  30. Review baseline monitoring statistics and checks.
  31. As site administrator, identify all controllers that the BBN OpenGENI OpenFlow switch is connected to.
  32. As Experimenter3, verify that traffic only flows on the network resources assigned to slivers as specified by the controller.
  33. Verify that no default controller, switch fail-open behavior, or other resource other than experimenters' controllers, can control how traffic flows on network resources assigned to experimenters' slivers.
  34. Set the hard and soft timeout of flowtable entries
  35. Get switch statistics and flowtable entries for slivers from the OpenFlow switch.
  36. Get layer 2 topology information about slivers in each slice.
  37. Install flows that match only on layer 2 fields, and confirm whether the matching is done in hardware.
  38. If supported, install flows that match only on layer 3 fields, and confirm whether the matching is done in hardware.
  39. Run test for at least 4 hours.
  40. Review monitoring statistics and checks as above.
  41. Delete slivers.
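
The two timeouts set in step 34 can be illustrated with a small model. In the OpenFlow specification, the idle ("soft") timeout removes a flow entry after a period with no matching packets, the hard timeout removes it a fixed time after installation regardless of traffic, and a value of 0 disables the corresponding timeout. The sketch below computes when an entry would expire under those rules. The function and parameter names are hypothetical and do not belong to any controller API.

```python
from typing import Optional


def flow_expiry(installed_at: float, last_match_at: float,
                idle_timeout: int, hard_timeout: int) -> Optional[float]:
    """Return the time at which a flow entry expires, or None if it never does.

    idle_timeout (soft): seconds without a matching packet before removal.
    hard_timeout: seconds after installation before removal, regardless of traffic.
    A value of 0 disables the corresponding timeout.
    """
    candidates = []
    if idle_timeout > 0:
        candidates.append(last_match_at + idle_timeout)
    if hard_timeout > 0:
        candidates.append(installed_at + hard_timeout)
    # The entry is removed as soon as either enabled timeout fires.
    return min(candidates) if candidates else None
```

For example, an entry installed at t=0 with a 10 second idle timeout and a 300 second hard timeout that last matched a packet at t=50 would be expected to disappear from the flowtable at t=60.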

Documentation:

  1. Verify access to documentation about which OpenFlow actions can be performed in hardware.

OG-EXP-7: Click Router Experiment Acceptance Test

This test case uses a Click modular router experiment with OpenGENI VM nodes. The scenario uses 2 VMs as hosts and 4 VMs as Click Routers and is based on the following Click example experiment, although unlike the example, this test case uses VMs and it runs the Click router module in user space.

Test Topology

This test uses this topology:

Note: Two VMs will be requested on the same physical worker node at each rack site for the user-level Click Router.

For the initial evaluation there is only one rack, so the test case is modified to have VMs on different servers rather than different racks. Here is the actual topology run:

Evaluation Note: The test case is described in its original form; the actual procedure will be captured as part of the test details available from the Acceptance Test Status page. The test case will be run within one rack.

Prerequisites

This test has these prerequisites:

  • TBD

Procedure

Do the following:

  1. As Experimenter1, request ListResources from BBN OpenGENI
  2. Review ListResources
  3. Define a request RSpec for six VMs at BBN OpenGENI
  4. Create slice
  5. Create a sliver
  6. Install Click router
  7. Determine Click router settings
  8. Run the user-level Click router
  9. Log in to Host1 and send traffic to host2
  10. Delete slivers after reviewing Click logs on each Click router
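
As an illustration of the "Define a request RSpec for six VMs" step, the following sketch generates a minimal GENI v3 request RSpec with one non-exclusive node per VM (two hosts and four Click routers). It is a starting point under stated assumptions, not the RSpec used in the test: the client IDs and the `example-vm` sliver type are placeholders, the sliver type should be replaced with one advertised in the ListResources output, and the links between nodes are omitted.

```python
import xml.etree.ElementTree as ET

# Namespace of the GENI v3 RSpec schema.
RSPEC_NS = "http://www.geni.net/resources/rspec/3"


def build_request_rspec(node_names, sliver_type="example-vm"):
    """Build a request RSpec with one shared (non-exclusive) VM per name.

    sliver_type is a placeholder; use a type advertised by ListResources.
    """
    ET.register_namespace("", RSPEC_NS)
    rspec = ET.Element(f"{{{RSPEC_NS}}}rspec", {"type": "request"})
    for name in node_names:
        node = ET.SubElement(rspec, f"{{{RSPEC_NS}}}node",
                             {"client_id": name, "exclusive": "false"})
        ET.SubElement(node, f"{{{RSPEC_NS}}}sliver_type", {"name": sliver_type})
    return ET.tostring(rspec, encoding="unicode")


# Two hosts and four Click routers, as in the test description.
NODE_NAMES = ["host1", "host2", "router1", "router2", "router3", "router4"]
rspec_xml = build_request_rspec(NODE_NAMES)
```

The resulting string can be written to a file and passed to the tool used to create the sliver.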

Additional Administration Acceptance Tests

These tests will be performed as needed after the administration baseline tests complete successfully. For example, the Software Update Test will be performed at least once when the rack team provides new software for testing. We expect these tests to be interspersed with other tests in this plan at times that are agreeable to the GPO and the participants, not just run in a block at the end of testing. The goal of these tests is to verify that sites have adequate documentation, procedures, and tools to satisfy all GENI site requirements.

OG-ADM-3: Full Rack Reboot Test

In this test, a full rack reboot is performed as a drill of a procedure which a site administrator may need to perform for site maintenance.

Note: this test must be run using the BBN rack because it requires physical access.

Evaluation note: Can this be executed for the BBN OpenGENI rack?

Procedure

  1. Review relevant rack documentation about shutdown options and make a plan for the order in which to shut down each component.
  2. Cleanly shut down and/or hard-power-off all devices in the rack, and verify that everything in the rack is powered down.
  3. Power on all devices, bring all logical components back online, and use monitoring and comprehensive health tests to verify that the rack is healthy again.

OG-ADM-4: Emergency Stop Test

In this test, an Emergency Stop drill is performed on a sliver in the rack.

Prerequisites

  • GMOC's updated Emergency Stop procedure is approved and published on a public wiki.
  • OpenGENI's procedure for performing a shutdown operation on any type of sliver in an OpenGENI rack is published on a public wiki or on a protected wiki that all OpenGENI site administrators (including GPO) can access.
  • An Emergency Stop test is scheduled at a convenient time for all participants and documented in GMOC ticket(s).
  • A test experiment is running that involves a slice with connections to at least one OpenGENI rack compute resource.

Evaluation note: Emergency stop is not expected to be supported for the initial evaluation.

Procedure

  • A site administrator reviews the Emergency Stop and sliver shutdown procedures, and verifies that these two documents combined fully document the campus side of the Emergency Stop procedure.
  • A second administrator (or the GPO) submits an Emergency Stop request to GMOC, referencing activity from a public IP address assigned to a compute sliver in the rack that is part of the test experiment.
  • GMOC and the first site administrator perform an Emergency Stop drill in which the site administrator successfully shuts down the sliver in coordination with GMOC.
  • GMOC completes the Emergency Stop workflow, including updating/closing GMOC tickets.

OG-ADM-5: Software Update Test

In this test, we update software on the rack as a test of the software update procedure.

Prerequisites

Minor updates of system packages for all infrastructure OSes, OpenGENI local AM software, and VMOC are available to be installed on the rack. This test may need to be scheduled to take advantage of a time when these updates are available.

Procedure

  • A BBN site administrator reviews the procedure for performing software updates of GENI and non-GENI software on the rack. If there is a procedure for updating any version tracking documentation (e.g. a wiki page) or checking any version tracking tools, the administrator reviews that as well.
  • Following that procedure, the administrator performs minor software updates on rack components, including as many as possible of the following (depending on availability of updates):
    • At least one update of a standard (non-GENI) package on each of the control and compute nodes. (GPO will look for a package which has a security vulnerability listed in the portaudit database.)
    • At least one update of a standard (non-GENI) system package on the VMOC VM.
    • At least one update of a standard (non-GENI) system package on the VM server host OS.
    • An update of OpenGENI local AM software on control node.
    • An update of VMOC software.
  • The admin confirms that the software updates completed successfully.
  • The admin updates any appropriate version tracking documentation or runs appropriate tool checks indicated by the version tracking procedure.

OG-ADM-6: Control Network Disconnection Test

In this test, we disconnect parts of the rack control network or its dependencies to test partial rack functionality in an outage situation.

Note: this test must be performed on the BBN rack because GPO will modify configuration on the control plane router and switch upstream from the rack in order to perform the test.

Procedure

  • Simulate an outage of ???? by inserting a firewall rule on the BBN router blocking the rack from reaching it. Verify that an administrator can still access the rack, that rack monitoring to GMOC continues through the outage, and that some experimenter operations still succeed.
  • Simulate an outage of each of the rack server host and control plane switch by disabling their respective interfaces on BBN's control network switch. Verify that GPO, OpenGENI, and GMOC monitoring all see the outage.

Evaluation Note: The simulated outage does not apply to the initial evaluation; there will be no monitoring by GMOC. Also, there is no OpenGENI SNMP polling.

OG-ADM-7: Documentation Review Test

Although this is not a single test per se, this section lists required documents that the rack teams will write. Draft documents should be delivered prior to testing of the functional areas to which they apply. Final documents must be delivered to be made available for non-developer sites. Final documents will be public, unless there is some specific reason a particular document cannot be public (e.g. a security concern from a GENI rack site).

Procedure

Review each required document listed below, and verify that:

  • The document has been provided in a public location (e.g. the GENI wiki, or any other public website)
  • The document contains the required information.
  • The documented information appears to be accurate.

Note: this tests only the documentation, not the rack behavior which is documented. Rack behavior related to any or all of these documents may be tested elsewhere in this plan.

Documents to review:

  • Pre-installation document that lists specific minimum requirements for all site-provided services for potential rack sites (e.g. space, number and type of power plugs, number and type of power circuits, cooling load, public addresses, NLR or Internet2 layer2 connections, etc.). This document should also list all standard expected rack interfaces (e.g. 10GBE links to at least one research network).
  • Summary GENI rack parts list, including vendor part numbers for "standard" equipment intended for all sites (e.g. a VM server) and per-site equipment options (e.g. transceivers, PDUs etc.), if any. This document should also indicate approximately how much headroom, if any, remains in the standard rack PDUs' power budget to support other equipment that sites may add to the rack.
  • Procedure for identifying the software versions and system file configurations running on a rack, and how to get information about recent changes to the rack software and configuration.
  • Explanation of how and when software and OS updates can be performed on a rack, including plans for notification and update if important security vulnerabilities in rack software are discovered.
  • Description of the GENI software running on a standard rack, and explanation of how to get access to the source code of each piece of standard GENI software.
  • Description of all the GENI experimental resources within the rack, and what policy options exist for each, including: how to configure rack nodes as bare metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, and whether it is possible to trust local users within the rack.
  • Description of the expected state of all the GENI experimental resources in the rack, including how to determine the state of an experimental resource and what state is expected for an unallocated bare metal node.
  • Procedure for creating new site administrator and operator accounts.
  • Procedure for changing IP addresses for all rack components.
  • Procedure for cleanly shutting down an entire rack in case of a scheduled site outage.
  • Procedure for performing a shutdown operation on any type of sliver on a rack, in support of an Emergency Stop request.
  • Procedure for performing comprehensive health checks for a rack (or, if those health checks are being run automatically, how to view the current/recent results).
  • Technical plan for handing off primary rack operations to site operators at all sites.
  • Per-site documentation. This documentation should be prepared before sites are installed and kept updated after installation to reflect any changes or upgrades after delivery. Text, network diagrams, wiring diagrams and labeled photos are all acceptable for site documents. Per-site documentation should include the following items for each site:
    1. Part numbers and quantities of PDUs, with NEMA input power connector types, and an inventory of which equipment connects to which PDU.
    2. Physical network interfaces for each control and data plane port that connects to the site's existing network(s), including type, part numbers, maximum speed etc. (e.g. 10-GB-SR fiber).
    3. Public IP addresses allocated to the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network.
    4. Data plane network connectivity and procedures for each rack, including core backbone connectivity and documentation, switch configuration options to set for compatibility with the L2 core, and the site and rack procedures for connecting non-rack-controlled VLANs and resources to the rack data plane. A network diagram is highly recommended. (See existing OpenFlow meso-scale network diagrams on the GENI wiki for examples.)

Additional Monitoring Acceptance Tests

These tests will be performed as needed after the monitoring baseline tests complete successfully. For example, the GMOC data collection test will be performed during the OpenGENI Network Resources Acceptance test, where we already use the GMOC for meso-scale OpenFlow monitoring. We expect these tests to be interspersed with other tests in this plan at times that are agreeable to the GPO and the participants, not just run in a block at the end of testing. The goal of these tests is to verify that sites have adequate tools to view and share GENI rack data that satisfies all GENI monitoring requirements.

OG-MON-4: Infrastructure Device Performance Test

This test verifies that the rack head node performs well enough to run all the services it needs to run.

Procedure

While experiments involving OpenGENI-controlled OpenFlow slivers and compute slivers are running:

  • View OpenFlow control monitoring at GMOC and verify that no monitoring data is missing
  • View VLAN 1750 data plane monitoring, which pings the rack's interface on VLAN 1750, and verify that packets are not being dropped
  • Verify that the CPU idle percentages on the server host and on the OpenFlow Controller VMs are nonzero.
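
One way to obtain a CPU idle percentage, assuming the server host and controller VMs run Linux (an assumption, not stated in this plan), is to compare two samples of the aggregate `cpu` line in `/proc/stat`. The sketch below takes the text of two samples and returns the idle percentage over the interval between them. The function names are illustrative.

```python
def parse_cpu_line(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Fields: user, nice, system, idle, iowait, irq, softirq, steal.
            # Guest time is already included in user time, so it is not summed.
            values = [int(v) for v in fields[1:9]]
            return values[3], sum(values)
    raise ValueError("no aggregate 'cpu' line found")


def idle_percentage(before, after):
    """Percentage of CPU time spent idle between two /proc/stat samples."""
    idle_0, total_0 = parse_cpu_line(before)
    idle_1, total_1 = parse_cpu_line(after)
    elapsed = total_1 - total_0
    if elapsed <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * (idle_1 - idle_0) / elapsed
```

Reading `/proc/stat` twice a few seconds apart and passing both strings to `idle_percentage` gives a value that can be compared against zero for this check.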

Evaluation note: There will be no GMOC monitoring, but system data will be gathered for the infrastructure hosts.

OG-MON-5: GMOC Data Collection Test

This test verifies the rack's submission of monitoring data to GMOC.

Evaluation note: There will be no GMOC monitoring.

Procedure

View the dataset collected at GMOC for the BBN and Site2 OpenGENI racks. For each piece of required data, attempt to verify that:

  • The data is being collected and accepted by GMOC and can be viewed at gmoc-db.grnoc.iu.edu
  • The data's "site" tag indicates that it is being reported for the OpenGENI rack located at the gpolab or OpenGENI site2 site (as appropriate for that rack).
  • The data has been reported within the past 10 minutes.
  • For each piece of data, either verify that it is being collected at least once a minute, or verify that it requires more complicated processing than a simple file read to collect, and thus can be collected less often.
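
The two timing criteria above (reported within the past 10 minutes, collected at least once a minute) can be checked mechanically once report timestamps have been extracted for a piece of data. The following sketch assumes the timestamps are already available as `datetime` objects; how they are obtained from GMOC is outside its scope, and the 5 second tolerance is an arbitrary allowance for collection jitter.

```python
from datetime import datetime, timedelta


def is_fresh(last_report, now, max_age=timedelta(minutes=10)):
    """True if the most recent report is no older than max_age."""
    return now - last_report <= max_age


def max_gap(timestamps):
    """Largest interval between consecutive reports, or None if fewer than two."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps) if gaps else None


def collected_every_minute(timestamps, tolerance=timedelta(seconds=5)):
    """True if no two consecutive reports are more than a minute (plus tolerance) apart."""
    gap = max_gap(timestamps)
    return gap is not None and gap <= timedelta(minutes=1) + tolerance
```

Data that fails `collected_every_minute` is not necessarily a failure of the test; per the criterion above, it may simply require more complicated processing to collect.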

Verify that the following pieces of data are being reported:

  • Is each of the rack OpenGENI and VMOC AMs reachable via the GENI AM API right now?
  • Is each compute or unbound VLAN resource at each rack AM online? Is it available or in use?
  • Sliver count and percentage of rack compute and unbound VLAN resources in use.
  • Identities of current slivers on each rack AM, including creation time for each.
  • Per-sliver interface counters for compute and VLAN resources (where these values can be easily collected).
  • Is the rack data plane switch online?
  • Interface counters and VLAN memberships for each rack data plane switch interface
  • MAC address table contents for shared VLANs which appear on rack data plane switches
  • Is each rack experimental node online?
  • For each rack experimental node configured as an OpenStack VM server, overall CPU, disk, and memory utilization for the host, current VM count and total VM capacity of the host.
  • For each rack experimental node configured as an OpenStack VM server, interface counters for each data plane interface.
  • Results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack.

Verify that per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on the rack, either by providing raw sliver data containing sliver users to GMOC, or by collecting data locally and producing trending summaries on demand.
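
If raw sliver data is provided, a per-aggregate summary of distinct users can be derived with a simple aggregation. The sketch below assumes a hypothetical record format in which each sliver is a dictionary with an `aggregate` name and a list of `users`; the actual format of the data submitted to GMOC is not defined here.

```python
from collections import defaultdict


def users_per_aggregate(slivers):
    """Count distinct users per aggregate from a list of sliver records.

    Each record is assumed to look like:
        {"aggregate": "<aggregate name>", "users": ["<user>", ...]}
    """
    users = defaultdict(set)
    for sliver in slivers:
        users[sliver["aggregate"]].update(sliver["users"])
    return {aggregate: len(names) for aggregate, names in users.items()}
```

A user who owns several slivers on the same aggregate is counted once for that aggregate.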

Test Methodology and Reporting

Test Case Execution

  1. All test procedure steps will be executed until there is a blocking issue.
  2. If a blocking issue is found for a test case, testing will be stopped for that test case.
  3. Testing focus will shift to another test case while waiting for a solution to a blocking issue.
  4. If a non-blocking issue is found, testing will continue toward completion of the procedure.
  5. When a software resolution or workaround is available for a blocking issue, the test impacted by the issue is re-executed until it can be completed successfully.
  6. Supporting documentation will be used whenever available.
  7. Questions that were not answered by existing documentation are to be gathered during the acceptance testing and published, as we did for the rack design.

Issue Tracking

  1. All issues discovered in acceptance testing regardless of priority are to be tracked in a bug tracking system.
  2. The bug tracking system to be used is the Internal GRAM trac using the "test" component.
  3. All types of issues encountered (documentation error, software bug, missing features, missing documentation, etc.) are to be tracked.
  4. All unresolved issues will be reviewed and published at the end of the acceptance test as part of the acceptance test report.

Status Updates and Reporting

  1. A periodic status update will be generated, as the acceptance test plan is being executed, or as needed.
  2. A periodic (once per day) status update will be posted to the rack team mail list (gram-dev@bbn.com).
  3. Upon acceptance test completion, all findings and unresolved issues will be captured in an Acceptance Test Report.
  4. Supporting configuration and RSpecs used in testing will be part of the acceptance test report or checked into a specified repository.

Test Case Naming

The test cases in this plan follow a naming convention of the form OG-XXX-Y, where OG is OpenGENI and XXX may be any of the following: ADM for Administrative, EXP for Experimenter, or MON for Monitoring. The final component of the test case name, Y, is the test case number.
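
The naming convention can be expressed as a regular expression, which may be useful when filtering tickets or reports by test case. This is an illustrative sketch; the helper name is hypothetical.

```python
import re

# OG-<area>-<number>, where area is ADM, EXP, or MON.
TEST_CASE_NAME = re.compile(r"OG-(ADM|EXP|MON)-([1-9][0-9]*)")


def parse_test_case_name(name):
    """Split a test case name into (area, number); raise ValueError if malformed."""
    match = TEST_CASE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid test case name: {name!r}")
    return match.group(1), int(match.group(2))
```

For instance, `parse_test_case_name("OG-ADM-3")` yields the area `"ADM"` and the number `3`.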

Requirements Validation

This acceptance test plan verifies Integration (C), Monitoring (D), Experimenter (G) and Local Aggregate (F) requirements. As part of the test planning process, the GPO Infrastructure group mapped each of the GENI Racks Requirements to a set of validation criteria. For a detailed look at the validation criteria see the GENI Racks Acceptance Criteria page.

This plan does not validate any Software (B) requirements, as they are validated by the GPO Software team's GENI AM API Acceptance test suite.

Some requirements are not verified in this test plan:

  • C.2.a "Support at least 100 simultaneous active (e.g. actually passing data) layer 2 Ethernet VLAN connections to the rack. For this purpose, VLAN paths must terminate on separate rack VMs, not on the rack switch."
  • Production Aggregate Requirements (E)

Glossary

The following is a glossary of terminology used in this plan. For additional terminology definitions, see the GENI Glossary page.

  • People:
    • Experimenter: A person accessing the rack using a GENI credential and the GENI AM API.
    • Administrator: A person who has fully-privileged access to, and responsibility for, the rack infrastructure (servers, network devices, etc) at a given location.
    • Operator: A person who has unprivileged/partially-privileged access to the rack infrastructure at a given location, and has responsibility for one or a few particular functions.
  • Baseline Monitoring: Set of monitoring functions which show aggregate health for VMs and switches and their interface status, traffic counts for interfaces and VLANs. Includes resource availability and utilization.
  • Experimental compute resources:
    • VM: An experimental compute resource which is a virtual machine located on a physical machine in the rack.
    • Bare metal Node: An experimental exclusive compute resource which is a physical machine usable by experimenters without virtualization.
    • Compute Resource: Either a VM or a bare metal node.
  • Experimental compute resource components:
    • logical interface: A network interface seen by a compute resource (e.g. a distinct listing in ifconfig output). May be provided by a physical interface, or by virtualization of an interface.
  • Experimental network resources:
    • VLAN: A data plane VLAN, which may or may not be OpenFlow-controlled.
    • Bound VLAN: A VLAN which an experimenter requests by specifying the desired VLAN ID. (If the aggregate is unable to provide access to that numbered VLAN or to another VLAN which is bridged to the numbered VLAN, the experimenter's request will fail.)
    • Unbound VLAN: A VLAN which an experimenter requests without specifying a VLAN ID. (The aggregate may provide any available VLAN to the experimenter.)
    • Exclusive VLAN: A VLAN which is provided for the exclusive use of one experimenter.
    • Shared VLAN: A VLAN which is shared among multiple experimenters.

We make the following assumptions about experimental network resources:

  • Unbound VLANs are always exclusive.
  • Bound VLANs may be either exclusive or shared, and this is determined on a per-VLAN basis and configured by operators.
  • Shared VLANs are always OpenFlow-controlled, with OpenFlow providing the slicing between experimenters who have access to the VLAN.
  • If a VLAN provides an end-to-end path between multiple aggregates or organizations, it is considered "shared" if it is shared anywhere along its L2 path, even if only one experimenter can access the VLAN at some particular aggregate or organization (for whatever reason).
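
These assumptions can be written down as a small consistency check. The sketch below models each aggregate's or organization's portion of a VLAN path as a record with three flags; the class and function names are hypothetical and exist only to make the rules concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VlanSegment:
    """One aggregate's or organization's portion of a VLAN path (hypothetical model)."""
    bound: bool                # requested with a specific VLAN ID
    shared: bool               # accessible to more than one experimenter here
    openflow_controlled: bool


def violations(segment):
    """Return the assumptions from this plan that the segment contradicts."""
    problems = []
    if not segment.bound and segment.shared:
        problems.append("unbound VLANs are always exclusive")
    if segment.shared and not segment.openflow_controlled:
        problems.append("shared VLANs are always OpenFlow-controlled")
    return problems


def path_is_shared(segments):
    """A VLAN is 'shared' if it is shared anywhere along its L2 path."""
    return any(segment.shared for segment in segments)
```

A path made of one exclusive segment and one shared, OpenFlow-controlled segment is therefore reported as shared, with no violations on either segment.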

Last modified on 05/23/14 15:00:46