
Version 13 (modified by Vic Thomas, 6 years ago)


GENI Racks Acceptance Criteria

This page captures the GPO infrastructure group's mapping of each GENI Rack Requirement to a set of validation criteria used in acceptance testing to verify the implementation of rack features. It describes the GENI Racks acceptance test criteria and provides a mapping among the requirements, acceptance criteria, and test cases. Acceptance criteria are intended to be common to all rack types (ExoGENI, InstaGENI); any exceptions are documented in the project's Acceptance Test Plan. Each criterion on this page is linked to its related requirement on GeniRacks. This page is a snapshot of the GPO's work provided for reference, and is not expected to be revised along with the test plan. Questions about current acceptance tests and criteria should go to the appropriate project mailing list for the racks under test (exogeni-design@geni.net or instageni-design@geni.net).

This page does not include criteria for software acceptance, which is a separate effort by the GPO Software team; the GENI API Acceptance test suite verifies software requirements.

Requirements Mapping

These sections provide mappings to facilitate tracing from requirements to the test cases in the Acceptance Test Plans.

ExoGENI Test Case to Acceptance Criteria Mapping

Test case mappings to acceptance criteria are specific to the rack project. As more Acceptance Test Plans are added, the acceptance criteria remain common, but the mapping to test cases may differ. Project-specific test case mappings will be captured in this section.

Administration Acceptance Test Requirements Mapping

EG-ADM-1: Rack Receipt and Inventory Test

  • VI.02. A public document contains a parts list for each rack. (F.1)
  • VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
  • VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
  • VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
  • VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
  • VI.07. A public document explains the obligations that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to invoke the Legal, Law Enforcement and Regulatory (LLR) Plan, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
  • VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
  • VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
  • VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)

EG-ADM-2: Rack Administrator Access Test

  • V.01. For all rack infrastructure Unix hosts, including rack servers (both physical and VM) and experimental VM servers, site administrators should be able to login at a console (physical or virtual). (C.3.a)
  • V.02. Site administrators can login to all rack infrastructure Unix hosts using public-key SSH. (C.3.a, C.3.b)
  • V.03. Site administrators cannot login to any rack infrastructure Unix hosts using password-based SSH, nor via any unencrypted login protocol. (C.3.a)
  • V.04. Site administrators can run any command with root privileges on all rack infrastructure Unix hosts. (C.3.a)
  • V.05. Site administrators can login to all network-accessible rack infrastructure devices (network devices, remote KVMs, remote PDUs, etc.) via serial console and via SSH. (C.3.a, C.3.b)
  • V.06. Site administrators cannot login to any network-accessible rack device via an unencrypted login protocol. (C.3.a)
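Criteria V.02 and V.03 can be spot-checked from outside the rack. One well-known probe is to attempt SSH with no authentication method offered and parse the methods the server advertises in its "Permission denied" reply; a hardened host should list publickey but not password. A minimal sketch (hostname hypothetical; assumes an OpenSSH client and server):

```python
import re
import subprocess

def allowed_auth_methods(stderr_text):
    """Parse OpenSSH's 'Permission denied (method,method,...)' line."""
    m = re.search(r"Permission denied \(([^)]*)\)", stderr_text)
    return set(m.group(1).split(",")) if m else set()

def probe_auth_methods(host):
    """Ask the server which auth methods it offers by trying 'none' auth."""
    r = subprocess.run(
        ["ssh", "-o", "BatchMode=yes",
         "-o", "PreferredAuthentications=none",
         "-o", "ConnectTimeout=5", host, "true"],
        capture_output=True, text=True)
    return allowed_auth_methods(r.stderr)

# Criterion V.03 holds if 'password' (and other interactive methods) are
# absent from the server's offer, e.g.:
# assert "password" not in probe_auth_methods("rack-server.example.net")
```

This only checks the SSH daemon's advertised policy; criterion V.02 (pubkey login actually works) still needs a real key-based login attempt.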

Documentation Review Test

  • VI.01. The rack development team has documented a technical plan for handing off primary rack operations to site operators. (E)
  • VI.02. A public document contains a parts list for each rack. (F.1)
  • VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
  • VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
  • VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
  • VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
  • VI.09. A public document explains how to identify the software versions and system file configurations running on the rack, and how to get information about recent changes to the rack software and configuration. (F.5)
  • VI.10. A public document explains how and when software and OS updates can be performed on the rack, including plans for notification and update if important security vulnerabilities in rack software are discovered. (F.5)
  • VI.11. A public document describes the GENI software running on the rack, and explains how to get access to the source code of each piece of GENI software. (F.6)
  • VI.12. A public document describes all the GENI experimental resources within the rack, and explains what policy options exist for each, including: how to configure rack nodes as bare metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, whether it is possible to trust local users within the rack. (F.7)
  • VI.13. A public document describes the expected state of all the GENI experimental resources in the rack, including how to determine the state of an experimental resource and what state is expected for an unallocated bare metal node. (F.5)
  • VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
  • VI.15. A procedure is documented for changing IP addresses for all rack components. (C.3.e)
  • VI.16. A procedure is documented for cleanly shutting down the entire rack in case of a scheduled site outage. (C.3.c)
  • VI.17. A procedure is documented for performing a shutdown operation on any type of sliver on the rack, in support of an Emergency Stop request. (C.3.d)
  • VII.16. A public document explains how to perform comprehensive health checks for a rack (or, if those health checks are being run automatically, how to view the current/recent results). (F.8)

EG-ADM-3: Full Rack Reboot Test

  • IV.01. All experimental hosts are configured to boot (rather than stay off pending manual intervention) when they are cleanly shut down and then remotely power-cycled. (C.3.c)
  • V.10. Site administrators can authenticate remotely and power on, power off, or power-cycle, all physical rack devices, including experimental hosts, servers, and network devices. (C.3.c)
  • V.11. Site administrators can authenticate remotely and virtually power on, power off, or power-cycle all virtual rack resources, including server and experimental VMs. (C.3.c)
  • VI.16. A procedure is documented for cleanly shutting down the entire rack in case of a scheduled site outage. (C.3.c)
  • VII.16. A public document explains how to perform comprehensive health checks for a rack (or, if those health checks are being run automatically, how to view the current/recent results). (F.8)
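Criterion V.10 (remote power control of physical devices) is commonly exercised via IPMI-over-LAN using the stock `ipmitool` CLI. A minimal sketch, assuming each node's BMC is reachable on the control network; the hostnames and credentials shown are hypothetical:

```python
import subprocess

def ipmi_cmd(bmc_host, user, password, action):
    """Build an ipmitool chassis-power command for one node's BMC."""
    if action not in ("on", "off", "cycle", "status"):
        raise ValueError(action)
    return ["ipmitool", "-I", "lanplus",      # IPMI v2.0 over LAN
            "-H", bmc_host, "-U", user, "-P", password,
            "chassis", "power", action]

def power_cycle(bmc_host, user, password):
    """Criterion V.10: remotely power-cycle a physical rack device."""
    r = subprocess.run(ipmi_cmd(bmc_host, user, password, "cycle"),
                       capture_output=True, text=True)
    return r.returncode == 0

# e.g. for bmc in ("node1-bmc.example.net", "node2-bmc.example.net"):
#          power_cycle(bmc, "admin", "secret")
```

After the cycle, criterion IV.01 can be verified by confirming each host boots without manual intervention.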

EG-ADM-4: Emergency Stop Test

  • VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with Legal, Law Enforcement and Regulatory(LLR) Plan, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
  • VI.17. A procedure is documented for performing a shutdown operation on any type of sliver on the rack, in support of an Emergency Stop request. (C.3.d)
  • VII.18. Given a public IP address and port, an exclusive VLAN, a sliver name, or a piece of user-identifying information such as e-mail address or username, a site administrator or GMOC operator can identify the email address, username, and affiliation of the experimenter who controlled that resource at a particular time. (D.7)
  • VII.19. GMOC and a site administrator can perform a successful Emergency Stop drill in which slivers containing compute and OpenFlow-controlled network resources are shut down. (C.3.d)

EG-ADM-5: Software Update Test

  • VII.07. A site administrator can perform software and OS updates on the rack. (F.5)

EG-ADM-6: Control Network Disconnection Test

  • V.09. When the rack control network is partially down or the rack vendor's home site is inaccessible from the rack, it is still possible to access the primary control network device and server for recovery. All devices/networks which must be operational in order for the control network switch and primary server to be reachable, are documented. (C.3.b)
  • VII.14. A site administrator can locate information about the network reachability of all rack infrastructure which should live on the control network, and can get alerts when any rack infrastructure control IP becomes unavailable from the rack server host, or when the rack server host cannot reach the commodity internet. (D.6.c)
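The reachability monitoring in criterion VII.14 can be approximated by probing every control-network IP from the rack server host and alerting on failures. A minimal sketch, assuming a Linux `ping` (the `-W` timeout flag) and a hypothetical inventory:

```python
import subprocess

CONTROL_HOSTS = {               # hypothetical inventory for one rack
    "head-node":   "10.1.1.1",
    "mgmt-switch": "10.1.1.2",
    "remote-pdu":  "10.1.1.3",
}

def ping_once(ip, timeout_s=2):
    """One ICMP echo via the system ping; True if the host answered."""
    r = subprocess.run(["ping", "-c", "1", "-W", str(timeout_s), ip],
                       capture_output=True)
    return r.returncode == 0

def unreachable(results):
    """Names of hosts that failed the probe (the set to alert on)."""
    return sorted(name for name, ok in results.items() if not ok)

# results = {name: ping_once(ip) for name, ip in CONTROL_HOSTS.items()}
# if unreachable(results): raise the alert required by criterion VII.14
```

A production check would also ping a commodity-internet landmark to cover the "rack server cannot reach the internet" half of the criterion.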

Monitoring Acceptance Test Requirements Mapping

EG-MON-1: Control Network Software and VLAN Inspection Test

  • VI.09. A public document explains how to identify the software versions and system file configurations running on the rack, and how to get information about recent changes to the rack software and configuration. (F.5)
  • VI.11. A public document describes the GENI software running on the rack, and explains how to get access to the source code of each piece of GENI software. (F.6)
  • VII.03. Site administrators can understand the expected control and dataplane network behavior of their rack. (F.2)
  • VII.04. Site administrators can view and investigate current system and network activity on their rack. (F.2)
  • VII.06. A site administrator can verify the control software and configurations on the rack at some point in time. (F.5)
  • VII.08. A site administrator can get access to source code for the version of each piece of GENI code installed on their site rack at some point in time. (F.6)
  • VII.09. A site administrator can determine the MAC addresses of all physical host interfaces, all network device interfaces, all active experimental VMs, and all recently-terminated experimental VMs. (C.3.f)
  • VII.10. A site administrator can locate current and recent CPU and memory utilization for each rack network device, and can find recent changes or errors in a log. (D.6.a)
  • VII.12. For each infrastructure and experimental host, a site administrator can locate current and recent uptime, CPU, disk, and memory utilization, interface traffic counters, process counts, and active user counts. (D.6.b)
  • VII.13. A site administrator can locate recent syslogs for all infrastructure and experimental hosts. (D.6.b)
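For criterion VII.09, the MAC addresses of host and VM interfaces on a Linux rack can be inventoried by parsing `ip -o link` output. A minimal sketch of the parsing step (the command and output format are standard iproute2):

```python
import re
import subprocess

LINK_RE = re.compile(r"^\d+:\s+(?P<ifname>[^:@]+)\S*:.*link/ether\s+"
                     r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})")

def parse_ip_link(output):
    """Map interface name -> MAC from `ip -o link` output (criterion VII.09)."""
    macs = {}
    for line in output.splitlines():
        m = LINK_RE.match(line)
        if m:
            macs[m.group("ifname")] = m.group("mac")
    return macs

def local_macs():
    """Run `ip -o link` on this host and return its MAC inventory."""
    out = subprocess.run(["ip", "-o", "link"],
                         capture_output=True, text=True).stdout
    return parse_ip_link(out)
```

Recently-terminated experimental VMs would additionally require consulting the aggregate's logs, since their interfaces no longer appear in `ip link`.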

EG-MON-2: GENI Software Configuration Inspection Test

  • VI.12. A public document describes all the GENI experimental resources within the rack, and explains what policy options exist for each, including: how to configure rack nodes as bare metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, whether it is possible to trust local users within the rack. (F.7)
  • VI.13. A public document describes the expected state of all the GENI experimental resources in the rack, including how to determine the state of an experimental resource and what state is expected for an unallocated bare metal node. (F.5)
  • VII.11. A site administrator can locate current configuration of flowvisor, FOAM, and any other OpenFlow services, and find logs of recent activity and changes. (D.6.a)

EG-MON-3: GENI Active Experiment Inspection Test

  • VII.09. A site administrator can determine the MAC addresses of all physical host interfaces, all network device interfaces, all active experimental VMs, and all recently-terminated experimental VMs. (C.3.f)
  • VII.11. A site administrator can locate current configuration of flowvisor, FOAM, and any other OpenFlow services, and find logs of recent activity and changes. (D.6.a)
  • VII.18. Given a public IP address and port, an exclusive VLAN, a sliver name, or a piece of user-identifying information such as e-mail address or username, a site administrator or GMOC operator can identify the email address, username, and affiliation of the experimenter who controlled that resource at a particular time. (D.7)

EG-MON-4: Infrastructure Device Performance Test

  • IV.04. When all aggregates and services are running on the primary rack server, the host's performance is good enough that OpenFlow monitoring does not lose data (due to an overloaded FOAM or FlowVisor) and does not report visible dataplane problems (due to an overloaded FlowVisor). (open questions ExoGENI B-6, InstaGENI B.4)
  • IV.02. Mesoscale reachability testing can report on the recent liveness of the rack's bound VLANs by pinging a per-rack IP in each mesoscale monitoring subnet. (D.8)

EG-MON-5: GMOC Data Collection Test

  • VIII.01. Operational monitoring data for the rack is available at gmoc-db.grnoc.iu.edu. (D.2)
  • VIII.02. The rack data's "site" tag in the GMOC database indicates the physical location (e.g. host campus) of the rack. (D.2)
  • VIII.03. Whenever the rack is operational, GMOC's database contains site monitoring data which is at most 10 minutes old. (D.3)
  • VIII.04. Any site variable which can be collected by reading a counter (i.e. which does not require system or network processing beyond a file read) is collected by local rack monitoring at least once a minute. (D.3)
  • VIII.05. All hosts which submit data to gmoc-db have system clocks which agree with gmoc-db's clock to within 45 seconds. (GMOC is responsible for ensuring that gmoc-db's own clock is synchronized to an accurate time source.) (D.4)
  • VIII.06. The GMOC database contains data about whether each site AM has recently been reachable via the GENI AM API. (D.5.a)
  • VIII.07. The GMOC database contains data about the recent uptime and availability of each compute or unbound VLAN resource at each rack AM. (D.5.a)
  • VIII.08. The GMOC database contains the sliver count and percentage of resources in use at each rack AM. (D.5.a)
  • VIII.09. The GMOC database contains the creation time of each sliver on each rack AM. (D.5.a)
  • VIII.10. If possible, the GMOC database contains per-sliver interface counters for each rack AM. (D.5.a)
  • VIII.11. The GMOC database contains data about whether each rack dataplane switch has recently been online. (D.5.b)
  • VIII.12. The GMOC database contains recent traffic counters and VLAN memberships for each rack dataplane switch interface. (D.5.b)
  • VIII.13. The GMOC database contains recent MAC address table contents for shared VLANs which appear on rack dataplane switches. (D.5.b)
  • VIII.14. The GMOC database contains data about whether each experimental VM server has recently been online. (D.5.c)
  • VIII.15. The GMOC database contains overall CPU, disk, and memory utilization, and VM count and capacity, for each experimental VM server. (D.5.c)
  • VIII.16. The GMOC database contains overall interface counters for experimental VM server dataplane interfaces. (D.5.c)
  • VIII.17. The GMOC database contains recent results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack. (D.5.d)
  • VIII.18. For trending purposes, per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on a given rack. Racks may provide raw sliver/user data to GMOC, or may produce their own trending summaries on demand. (D.7)
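Criteria VIII.03 and VIII.05 give concrete numeric thresholds (data at most 10 minutes old; clocks within 45 seconds of gmoc-db), so the corresponding checks reduce to simple time arithmetic. A minimal sketch of both checks; obtaining the timestamps themselves (from gmoc-db and from each submitting host) is left out:

```python
from datetime import datetime, timedelta, timezone

MAX_DATA_AGE = timedelta(minutes=10)    # criterion VIII.03
MAX_CLOCK_SKEW = timedelta(seconds=45)  # criterion VIII.05

def data_is_fresh(last_sample_time, now=None):
    """True if the newest site monitoring sample is at most 10 minutes old."""
    now = now or datetime.now(timezone.utc)
    return now - last_sample_time <= MAX_DATA_AGE

def clocks_agree(local_time, gmoc_time):
    """True if a submitting host's clock is within 45 s of gmoc-db's clock."""
    return abs(local_time - gmoc_time) <= MAX_CLOCK_SKEW
```

Both checks only make sense if all timestamps are in one timezone (UTC here), which is why the sketch uses timezone-aware datetimes.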

Experimenter Acceptance Test Requirements Mapping

EG-EXP-1: Bare Metal Support Acceptance Test

  • III.01. A recent Linux OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.b)
  • III.02. A recent Microsoft OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.d)

EG-EXP-2: ExoGENI Single Site Acceptance Test

  • I.02. If two experimenters have compute resources on the same rack, they cannot use the control plane to access each other's resources (e.g. via unauthenticated SSH, shared writable filesystem mount). (C.1.a)
  • I.03. If two experimenters reserve exclusive VLANs on the same rack, they cannot see or modify each other's dataplane traffic on those exclusive VLANs. (C.1.a)
  • I.04. An experimenter can create and access a sliver containing a bare-metal node on the rack. (C.1.b)
  • I.21. An experimenter can reserve a VM on the rack, and can run any command with root privileges within the VM OS, including loading a kernel module in any VM OS with a modular kernel. (G.1)
  • III.04. An experimenter can view a list of OS images which can be loaded on VMs by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
  • III.05. An experimenter can view a list of OS images which can be loaded on bare-metal nodes by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
  • III.06. The procedure for an experimenter to have new VM images added to the rack has been documented, including any restrictions on what OS images can be run on VMs. (C.1.e)
  • III.07. The procedure for an experimenter to have new bare-metal images added to the rack has been documented, including any restrictions on what OS images can be run on bare-metal nodes. (C.1.e)
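Criteria III.04 and III.05 amount to extracting the advertised OS images from an advertisement RSpec. In GENI RSpec v3 advertisements, images appear as `disk_image` elements (typically under a node's `sliver_type`). A minimal sketch of the extraction, tolerant of XML namespaces; fetching the RSpec (e.g. via ListResources) is assumed to have happened already:

```python
import xml.etree.ElementTree as ET

def list_disk_images(rspec_xml):
    """Collect disk_image names from an advertisement RSpec (III.04/III.05)."""
    root = ET.fromstring(rspec_xml)
    names = set()
    for el in root.iter():
        # Tags arrive namespace-qualified, e.g. '{http://...}disk_image'.
        if el.tag.endswith("disk_image"):
            name = el.get("name")
            if name:
                names.add(name)
    return sorted(names)
```

Running this against the ad RSpecs for the VM and bare-metal aggregates yields the two image lists the criteria call for.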

EG-EXP-3: ExoGENI Single Site Limits Test

  • I.01. 100 experimental VMs can run on the rack simultaneously. (C.1.a)
  • I.18. If multiple VMs use the same physical interface to carry dataplane traffic, each VM has a distinct MAC address for that interface. (C.2.d, G.4)
  • III.03. A recent Linux OS image for VMs is provided by the rack team, and an experimenter can reserve and boot a VM using this image. (C.1.e)

EG-EXP-4: ExoGENI Multi-site Acceptance Test

  • I.05. An experimenter can create and access a sliver containing both a VM and a bare-metal node. (C.1.c)
  • I.06. An experimenter can create and access a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN. (C.1.c)
  • I.07. If an experimenter creates a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN, the nodes can communicate on that VLAN. (C.1.c)
  • I.20. An experimenter can allocate and run a slice in which a compute resource in the rack sends traffic via an unbound exclusive rack VLAN, traversing a dynamically-created topology in a core L2 network, to another rack's dataplane. (C.2.e)
  • I.22. An experimenter can create and access an experiment which configures a bare-metal compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.23. An experimenter can create and access an experiment which configures a VM compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.24. An experimenter can create and access an experiment containing a compute resource with a dataplane interface, and can construct and send a non-IP ethernet packet over the dataplane interface. (G.3)

EG-EXP-5: ExoGENI OpenFlow Network Resources Acceptance Test

  • II.04. An experimenter can create a sliver including a subset of the traffic of a bound shared VLAN, define that subset using flowspace rules, and have that subset controlled by a controller that they run. (C.2.f)
  • II.08. If an experimenter creates an OpenFlow sliver on a shared VLAN, the experimenter's controller receives OpenFlow control requests only for traffic assigned to their sliver, and can successfully insert flowmods or send packet-outs only for traffic assigned to their sliver. (C.2.f)

EG-EXP-6: ExoGENI and Meso-scale Multi-site OpenFlow Acceptance Test

  • I.10. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a bound VLAN. (C.2.b, C.2.e)
  • I.11. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a shared VLAN, and can specify what IP address should be assigned to the dataplane interface connected to that shared VLAN. (C.2.b)
  • I.12. Two experimenters can create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound shared VLAN. (C.2.b, C.2.e)
  • I.15. An experimenter can create and access a sliver containing multiple compute resources in the rack with a dataplane interface on each connected to the same bound VLAN. (C.2.b)
  • I.17. An experimenter can create and access a sliver containing at least one bare-metal node and at least two VMs in the rack, and each compute resource must be allowed to have a dataplane interface on a single bound VLAN. (C.2.c)
  • II.01. If multiple VMs use the same physical interface to carry dataplane traffic, traffic between the VM dataplane interfaces can be OpenFlow-controlled. (C.2.d) (G.4)
  • II.05. An experimenter can run a controller (for any type of OpenFlow sliver) on an arbitrary system, accessible via the public Internet or via a private network which the rack FlowVisor can access, by specifying the DNS hostname (or IP address) and TCP port, as part of their sliver request. (C.2.f)
  • II.07. If an experimenter creates a sliver containing a compute resource with a dataplane interface on a shared VLAN, only the subset of traffic on the VLAN which has been assigned to their sliver is visible to the dataplane interface of their compute resource. (C.2.f)
  • II.09. Traffic only flows on the network resources assigned to experimenters' slivers as specified by the experimenters' controllers. No default controller, switch fail-open behavior, or other resource other than experimenters' controllers, can control how traffic flows on network resources assigned to experimenters' slivers. (C.2.f)
  • II.10. An experimenter can set the hard and soft timeout of flowtable entries that their controller adds to the switch. (C.2.f)
  • II.11. An experimenter's controller can get switch statistics and flowtable entries for their sliver from the switch. (C.2.f)
  • II.12. An experimenter's controller can get layer 2 topology information about their sliver, and about other slivers in their slice. (C.2.f)
  • II.13. An experimenter can access documentation about which OpenFlow actions can be performed in hardware. (C.2.f)
  • II.14. An experimenter can install flows that match only on layer 2 fields, and confirm whether the matching is done in hardware. (C.2.f)
  • II.15. If supported by the rack, an experimenter can install flows that match only on layer 3 fields, and confirm whether the matching is done in hardware. (C.2.f)

InstaGENI Test Case to Acceptance Criteria Mapping

Test case mappings to acceptance criteria are specific to the rack project. As more Acceptance Test Plans are added, the acceptance criteria remain common, but the mapping to test cases may differ. InstaGENI-specific test case mappings will be captured in this section.

Note: The mappings for Administration and Monitoring test cases are being worked out and will be updated soon.

Administration Acceptance Test Requirements Mapping

IG-ADM-1: Rack Receipt and Inventory Test

IG-ADM-2: Rack Administrator Access Test

IG-ADM-3: Full Rack Reboot Test

IG-ADM-4: Emergency Stop Test

IG-ADM-5: Software Update Test

IG-ADM-6: Control Network Disconnection Test

IG-ADM-7: Documentation Review Test

Monitoring Acceptance Test Requirements Mapping

IG-MON-1: Control Network Software and VLAN Inspection Test

IG-MON-2: GENI Software Configuration Inspection Test

IG-MON-3: GENI Active Experiment Inspection Test

IG-MON-4: Infrastructure Device Performance Test

IG-MON-5: GMOC Data Collection Test

Experimenter Acceptance Test Requirements Mapping

IG-EXP-1: Bare Metal Support Acceptance Test

  • III.01. A recent Linux OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.b)
  • III.02. A recent Microsoft OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.d)

IG-EXP-2: InstaGENI Single Site Acceptance Test

  • I.02. If two experimenters have compute resources on the same rack, they cannot use the control plane to access each other's resources (e.g. via unauthenticated SSH, shared writable filesystem mount). (C.1.a)
  • I.03. If two experimenters reserve exclusive VLANs on the same rack, they cannot see or modify each other's dataplane traffic on those exclusive VLANs. (C.1.a)
  • I.04. An experimenter can create and access a sliver containing a bare-metal node on the rack. (C.1.b)
  • I.21. An experimenter can reserve a VM on the rack, and can run any command with root privileges within the VM OS, including loading a kernel module in any VM OS with a modular kernel. (G.1)
      <TU> I am not convinced that this will work, and have said so in my notes.
      <TU> We will ask about this in the call.
  • I.25. An experimenter can request a publicly routable IP address or public TCP/UDP port mapping for the control interface of any compute resource in their sliver (subject to availability of IPs at the site), and can access their resource at that IP address and/or port from the commodity internet, and send traffic outbound from their resource to the internet. (F.1)
  • III.04. An experimenter can view a list of OS images which can be loaded on VMs by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
      <TU> I am not sure what we expect to happen here... which base OSes can be loaded on a shared node? Is that what is listed?
      <TU> We think that an experimenter may be able to choose different userspace types (like an Ubuntu userspace, etc.), and we will ask about this on the call.
  • III.05. An experimenter can view a list of OS images which can be loaded on bare-metal nodes by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
  • III.06. The procedure for an experimenter to have new VM images added to the rack has been documented, including any restrictions on what OS images can be run on VMs. (C.1.e)
      <TU> Again, not sure what we expect here.
      <TU> Maybe we need to ask about adding new userspace types on shared nodes.
  • III.07. The procedure for an experimenter to have new bare-metal images added to the rack has been documented, including any restrictions on what OS images can be run on bare-metal nodes. (C.1.e)

IG-EXP-3: InstaGENI Single Site Limits Test

  • I.01. 100 experimental VMs can run on the rack simultaneously. (C.1.a)
  • I.18. If multiple VMs use the same physical interface to carry dataplane traffic, each VM has a distinct MAC address for that interface. (C.2.d, G.4)
  • III.03. A recent Linux OS image for VMs is provided by the rack team, and an experimenter can reserve and boot a VM using this image. (C.1.e)
      <TU> If they provide an image that runs OpenVZ, then that is good enough, right?
      <TU> This would by default provide at least some userspace type.

IG-EXP-4: InstaGENI Multi-site Acceptance Test

  • I.05. An experimenter can create and access a sliver containing both a VM and a bare-metal node. (C.1.c)
  • I.06. An experimenter can create and access a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN. (C.1.c)
  • I.07. If an experimenter creates a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN, the nodes can communicate on that VLAN. (C.1.c)
  • I.19. An experimenter can request an unbound exclusive VLAN from a local aggregate manager in the rack, and dataplane traffic can be sent on that VLAN and seen on the campus's primary dataplane switch. (C.2.e)
  • I.20. An experimenter can allocate and run a slice in which a compute resource in the rack sends traffic via an unbound exclusive rack VLAN, traversing a dynamically-created topology in a core L2 network, to another rack's dataplane. (C.2.e)
  • I.22. An experimenter can create and access an experiment which configures a bare-metal compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.23. An experimenter can create and access an experiment which configures a VM compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.24. An experimenter can create and access an experiment containing a compute resource with a dataplane interface, and can construct and send a non-IP ethernet packet over the dataplane interface. (G.3)

IG-EXP-5: InstaGENI OpenFlow Network Resources Acceptance Test

  • II.04. An experimenter can create a sliver including a subset of the traffic of a bound shared VLAN, define that subset using flowspace rules, and have that subset controlled by a controller that they run. (C.2.f)
  • II.08. If an experimenter creates an OpenFlow sliver on a shared VLAN, the experimenter's controller receives OpenFlow control requests only for traffic assigned to their sliver, and can successfully insert flowmods or send packet-outs only for traffic assigned to their sliver. (C.2.f)

IG-EXP-6: InstaGENI and Meso-scale Multi-site OpenFlow Acceptance Test

  • I.10. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a bound VLAN. (C.2.b, C.2.e)
  • I.11. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a shared VLAN, and can specify what IP address should be assigned to the dataplane interface connected to that shared VLAN. (C.2.b)
  • I.12. Two experimenters can create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound shared VLAN. (C.2.b, C.2.e)
  • I.15. An experimenter can create and access a sliver containing multiple compute resources in the rack with a dataplane interface on each connected to the same bound VLAN. (C.2.b)
  • I.17. An experimenter can create and access a sliver containing at least one bare-metal node and at least two VMs in the rack, and each compute resource must be allowed to have a dataplane interface on a single bound VLAN. (C.2.c)
  • II.01. If multiple VMs use the same physical interface to carry dataplane traffic, traffic between the VM dataplane interfaces can be OpenFlow-controlled. (C.2.d, G.4)
  • II.05. An experimenter can run a controller (for any type of OpenFlow sliver) on an arbitrary system, accessible via the public Internet or via a private network which the rack FlowVisor can access, by specifying the DNS hostname (or IP address) and TCP port, as part of their sliver request. (C.2.f)
  • II.07. If an experimenter creates a sliver containing a compute resource with a dataplane interface on a shared VLAN, only the subset of traffic on the VLAN which has been assigned to their sliver is visible to the dataplane interface of their compute resource. (C.2.f)
  • II.09. Traffic only flows on the network resources assigned to experimenters' slivers as specified by the experimenters' controllers. No default controller, switch fail-open behavior, or other resource other than experimenters' controllers, can control how traffic flows on network resources assigned to experimenters' slivers. (C.2.f)
  • II.10. An experimenter can set the hard and soft timeout of flowtable entries that their controller adds to the switch. (C.2.f)
  • II.11. An experimenter's controller can get switch statistics and flowtable entries for their sliver from the switch. (C.2.f)
  • II.12. An experimenter's controller can get layer 2 topology information about their sliver, and about other slivers in their slice. (C.2.f)
  • II.13. An experimenter can access documentation about which OpenFlow actions can be performed in hardware. (C.2.f)
  • II.14. An experimenter can install flows that match only on layer 2 fields, and confirm whether the matching is done in hardware. (C.2.f)
  • II.15. If supported by the rack, an experimenter can install flows that match only on layer 3 fields, and confirm whether the matching is done in hardware. (C.2.f)

IG-EXP-7: Click Router Experiment Acceptance Test

Acceptance Criteria to Functional Group Mapping

This section groups acceptance criteria by high-level GENI rack functions.

I. Experiment and network resource criteria

These criteria define the expected behavior of compute and network resources for an experimenter.

  • I.01. 100 experimental VMs can run on the rack simultaneously. (C.1.a)
  • I.02. If two experimenters have compute resources on the same rack, they cannot use the control plane to access each other's resources (e.g. via unauthenticated SSH, shared writable filesystem mount). (C.1.a)
  • I.03. If two experimenters reserve exclusive VLANs on the same rack, they cannot see or modify each other's dataplane traffic on those exclusive VLANs. (C.1.a)
  • I.04. An experimenter can create and access a sliver containing a bare-metal node on the rack. (C.1.b)
  • I.05. An experimenter can create and access a sliver containing both a VM and a bare-metal node. (C.1.c)
  • I.06. An experimenter can create and access a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN. (C.1.c)
  • I.07. If an experimenter creates a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN, the nodes can communicate on that VLAN. (C.1.c)
  • I.08. Experimenters can create and access slivers within the rack containing at least 100 distinct dataplane VLANs. (C.2.a)
  • I.09. Experimenters can create and access slivers on two racks which simultaneously use all available unbound exclusive VLANs which can connect those racks. (C.2.a)
  • I.10. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a bound VLAN. (C.2.b, C.2.e)
  • I.11. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a shared VLAN, and can specify what IP address should be assigned to the dataplane interface connected to that shared VLAN. (C.2.b)
  • I.12. Two experimenters can create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound shared VLAN. (C.2.b, C.2.e)
  • I.13. Two experimenters cannot create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound exclusive VLAN. (C.2.b, C.2.e)
  • I.14. An experimenter can create and access a sliver containing a compute resource in the rack with dataplane interfaces connected to multiple bound VLANs. (C.2.b)
  • I.15. An experimenter can create and access a sliver containing multiple compute resources in the rack with a dataplane interface on each connected to the same bound VLAN. (C.2.b)
  • I.16. An experimenter can create and access a sliver containing multiple compute resources in the rack with dataplane interfaces on each connected to multiple bound VLANs. (C.2.b)
  • I.17. An experimenter can create and access a sliver containing at least one bare-metal node and at least two VMs in the rack, and each compute resource must be allowed to have a dataplane interface on a single bound VLAN. (C.2.c)
  • I.18. If multiple VMs use the same physical interface to carry dataplane traffic, each VM has a distinct MAC address for that interface. (C.2.d, G.4)
  • I.19. An experimenter can request an unbound exclusive VLAN from a local aggregate manager in the rack, and dataplane traffic can be sent on that VLAN and seen on the campus's primary dataplane switch. (C.2.e)
  • I.20. An experimenter can allocate and run a slice in which a compute resource in the rack sends traffic via an unbound exclusive rack VLAN, traversing a dynamically-created topology in a core L2 network, to another rack's dataplane. (C.2.e)
  • I.21. An experimenter can reserve a VM on the rack, and can run any command with root privileges within the VM OS, including loading a kernel module in any VM OS with a modular kernel. (G.1)
  • I.22. An experimenter can create and access an experiment which configures a bare-metal compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.23. An experimenter can create and access an experiment which configures a VM compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
  • I.24. An experimenter can create and access an experiment containing a compute resource with a dataplane interface, and can construct and send a non-IP ethernet packet over the dataplane interface. (G.3)
  • I.25. An experimenter can request a publicly routable IP address or public TCP/UDP port mapping for the control interface of any compute resource in their sliver (subject to availability of IPs at the site), and can access their resource at that IP address and/or port from the commodity internet, and send traffic outbound from their resource to the internet. (F.1)
  • I.26. An experimenter can request a sliver which creates a layer 2 path, between two dataplane interfaces on the rack switch which connect to non-rack resources (e.g. a bound or unbound VLAN between the core and a campus compute resource), without requesting any rack compute resources. (use case 4)
  • I.27. If the rack's control network cannot reach the control network at the rack vendor's home site but is working otherwise, an experimenter can create and access an experimental sliver containing a compute resource and a VLAN (assuming the experimenter's slice authority is reachable). (E)
  • I.28. An experimenter can create an experimental sliver containing a compute resource and a VLAN, and can verify that the sliver is continuously accessible for one week. (E)
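
Criterion I.24 asks that an experimenter be able to construct and send a non-IP ethernet packet over a dataplane interface. As a minimal sketch of what that entails, the following Python packs a raw Ethernet II frame using an EtherType from the local-experimental range; the MAC addresses and payload are illustrative, not part of any rack specification:

```python
import struct

def build_ethernet_frame(dst_mac: str, src_mac: str, ethertype: int, payload: bytes) -> bytes:
    """Pack a raw Ethernet II frame: 6-byte dst MAC, 6-byte src MAC, 2-byte EtherType, payload."""
    def mac_to_bytes(mac: str) -> bytes:
        return bytes(int(octet, 16) for octet in mac.split(":"))
    header = mac_to_bytes(dst_mac) + mac_to_bytes(src_mac) + struct.pack("!H", ethertype)
    # Pad the payload to the 46-byte Ethernet minimum, so the frame is at least
    # 60 bytes (excluding the 4-byte FCS, which the NIC appends on transmit).
    if len(payload) < 46:
        payload = payload + b"\x00" * (46 - len(payload))
    return header + payload

# 0x88B5 is an EtherType reserved for local experimental use, so this frame is non-IP.
frame = build_ethernet_frame("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", 0x88B5, b"GENI test")
```

On a Linux compute resource, such a frame could then be sent with a `socket(AF_PACKET, SOCK_RAW)` bound to the dataplane interface (root privileges required, which I.21 guarantees for VMs).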

II. Experiment OpenFlow criteria

These criteria define the expected behavior of OpenFlow resources for an experimenter.

  • II.01. If multiple VMs use the same physical interface to carry dataplane traffic, traffic between the VM dataplane interfaces can be OpenFlow-controlled. (C.2.d, G.4)
  • II.02. An experimenter can create a sliver including an unbound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
  • II.03. An experimenter can create a sliver including a bound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
  • II.04. An experimenter can create a sliver including a subset of the traffic of a bound shared VLAN, define that subset using flowspace rules, and have that subset controlled by a controller that they run. (C.2.f)
  • II.05. An experimenter can run a controller (for any type of OpenFlow sliver) on an arbitrary system, accessible via the public Internet or via a private network which the rack FlowVisor can access, by specifying the DNS hostname (or IP address) and TCP port, as part of their sliver request. (C.2.f)
  • II.06. An experimenter can run a controller (for any type of OpenFlow sliver) on a compute resource which they're requesting at the same time as the OpenFlow sliver, by specifying in the request which compute resource they want to use. (The AM can define how unbound this can be, e.g. "one of the VMs" or "the VM with this client_id".) (C.2.f)
  • II.07. If an experimenter creates a sliver containing a compute resource with a dataplane interface on a shared VLAN, only the subset of traffic on the VLAN which has been assigned to their sliver is visible to the dataplane interface of their compute resource. (C.2.f)
  • II.08. If an experimenter creates an OpenFlow sliver on a shared VLAN, the experimenter's controller receives OpenFlow control requests only for traffic assigned to their sliver, and can successfully insert flowmods or send packet-outs only for traffic assigned to their sliver. (C.2.f)
  • II.09. Traffic only flows on the network resources assigned to experimenters' slivers as specified by the experimenters' controllers. No default controller, switch fail-open behavior, or other resource other than experimenters' controllers, can control how traffic flows on network resources assigned to experimenters' slivers. (C.2.f)
  • II.10. An experimenter can set the hard and soft timeout of flowtable entries that their controller adds to the switch. (C.2.f)
  • II.11. An experimenter's controller can get switch statistics and flowtable entries for their sliver from the switch. (C.2.f)
  • II.12. An experimenter's controller can get layer 2 topology information about their sliver, and about other slivers in their slice. (C.2.f)
  • II.13. An experimenter can access documentation about which OpenFlow actions can be performed in hardware. (C.2.f)
  • II.14. An experimenter can install flows that match only on layer 2 fields, and confirm whether the matching is done in hardware. (C.2.f)
  • II.15. An experimenter can install flows that match only on layer 3 fields, and confirm whether the matching is done in hardware. (C.2.f)
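
Criterion II.10 requires that an experimenter control the hard and soft (idle) timeouts of flowtable entries. In practice a controller framework such as POX or Ryu sets these fields; as a sketch of the underlying OpenFlow 1.0 wire format (the era of OpenFlow these racks targeted), the following Python packs a minimal `OFPT_FLOW_MOD` with an all-wildcard match and no actions:

```python
import struct

OFP_VERSION_1_0 = 0x01
OFPT_FLOW_MOD = 14      # message type in OpenFlow 1.0
OFPFC_ADD = 0           # flow_mod command: add a new flow
OFPFW_ALL = 0x003FFFFF  # wildcard every match field

def build_flow_mod(idle_timeout: int, hard_timeout: int, priority: int = 100, xid: int = 1) -> bytes:
    """Pack a minimal OpenFlow 1.0 flow_mod; timeouts are in seconds (0 = permanent)."""
    match = struct.pack("!I", OFPFW_ALL) + b"\x00" * 36  # 40-byte ofp_match, all wildcarded
    body = match + struct.pack(
        "!QHHHHIHH",
        0,             # cookie
        OFPFC_ADD,     # command
        idle_timeout,  # soft timeout: entry expires after this many idle seconds
        hard_timeout,  # hard timeout: entry expires unconditionally after this many seconds
        priority,
        0xFFFFFFFF,    # buffer_id: none
        0xFFFF,        # out_port: not used for OFPFC_ADD
        0,             # flags
    )
    header = struct.pack("!BBHI", OFP_VERSION_1_0, OFPT_FLOW_MOD, 8 + len(body), xid)
    return header + body
```

A controller would send such a message over its TCP channel to the switch (via FlowVisor, for a sliced rack); II.11's statistics requests follow the same header-plus-body framing.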

III. Compute resource image criteria

These criteria define the expected behavior of operating system images for rack resources.

  • III.01. A recent Linux OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.b)
  • III.02. A recent Microsoft OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.d)
  • III.03. A recent Linux OS image for VMs is provided by the rack team, and an experimenter can reserve and boot a VM using this image. (C.1.e)
  • III.04. An experimenter can view a list of OS images which can be loaded on VMs by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
  • III.05. An experimenter can view a list of OS images which can be loaded on bare-metal nodes by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
  • III.06. The procedure for an experimenter to have new VM images added to the rack has been documented, including any restrictions on what OS images can be run on VMs. (C.1.e)
  • III.07. The procedure for an experimenter to have new bare-metal images added to the rack has been documented, including any restrictions on what OS images can be run on bare-metal nodes. (C.1.e)
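
Criteria III.04 and III.05 require that available OS images be discoverable from the compute aggregate's advertisement RSpec. In GENI v3 RSpecs, images appear as `disk_image` elements under each `sliver_type`; the following sketch extracts their names (the sample advertisement fragment is hypothetical, trimmed for illustration):

```python
import xml.etree.ElementTree as ET

RSPEC_NS = "http://www.geni.net/resources/rspec/3"  # GENI v3 RSpec namespace

def list_disk_images(advertisement_xml: str) -> list:
    """Return the sorted, de-duplicated names of all advertised disk_image elements."""
    root = ET.fromstring(advertisement_xml)
    return sorted({img.get("name")
                   for img in root.iter("{%s}disk_image" % RSPEC_NS)
                   if img.get("name")})

# Hypothetical advertisement fragment; a real one comes from the AM's ListResources call.
SAMPLE = """\
<rspec xmlns="http://www.geni.net/resources/rspec/3" type="advertisement">
  <node component_id="urn:publicid:IDN+example+node+pc1">
    <sliver_type name="raw-pc">
      <disk_image name="urn:publicid:IDN+example+image+UBUNTU12-64-STD"/>
    </sliver_type>
    <sliver_type name="emulab-openvz">
      <disk_image name="urn:publicid:IDN+example+image+OPENVZ-STD"/>
    </sliver_type>
  </node>
</rspec>"""

images = list_disk_images(SAMPLE)
```

The same traversal covers both criteria, since bare-metal (`raw-pc`) and VM sliver types advertise images through the same element.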

IV. Rack infrastructure support

These criteria define expected rack infrastructure functions that MUST be provided:

  • IV.01. All experimental hosts are configured to boot (rather than stay off pending manual intervention) when they are cleanly shut down and then remotely power-cycled. (C.3.c)
  • IV.02. Mesoscale reachability testing can report on the recent liveness of the rack bound VLANs by pinging a per-rack IP in each mesoscale monitoring subnet. (D.8)
  • IV.03. When changes to the Emergency Stop procedure are approved at a GEC, rack vendors implement any AM software and site administrator documentation modifications needed to support these changes within 3 months.
  • IV.04. When all aggregates and services are running on the primary rack server, the host's performance is good enough that OpenFlow monitoring does not lose data (due to an overloaded FOAM or FlowVisor) and does not report visible dataplane problems (due to an overloaded FlowVisor). (open questions ExoGENI B-6, InstaGENI B.4)

V. Site administrator access

These criteria define expected local and remote access to rack control infrastructure:

  • V.01. Site administrators can login at a console (physical or virtual) on all rack infrastructure Unix hosts, including rack servers (both physical and VM) and experimental VM servers. (C.3.a)
  • V.02. Site administrators can login to all rack infrastructure Unix hosts using public-key SSH. (C.3.a, C.3.b)
  • V.03. Site administrators cannot login to any rack infrastructure Unix hosts using password-based SSH, nor via any unencrypted login protocol. (C.3.a)
  • V.04. Site administrators can run any command with root privileges on all rack infrastructure Unix hosts. (C.3.a)
  • V.05. Site administrators can login to all network-accessible rack infrastructure devices (network devices, remote KVMs, remote PDUs, etc) via serial console and via SSH. (C.3.a, C.3.b)
  • V.06. Site administrators cannot login to any network-accessible rack device via an unencrypted login protocol. (C.3.a)
  • V.07. If site operator privileges are defined, each type of operator access is documented, and the documented restrictions are testable. (C.3.a)
  • V.08. If experimenters have access to any rack infrastructure, the access is restricted to protect the infrastructure. The restrictions are documented, and the documented restrictions are testable. (C.3.a)
  • V.09. When the rack control network is partially down or the rack vendor's home site is inaccessible from the rack, it is still possible to access the primary control network device and server for recovery. All devices/networks which must be operational in order for the control network switch and primary server to be reachable, are documented. (C.3.b)
  • V.10. Site administrators can authenticate remotely and power on, power off, or power-cycle, all physical rack devices, including experimental hosts, servers, and network devices. (C.3.c)
  • V.11. Site administrators can authenticate remotely and virtually power on, power off, or power-cycle all virtual rack resources, including server and experimental VMs. (C.3.c)
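
Criteria V.02 and V.03 together amount to "public-key SSH only" on infrastructure hosts. On a host running OpenSSH, a configuration satisfying them would look roughly like the following `sshd_config` fragment (illustrative, not vendor-mandated; exact hardening is up to the rack team):

```
# /etc/ssh/sshd_config fragment: key-based login only (V.02, V.03)
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitRootLogin prohibit-password
```

V.06's ban on unencrypted login protocols additionally implies disabling services such as telnet and rlogin on network-accessible devices.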

VI. Site administrator documentation

These criteria define the expected documentation for site administrators and operators:

  • VI.01. The rack development team has documented a technical plan for handing off primary rack operations to site operators. (E)
  • VI.02. A public document contains a parts list for each rack. (F.1)
  • VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
  • VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
  • VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for public IP ranges subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
  • VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
  • VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with LLR, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
  • VI.08. A public document explains how to use the GMOC ticket system to report rack changes or problems, and under what circumstances a GMOC ticket needs to be opened. (F.4)
  • VI.09. A public document explains how to identify the software versions and system file configurations running on the rack, and how to get information about recent changes to the rack software and configuration. (F.5)
  • VI.10. A public document explains how and when software and OS updates can be performed on the rack, including plans for notification and update if important security vulnerabilities in rack software are discovered. (F.5)
  • VI.11. A public document describes the GENI software running on the rack, and explains how to get access to the source code of each piece of GENI software. (F.6)
  • VI.12. A public document describes all the GENI experimental resources within the rack, and explains what policy options exist for each, including: how to configure rack nodes as bare-metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, whether it is possible to trust local users within the rack. (F.7)
  • VI.13. A public document describes the expected state of all the GENI experimental resources in the rack, including how to determine the state of an experimental resource and what state is expected for an unallocated bare-metal node. (F.5)
  • VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
  • VI.15. A procedure is documented for changing IP addresses for all rack components. (C.3.e)
  • VI.16. A procedure is documented for cleanly shutting down the entire rack in case of a scheduled site outage. (C.3.c)
  • VI.17. A procedure is documented for performing a shutdown operation on any type of sliver on the rack, in support of an Emergency Stop request. (C.3.d)

VII. Site administrator procedures

These criteria define rack operation functions expected to be performed by site administrators:

  • VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
  • VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)
  • VII.03. Site administrators can understand the expected control and dataplane network behavior of their rack. (F.2)
  • VII.04. Site administrators can view and investigate current system and network activity on their rack. (F.2)
  • VII.05. The rack development team and GPO are able to use the GMOC ticket system to communicate with each other, and provide feedback to GMOC about any issues. (F.4)
  • VII.06. A site administrator can verify the control software and configurations on the rack at some point in time. (F.5)
  • VII.07. A site administrator can perform software and OS updates on the rack. (F.5)
  • VII.08. A site administrator can get access to source code for the version of each piece of GENI code installed on their site rack at some point in time. (F.6)
  • VII.09. A site administrator can determine the MAC addresses of all physical host interfaces, all network device interfaces, all active experimental VMs, and all recently-terminated experimental VMs. (C.3.f)
  • VII.10. A site administrator can locate current and recent CPU and memory utilization for each rack network device, and can find recent changes or errors in a log. (D.6.a)
  • VII.11. A site administrator can locate the current configuration of FlowVisor, FOAM, and any other OpenFlow services, and find logs of recent activity and changes. (D.6.a)
  • VII.12. For each infrastructure and experimental host, a site administrator can locate current and recent uptime, CPU, disk, and memory utilization, interface traffic counters, process counts, and active user counts. (D.6.b)
  • VII.13. A site administrator can locate recent syslogs for all infrastructure and experimental hosts. (D.6.b)
  • VII.14. A site administrator can locate information about the network reachability of all rack infrastructure which should live on the control network, and can get alerts when any rack infrastructure control IP becomes unavailable from the rack server host, or when the rack server host cannot reach the commodity internet. (D.6.c)
  • VII.15. Some rack internal health checks are run on a regular basis, and a site administrator can get emailed alerts when these checks fail. (F.8)
  • VII.16. A public document explains how to perform comprehensive health checks for a rack (or, if those health checks are being run automatically, how to view the current/recent results). (F.8)
  • VII.17. A site administrator can get information about the power utilization of rack PDUs. (D.6.c)
  • VII.18. Given a public IP address and port, an exclusive VLAN, a sliver name, or a piece of user-identifying information such as email address or username, a site administrator or GMOC operator can identify the email address, username, and affiliation of the experimenter who controlled that resource at a particular time. (D.7)
  • VII.19. GMOC and a site administrator can perform a successful Emergency Stop drill in which slivers containing compute and OpenFlow-controlled network resources are shut down. (C.3.d)
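
Criterion VII.18 is essentially a time-scoped lookup from a resource identifier to an experimenter. Real racks keep the equivalent data in AM logs; the record schema below is hypothetical, but it sketches the lookup a site administrator or GMOC operator needs to perform:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SliverRecord:
    """Hypothetical per-sliver log entry; field names are illustrative only."""
    sliver_urn: str
    user_email: str
    username: str
    affiliation: str
    public_ip: str
    vlan: int
    start: datetime
    end: datetime

def who_used(records, *, at, public_ip=None, vlan=None, sliver_urn=None):
    """Return (email, username, affiliation) for the sliver holding the given
    resource identifier at time `at`, or None if no record matches."""
    for r in records:
        if not (r.start <= at <= r.end):
            continue  # sliver was not active at the time in question
        if ((public_ip and r.public_ip == public_ip)
                or (vlan and r.vlan == vlan)
                or (sliver_urn and r.sliver_urn == sliver_urn)):
            return (r.user_email, r.username, r.affiliation)
    return None
```

The time bound matters: because VLANs and IPs are reused across slivers, attribution is only meaningful relative to "a particular time", as the criterion states.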

VIII. Monitoring functionality

These criteria define expected centralized GENI monitoring and trending features:

  • VIII.01. Operational monitoring data for the rack is available at gmoc-db.grnoc.iu.edu. (D.2)
  • VIII.02. The rack data's "site" tag in the GMOC database indicates the physical location (e.g. host campus) of the rack. (D.2)
  • VIII.03. Whenever the rack is operational, GMOC's database contains site monitoring data which is at most 10 minutes old. (D.3)
  • VIII.04. Any site variable which can be collected by reading a counter (i.e. which does not require system or network processing beyond a file read) is collected by local rack monitoring at least once a minute. (D.3)
  • VIII.05. All hosts which submit data to gmoc-db have system clocks which agree with gmoc-db's clock to within 45 seconds. (GMOC is responsible for ensuring that gmoc-db's own clock is synchronized to an accurate time source.) (D.4)
  • VIII.06. The GMOC database contains data about whether each site AM has recently been reachable via the GENI AM API. (D.5.a)
  • VIII.07. The GMOC database contains data about the recent uptime and availability of each compute or unbound VLAN resource at each rack AM. (D.5.a)
  • VIII.08. The GMOC database contains the sliver count and percentage of resources in use at each rack AM. (D.5.a)
  • VIII.09. The GMOC database contains the creation time of each sliver on each rack AM. (D.5.a)
  • VIII.10. If possible, the GMOC database contains per-sliver interface counters for each rack AM. (D.5.a)
  • VIII.11. The GMOC database contains data about whether each rack dataplane switch has recently been online. (D.5.b)
  • VIII.12. The GMOC database contains recent traffic counters and VLAN memberships for each rack dataplane switch interface. (D.5.b)
  • VIII.13. The GMOC database contains recent MAC address table contents for shared VLANs which appear on rack dataplane switches. (D.5.b)
  • VIII.14. The GMOC database contains data about whether each experimental VM server has recently been online. (D.5.c)
  • VIII.15. The GMOC database contains overall CPU, disk, and memory utilization, and VM count and capacity, for each experimental VM server. (D.5.c)
  • VIII.16. The GMOC database contains overall interface counters for experimental VM server dataplane interfaces. (D.5.c)
  • VIII.17. The GMOC database contains recent results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack. (D.5.d)
  • VIII.18. For trending purposes, per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on a given rack. Racks may provide raw sliver/user data to GMOC, or may produce their own trending summaries on demand. (D.7)
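
Criterion VIII.05's 45-second tolerance is typically verified with an NTP-style offset estimate between a submitting host and gmoc-db. A minimal sketch of that arithmetic (the timestamps are illustrative; real deployments would simply run ntpd/ntpq against a common time source):

```python
def ntp_offset(t0: float, t1: float, t2: float, t3: float) -> float:
    """Standard NTP clock-offset estimate, all times in epoch seconds:
    t0 = client send, t1 = server receive, t2 = server send, t3 = client receive.
    A positive result means the server clock is ahead of the client clock."""
    return ((t1 - t0) + (t2 - t3)) / 2.0

def within_tolerance(offset_s: float, tolerance_s: float = 45.0) -> bool:
    """True if the measured offset satisfies the VIII.05 agreement bound."""
    return abs(offset_s) <= tolerance_s
```

Averaging the two one-way differences cancels the symmetric part of the network delay, so the estimate stays useful even when round-trip latency to gmoc-db is large.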

Requirements to Acceptance Criteria Mapping

This section groups acceptance criteria for each GENI Rack Requirement.

Integration Requirements (C)

Requirement:

C.1 Compute Resource Requirements: C.1.a "Support operations for at least 100 simultaneously used virtual compute resources. Implement functions to verify isolation of simultaneously used resources."

Acceptance Criteria:

I.01. 100 experimental VMs can run on the rack simultaneously. (C.1.a)
I.02. If two experimenters have compute resources on the same rack, they cannot use the control plane to access each other's resources (e.g. via unauthenticated SSH, shared writable filesystem mount). (C.1.a)
I.03. If two experimenters reserve exclusive VLANs on the same rack, they cannot see or modify each other's dataplane traffic on those exclusive VLANs. (C.1.a)

Requirement:

C.1.b "Support configuration options and operations for at least one bare metal compute resource in each rack."

Acceptance Criteria:

I.04. An experimenter can create and access a sliver containing a bare-metal node on the rack. (C.1.b)
III.01. A recent Linux OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.b)

Requirement:

C.1.c "Support configuration options for operating virtual and bare metal compute resources simultaneously in a single rack."

Acceptance Criteria:

I.05. An experimenter can create and access a sliver containing both a VM and a bare-metal node. (C.1.c)
I.06. An experimenter can create and access a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN. (C.1.c)
I.07. If an experimenter creates a sliver containing a VM and a bare-metal node, each with a dataplane interface on the same VLAN, the nodes can communicate on that VLAN. (C.1.c)

Requirement:

C.1.d Support the ability to run a Microsoft operating system for a user application on bare metal nodes. (Microsoft is requested but not required on VMs.)

Acceptance Criteria:

III.02. A recent Microsoft OS image for bare-metal nodes is provided by the rack team, and an experimenter can reserve and boot a bare-metal node using this image. (C.1.d)

Requirement:

C.1.e Identify any restrictions on the types of supported Operating Systems that users can run on VMs or bare metal nodes.

Acceptance Criteria:

III.03. A recent Linux OS image for VMs is provided by the rack team, and an experimenter can reserve and boot a VM using this image. (C.1.e)
III.04. An experimenter can view a list of OS images which can be loaded on VMs by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
III.05. An experimenter can view a list of OS images which can be loaded on bare-metal nodes by requesting an advertisement RSpec from the compute aggregate. (C.1.e)
III.06. The procedure for an experimenter to have new VM images added to the rack has been documented, including any restrictions on what OS images can be run on VMs. (C.1.e)
III.07. The procedure for an experimenter to have new bare-metal images added to the rack has been documented, including any restrictions on what OS images can be run on bare-metal nodes. (C.1.e)

Requirement:

C.2 Network resource and experimental connectivity requirements: C.2.a "Support at least 100 simultaneous active (e.g. actually passing data) layer 2 Ethernet VLAN connections to the rack. For this purpose, VLAN paths must terminate on separate rack VMs, not on the rack switch."

Acceptance Criteria:

I.08. Experimenters can create and access slivers within the rack containing at least 100 distinct dataplane VLANs. (C.2.a)
I.09. Experimenters can create and access slivers on two racks which simultaneously use all available unbound exclusive VLANs which can connect those racks. (C.2.a)

Requirement:

C.2.b Be able to connect a single VLAN from a network external to the rack to multiple VMs in the rack. Do this for multiple external VLANs simultaneously. (Measurement and experimental monitoring VLANs are examples of VLANs that may need to connect to all active VMs simultaneously.)

Acceptance Criteria:

I.12. Two experimenters can create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound shared VLAN. (C.2.b, C.2.e)
I.13. Two experimenters cannot create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound exclusive VLAN. (C.2.b, C.2.e)
I.14. An experimenter can create and access a sliver containing a compute resource in the rack with dataplane interfaces connected to multiple bound VLANs. (C.2.b)
I.15. An experimenter can create and access a sliver containing multiple compute resources in the rack with a dataplane interface on each connected to the same bound VLAN. (C.2.b)
I.16. An experimenter can create and access a sliver containing multiple compute resources in the rack with dataplane interfaces on each connected to multiple bound VLANs. (C.2.b)

Requirement:

C.2.c Be able to connect a single VLAN from a network external to the rack to multiple VMs and a bare metal compute resource in the rack simultaneously.

Acceptance Criteria:

I.17. An experimenter can create and access a sliver containing at least one bare-metal node and at least two VMs in the rack, and each compute resource must be allowed to have a dataplane interface on a single bound VLAN. (C.2.c)

Requirement:

C.2.d Support individual addressing for each VM (e.g. IP and MAC addresses per VM should appear unique for experiments that require full dataplane virtualization.)

Acceptance Criteria:

I.18. If multiple VMs use the same physical interface to carry dataplane traffic, each VM has a distinct MAC address for that interface. (C.2.d, G.4)
II.01. If multiple VMs use the same physical interface to carry dataplane traffic, traffic between the VM dataplane interfaces can be OpenFlow-controlled. (C.2.d) (G.4)
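Criterion I.18 can be spot-checked with a short script once the dataplane MAC of each VM has been collected. The sketch below is illustrative only; the function name and sample addresses are hypothetical, not part of any rack software.

```python
from collections import Counter

def duplicate_macs(vm_macs):
    """Return MAC addresses that appear on more than one VM.

    vm_macs: dict mapping VM name -> dataplane MAC string.
    An empty result means criterion I.18 holds for the sampled VMs.
    """
    counts = Counter(mac.lower() for mac in vm_macs.values())
    return sorted(mac for mac, n in counts.items() if n > 1)

# Example: two VMs sharing a physical interface must not share a MAC.
sample = {
    "vm1": "02:00:00:aa:bb:01",
    "vm2": "02:00:00:aa:bb:02",
    "vm3": "02:00:00:aa:bb:01",  # conflicts with vm1 -> I.18 violated
}
print(duplicate_macs(sample))  # -> ['02:00:00:aa:bb:01']
```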

Requirement:

C.2.e "Support AM API options to allow both static (pre-defined VLAN number) and dynamic (negotiated VLAN number, e.g. interface to DYNES, enhanced SHERPA) configuration options for making GENI layer 2 connections"

Acceptance Criteria:

I.10. An experimenter can create and access a sliver containing a compute resource in the rack with a dataplane interface connected to a bound VLAN. (C.2.b, C.2.e)
I.12. Two experimenters can create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound shared VLAN. (C.2.b, C.2.e)
I.13. Two experimenters cannot create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound exclusive VLAN. (C.2.b, C.2.e)
I.19. An experimenter can request an unbound exclusive VLAN from a local aggregate manager in the rack, and dataplane traffic can be sent on that VLAN and seen on the campus's primary dataplane switch. (C.2.e)
I.20. An experimenter can allocate and run a slice in which a compute resource in the rack sends traffic via an unbound exclusive rack VLAN, traversing a dynamically-created topology in a core L2 network, to another rack's dataplane. (C.2.e)

Requirement:

C.2.f Support the ability to run multiple OpenFlow controllers in the rack, and allow the site administrators (or their rack team representatives during early integration) to determine which controllers can affect the rack's OpenFlow switch. (Note: site administrators may choose to default-approve ALL controllers, but we do not expect that option to cover local site policy for all racks.)

Acceptance Criteria:

II.02. An experimenter can create a sliver including an unbound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
II.03. An experimenter can create a sliver including a bound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
II.04. An experimenter can create a sliver including a subset of the traffic of a bound shared VLAN, define that subset using flowspace rules, and have that subset controlled by a controller that they run. (C.2.f)
II.05. An experimenter can run a controller (for any type of OpenFlow sliver) on an arbitrary system, accessible via the public Internet or via a private network which the rack Flowvisor can access, by specifying the DNS hostname (or IP address) and TCP port, as part of their sliver request. (C.2.f)
II.06. An experimenter can run a controller (for any type of OpenFlow sliver) on a compute resource which they're requesting at the same time as the OpenFlow sliver, by specifying in the request which compute resource they want to use. (The AM can define how unbound this can be, e.g. "one of the VMs", or "the VM with this client_id", or whatever.) (C.2.f)
II.07. If an experimenter creates a sliver containing a compute resource with a dataplane interface on a shared VLAN, only the subset of traffic on the VLAN which has been assigned to their sliver is visible to the dataplane interface of their compute resource. (C.2.f)
II.08. If an experimenter creates an OpenFlow sliver on a shared VLAN, the experimenter's controller receives OpenFlow control requests only for traffic assigned to their sliver, and can successfully insert flowmods or send packet-outs only for traffic assigned to their sliver. (C.2.f)
II.09. Traffic only flows on the network resources assigned to experimenters' slivers as specified by the experimenters' controllers. No default controller, switch fail-open behavior, or other resource other than experimenters' controllers, can control how traffic flows on network resources assigned to experimenters' slivers. (C.2.f)
II.10. An experimenter can set the hard and soft timeout of flowtable entries that their controller adds to the switch. (C.2.f)
II.11. An experimenter's controller can get switch statistics and flowtable entries for their sliver from the switch. (C.2.f)
II.12. An experimenter's controller can get layer 2 topology information about their sliver, and about other slivers in their slice. (C.2.f)
II.13. An experimenter can access documentation about which OpenFlow actions can be performed in hardware. (C.2.f)
II.14. An experimenter can install flows that match only on layer 2 fields, and confirm whether the matching is done in hardware. (C.2.f)
II.15. An experimenter can install flows that match only on layer 3 fields, and confirm whether the matching is done in hardware. (C.2.f)

Requirement:

C.3 Rack resource management requirements: C.3.a Provide Admin, Operator, and User accounts to manage access to rack resources. Admin privileges should be similar to super user; Operator should provide access to common operator functions, such as debug tools and emergency stop, that are not available to users. Access to all accounts must be more secure than username/password (e.g. require SSH key).

Acceptance Criteria:

V.01. For all rack infrastructure Unix hosts, including rack servers (both physical and VM) and experimental VM servers, site administrators should be able to login at a console (physical or virtual). (C.3.a)
V.02. Site administrators can login to all rack infrastructure Unix hosts using public-key SSH. (C.3.a, C.3.b)
V.03. Site administrators cannot login to any rack infrastructure Unix hosts using password-based SSH, nor via any unencrypted login protocol. (C.3.a)
V.04. Site administrators can run any command with root privileges on all rack infrastructure Unix hosts. (C.3.a)
V.05. Site administrators can login to all network-accessible rack infrastructure devices (network devices, remote KVMs, remote PDUs, etc) via serial console and via SSH. (C.3.a, C.3.b)
V.06. Site administrators cannot login to any network-accessible rack device via an unencrypted login protocol. (C.3.a)
V.07. If site operator privileges are defined, each type of operator access is documented, and the documented restrictions are testable. (C.3.a)
V.08. If experimenters have access to any rack infrastructure, the access is restricted to protect the infrastructure. The restrictions are documented, and the documented restrictions are testable. (C.3.a)
VI.14. A procedure is documented for creating new site administrator and operator accounts. (C.3.a)
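Criteria V.02 and V.03 (public-key SSH only, no password or unencrypted logins) can be satisfied on hosts running OpenSSH with a configuration fragment along these lines. This is a common sketch, not a mandated rack configuration; older OpenSSH releases spell the last directive `without-password`.

```
# /etc/ssh/sshd_config (illustrative fragment)
PubkeyAuthentication yes
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitRootLogin prohibit-password
```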

Requirement:

C.3.b Support remote access to Admin and Operator accounts for local site admins and GENI operations staff with better than username/password control (e.g. remote terminal access with SSH key)

Acceptance Criteria:

V.02. Site administrators can login to all rack infrastructure Unix hosts using public-key SSH. (C.3.a, C.3.b)
V.05. Site administrators can login to all network-accessible rack infrastructure devices (network devices, remote KVMs, remote PDUs, etc) via serial console and via SSH. (C.3.a, C.3.b)
V.09. When the rack control network is partially down or the rack vendor's home site is inaccessible from the rack, it is still possible to access the primary control network device and server for recovery. All devices/networks which must be operational in order for the control network switch and primary server to be reachable are documented. (C.3.b)

Requirement:

C.3.c Implement functions needed for remotely-controlled power cycling the rack components, and ensure that rack components reboot automatically after a power failure or intentional power cycle.

Acceptance Criteria:

IV.01. All experimental hosts are configured to boot (rather than stay off pending manual intervention) when they are cleanly shut down and then remotely power-cycled. (C.3.c)
V.10. Site administrators can authenticate remotely and power on, power off, or power-cycle, all physical rack devices, including experimental hosts, servers, and network devices. (C.3.c)
V.11. Site administrators can authenticate remotely and virtually power on, power off, or power-cycle all virtual rack resources, including server and experimental VMs. (C.3.c)
VI.16. A procedure is documented for cleanly shutting down the entire rack in case of a scheduled site outage. (C.3.c)

Requirement:

C.3.d Implement functions needed to execute current Emergency Stop (see GpoDoc). Rack teams are expected to implement changes to the GENI adopted Emergency Stop procedure and deploy them in racks no more than 3 months after they are approved at a GEC.

Acceptance Criteria:

VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with LLR, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)
VI.17. A procedure is documented for performing a shutdown operation on any type of sliver on the rack, in support of an Emergency Stop request. (C.3.d)
VII.19. GMOC and a site administrator can perform a successful Emergency Stop drill in which slivers containing compute and OpenFlow-controlled network resources are shut down. (C.3.d)

Requirement:

C.3.e Provide remote ability to determine and change active IP addresses on all addressable rack resources (including VMs).

Acceptance Criteria:

VI.15. A procedure is documented for changing IP addresses for all rack components. (C.3.e)

Requirement:

C.3.f Provide remote ability to determine MAC addresses for all rack resources (including VMs).

Acceptance Criteria:

VII.09. A site administrator can determine the MAC addresses of all physical host interfaces, all network device interfaces, all active experimental VMs, and all recently-terminated experimental VMs. (C.3.f)
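On Linux hosts and VMs, criterion VII.09 can be partially exercised by collecting `ip -o link` output and extracting per-interface MACs. The parser below is a sketch against the common iputils/iproute2 one-line-per-interface format; the function name is illustrative.

```python
import re

# `ip -o link` prints one interface per line; the MAC follows "link/ether".
LINK_RE = re.compile(r"^\d+:\s+([^:@\s]+).*?link/ether\s+([0-9a-f:]{17})")

def macs_from_ip_link(output):
    """Map interface name -> MAC address from `ip -o link` output."""
    macs = {}
    for line in output.splitlines():
        m = LINK_RE.match(line)
        if m:
            macs[m.group(1)] = m.group(2)
    return macs

sample = ("2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq "
          "state UP mode DEFAULT qlen 1000\\    link/ether "
          "52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff")
print(macs_from_ip_link(sample))  # -> {'eth0': '52:54:00:12:34:56'}
```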

Monitoring Requirements (D)

Requirement:

D.1 Current monitoring practices for GMOC reporting are documented at http://groups.geni.net/geni/wiki/PlasticSlices/MonitoringRecommendations#Sitemonitoringarchitecture. The measurement sender is documented at http://gmoc-db.grnoc.iu.edu/sources/measurement_api/measurement_sender.pl and an example configuration file is at http://gmoc-db.grnoc.iu.edu/sources/measurement_api/

Acceptance Criteria:

Requirement D.1 is informational only and does not contain any acceptance criteria.

Requirement:

D.2 "Data may be submitted as time series via the GMOC data submission API, or GMOC may poll for the data, at the preference of the rack vendors and GMOC."

Acceptance Criteria:

VIII.01. Operational monitoring data for the rack is available at gmoc-db.grnoc.iu.edu. (D.2)
VIII.02. The rack data's "site" tag in the GMOC database indicates the physical location (e.g. host campus) of the rack. (D.2)

Requirement:

D.3 Data must be submitted at least once every 10 minutes per rack, and may be submitted as often as desired. Simple data should be collected every minute; complex checks may be done less frequently.

Acceptance Criteria:

VIII.03. Whenever the rack is operational, GMOC's database contains site monitoring data which is at most 10 minutes old. (D.3)
VIII.04. Any site variable which can be collected by reading a counter (i.e. which does not require system or network processing beyond a file read) is collected by local rack monitoring at least once a minute. (D.3)
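The once-a-minute collection in criterion VIII.04 amounts to a simple counter poller. The sketch below is illustrative, assuming a Linux-style counter file such as /sys/class/net/&lt;dev&gt;/statistics/rx_bytes; the function names are not part of any rack software.

```python
import time
from pathlib import Path

def read_counter(path):
    """Read one integer counter from a file, e.g. a Linux
    /sys/class/net/<dev>/statistics/rx_bytes node. This is a plain
    file read, as VIII.04 requires (no extra processing)."""
    return int(Path(path).read_text().strip())

def poll_counter(path, interval=60, samples=3):
    """Collect (timestamp, value) pairs once per `interval` seconds."""
    readings = []
    for _ in range(samples):
        readings.append((time.time(), read_counter(path)))
        time.sleep(interval)
    return readings
```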

Requirement:

D.4 Timestamps on submitted data must be accurate to within 1 second

Acceptance Criteria:

VIII.05. All hosts which submit data to gmoc-db have system clocks which agree with gmoc-db's clock to within 45 seconds. (GMOC is responsible for ensuring that gmoc-db's own clock is synchronized to an accurate time source.) (D.4)
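Criterion VIII.05 reduces to a skew comparison between two clocks. A minimal check, assuming both clocks are available as Unix timestamps (how the reference timestamp is obtained, e.g. via NTP, is left open):

```python
def clock_skew_ok(local_ts, reference_ts, tolerance=45.0):
    """Check criterion VIII.05: the submitting host's clock and
    gmoc-db's clock (both as Unix timestamps) agree to within
    `tolerance` seconds."""
    return abs(local_ts - reference_ts) <= tolerance

print(clock_skew_ok(1000.0, 1030.0))  # -> True  (30s skew, within 45s)
print(clock_skew_ok(1000.0, 1050.0))  # -> False (50s skew, too large)
```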

Requirement:

D.5 "The following types of data are of interest to GENI users, and must be collected and reported to GMOC:" D.5.a Health of aggregates: whether the AM is up and reachable via the AM API, what resources of what types the AM has and what their state is (in use, available, down/unknown), overall sliver count and resource utilization level on the aggregate, status and utilization of each sliver active on the aggregate (minimum: sliver uptime, sliver resource utilization, performance data as available)

Acceptance Criteria:

VIII.06. The GMOC database contains data about whether each site AM has recently been reachable via the GENI AM API. (D.5.a)
VIII.07. The GMOC database contains data about the recent uptime and availability of each compute or unbound VLAN resource at each rack AM. (D.5.a)
VIII.08. The GMOC database contains the sliver count and percentage of resources in use at each rack AM. (D.5.a)
VIII.09. The GMOC database contains the creation time of each sliver on each rack AM. (D.5.a)
VIII.10. If possible, the GMOC database contains per-sliver interface counters for each rack AM. (D.5.a)

Requirement:

D.5.b "Health of network devices: liveness, interface traffic counters (including types of traffic, e.g. broadcast/multicast), VLANs defined on interfaces, MAC address tables on data plane VLANs"

Acceptance Criteria:

VIII.11. The GMOC database contains data about whether each rack dataplane switch has recently been online. (D.5.b)
VIII.12. The GMOC database contains recent traffic counters and VLAN memberships for each rack dataplane switch interface. (D.5.b)
VIII.13. The GMOC database contains recent MAC address table contents for shared VLANs which appear on rack dataplane switches (D.5.b)

Requirement:

D.5.c Health of hosts which serve experimental VMs: liveness, CPU/disk/memory utilization, interface counters on dataplane interfaces, VM count and capacity

Acceptance Criteria:

VIII.14. The GMOC database contains data about whether each experimental VM server has recently been online. (D.5.c)
VIII.15. The GMOC database contains overall CPU, disk, and memory utilization, and VM count and capacity, for each experimental VM server. (D.5.c)
VIII.16. The GMOC database contains overall interface counters for experimental VM server dataplane interfaces. (D.5.c)

Requirement:

D.5.d Run (and report the results of) health checks that create and use VMs and OpenFlow connections within the rack, at least hourly.

Acceptance Criteria:

VIII.17. The GMOC database contains recent results of at least one end-to-end health check which simulates an experimenter reserving and using at least one resource in the rack. (D.5.d)

Requirement:

D.6 The following types of data are operationally relevant and may be of interest to GENI users. Racks should collect these for their own use, and are encouraged to submit them to GMOC for aggregation and problem debugging: D.6.a Health of network devices: CPU and memory utilization, OpenFlow configuration and status

Acceptance Criteria:

VII.10. A site administrator can locate current and recent CPU and memory utilization for each rack network device, and can find recent changes or errors in a log. (D.6.a)
VII.11. A site administrator can locate current configuration of flowvisor, FOAM, and any other OpenFlow services, and find logs of recent activity and changes. (D.6.a)

Requirement:

D.6.b Health of all hosts: liveness, CPU/disk/memory utilization, interface traffic counters, uptime, process counts, active user counts

Acceptance Criteria:

VII.12. For each infrastructure and experimental host, a site administrator can locate current and recent uptime, CPU, disk, and memory utilization, interface traffic counters, process counts, and active user counts. (D.6.b)
VII.13. A site administrator can locate recent syslogs for all infrastructure and experimental hosts. (D.6.b)

Requirement:

D.6.c Rack infrastructure: power utilization, control network reachability of all infrastructure devices (KVMs, PDUs, etc.), reachability of commodity internet, reachability of GENI data plane if testable

Acceptance Criteria:

VII.14. A site administrator can locate information about the network reachability of all rack infrastructure which should live on the control network, and can get alerts when any rack infrastructure control IP becomes unavailable from the rack server host, or when the rack server host cannot reach the commodity internet. (D.6.c)
VII.17. A site administrator can get information about the power utilization of rack PDUs. (D.6.c)

Requirement:

D.7 Log and report total number of active users on a rack and identifiers for those users, where the total is measured over the time period that has elapsed since the last report. Report at least twice daily. It must be possible to identify an email address for each individual user identifier, but it is acceptable to require additional information that is not public or available on the rack to identify users by email, as long as that information is available to site support staff and GENI operations staff.

Acceptance Criteria:

VII.18. Given a public IP address and port, an exclusive VLAN, a sliver name, or a piece of user-identifying information such as e-mail address or username, a site administrator or GMOC operator can identify the email address, username, and affiliation of the experimenter who controlled that resource at a particular time. (D.7)
VIII.18. For trending purposes, per-rack or per-aggregate summaries are collected of the count of distinct users who have been active on a given rack. Racks may provide raw sliver/user data to GMOC, or may produce their own trending summaries on demand. (D.7)

Requirement:

D.8 Each rack must provide a static (always-available) interface on each GENI mesoscale VLAN, which can be used for reachability testing of the liveness of the mesoscale connection to the rack.

Acceptance Criteria:

IV.02. Mesoscale reachability testing can report on the recent liveness of the rack bound VLANs by pinging a per-rack IP in each mesoscale monitoring subnet. (D.8)
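The reachability test in IV.02 is typically a periodic ping of the per-rack IP. Rather than inventing rack tooling, the sketch below just parses the summary line of common Linux iputils `ping` output to extract the received-packet count; the function name is illustrative.

```python
import re

def ping_received(ping_output):
    """Extract the received-packet count from a `ping` summary line,
    e.g. "5 packets transmitted, 5 received, 0% packet loss, time
    4005ms" (Linux iputils format). Returns None if no summary line
    is found."""
    m = re.search(r"(\d+) packets transmitted, (\d+) (?:packets )?received",
                  ping_output)
    return int(m.group(2)) if m else None

summary = "5 packets transmitted, 5 received, 0% packet loss, time 4005ms"
print(ping_received(summary))  # -> 5
```

A received count of zero over several polling intervals would indicate the mesoscale connection to the rack is down.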

Production Aggregate Requirements (E)

Requirement:
E. Because GENI racks are meant to be used by experimenters after a very short integration period, GENI racks must meet current GENI production aggregate requirements, beginning at the time of the rack's deployment. The rack production team may choose to take initial responsibility for the local site aggregate to meet these requirements during initial installation and integration, but must hand off responsibility to the local site within 3 months. See details in the Local Aggregate Owner Requirements section.

Acceptance Criteria:

I.27. If the rack's control network cannot reach the control network at the rack vendor's home site but is working otherwise, an experimenter can create and access an experimental sliver containing a compute resource and a VLAN (assuming the experimenter's slice authority is reachable). (E)
I.28. An experimenter can create an experimental sliver containing a compute resource and a VLAN, and can verify that the sliver is continuously accessible for one week. (E)
VI.01. The rack development team has documented a technical plan for handing off primary rack operations to site operators. (E)

Local Aggregate Requirements (F)

Requirement:

F.1 Sites provide space, power (preferably with backup), and air conditioning, network connection for both layer 2 (Ethernet VLAN) and layer 3 data plane connections to GENI, and publicly routable IP addresses for the rack (separate control and data plane ranges). Addresses should be on an unrestricted network segment outside any site firewalls. All addresses dedicated to the nodes should have registered DNS names and support both forward and reverse lookup. Racks should not be subject to port filtering. Rack teams must document their minimum requirements for all site-provided services (e.g. number of required IP addresses) and provide complete system installation documentation on a public website (e.g. the GENI wiki).

Acceptance Criteria:

I.25. An experimenter can request a publicly routable IP address or public TCP/UDP port mapping for the control interface of any compute resource in their sliver (subject to availability of IPs at the site), and can access their resource at that IP address and/or port from the commodity internet, and send traffic outbound from their resource to the internet. (F.1)
VI.02. A public document contains a parts list for each rack. (F.1)
VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname to IP mappings which should be placed in site DNS, whether the last-hop routers for the public IP subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
VII.01. Using the provided documentation, GPO is able to successfully power and wire their rack, and to configure all needed IP space within a per-rack subdomain of gpolab.bbn.com. (F.1)
VII.02. Site administrators can understand the physical power, console, and network wiring of components inside their rack and document this in their preferred per-site way. (F.1)

Requirement:

F.2 Sites must operate racks according to the most recent version of the GENI Aggregate Provider's Agreement and the GENI Recommended Use Policy (see GpoDoc). Rack teams must implement functions that allow site and GENI operations staff to monitor all rack nodes for intrusion attempts and abnormal behavior to support execution of the GENI Recommended Use Policy.

Acceptance Criteria:

VII.03. Site administrators can understand the expected control and dataplane network behavior of their rack. (F.2)
VII.04. Site administrators can view and investigate current system and network activity on their rack. (F.2)

Requirement:

F.3 "Sites must provide at least two support contacts (preferably via a mailing list) , who can respond to issues reported to GENI or proactively detected by monitoring. These contacts will join the GENI response-team@geni.net mailing list. Rack teams must implement functions that the site contacts can use to assist with incident response (e.g. site administrator accounts and tools). In particular, rack teams must support functions needed to respond to Legal, Law Enforcement and Regulatory issues with the GENI LLR Representative (e.g. ability to identify rack users by email address)."

Acceptance Criteria:

VI.07. A public document explains the requirements that site administrators have to the GENI community, including how to join required mailing lists, how to keep their support contact information up-to-date, how and under what circumstances to work with LLR, how to best contact the rack vendor with operational problems, what information needs to be provided to GMOC to support emergency stop, and how to interact with GMOC when an Emergency Stop request is received. (F.3, C.3.d)

Requirement:

F.4 Sites must inform GENI operations of actions they take on any open GENI issue report via the GMOC ticketing system. Rack teams should not establish parallel reporting requirements, and should not change deployed systems without opening and maintaining GMOC tracking tickets.

Acceptance Criteria:

VI.08. A public document explains how to use the GMOC ticket system to report rack changes or problems, and under what circumstances a GMOC ticket needs to be opened. (F.4)
VII.05. The rack development team and GPO are able to use the GMOC ticket system to communicate with each other, and provide feedback to GMOC about any issues. (F.4)

Requirement:

F.5 Site support staff (and GENI operations) must be able to identify all software versions and view all configurations running on all GENI rack components once they are deployed. The rack users' experimental software running on the rack is exempt from this requirement.

Acceptance Criteria:

VI.09. A public document explains how to identify the software versions and system file configurations running on the rack, and how to get information about recent changes to the rack software and configuration. (F.5)
VI.10. A public document explains how and when software and OS updates can be performed on the rack, including plans for notification and update if important security vulnerabilities in rack software are discovered. (F.5)
VI.13. A public document describes the expected state of all the GENI experimental resources in the rack, including how to determine the state of an experimental resource and what state is expected for an unallocated bare-metal node. (F.5)
VII.06. A site administrator can verify the control software and configurations on the rack at some point in time. (F.5)
VII.07. A site administrator can perform software and OS updates on the rack. (F.5)

Requirement:

F.6 Site support staff (and GENI operations) must be able to view source code for any software covered by the GENI Intellectual Property Agreement that runs on the rack. Rack teams should document the location of such source code in their public site documentation (e.g. on the GENI wiki).

Acceptance Criteria:

VI.11. A public document describes the GENI software running on the rack, and explains how to get access to the source code of each piece of GENI software. (F.6)
VII.08. A site administrator can get access to source code for the version of each piece of GENI code installed on their site rack at some point in time. (F.6)

Requirement:

F.7 "Sites may set policy for use of the rack resources, and must be able to manage the OpenFlow and compute resources in the rack to implement that policy. The rack teams must identify each site's written GENI policy before installation, and implement functions that allow site support contacts to manage resources according to the site policies. Site policies must be documented on the site aggregate information page on the GENI wiki."

Acceptance Criteria:

VI.12. A public document describes all the GENI experimental resources within the rack, and explains what policy options exist for each, including: how to configure rack nodes as bare-metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, whether it is possible to trust local users within the rack. (F.7)

Requirement:

F.8 "Rack teams must implement functions that verify proper installation and operation of the rack resources with periodic network and compute resource health checks. These functions must succeed for at least 24 hours before a site takes over responsibility for a GENI rack aggregate."

Acceptance Criteria:

VII.15. Some rack internal health checks are run on a regular basis, and a site administrator can get emailed alerts when these checks fail. (F.8)
VII.16. A public document explains how to perform comprehensive health checks for a rack (or, if those health checks are being run automatically, how to view the current/recent results). (F.8)

Experimenter Requirements (G)

Requirement:

G.1 Experimenters must have root/admin capability on VMs.

Acceptance Criteria:

I.21. An experimenter can reserve a VM on the rack, and can run any command with root privileges within the VM OS, including loading a kernel module in any VM OS with a modular kernel. (G.1)

Requirement:

G.2 Experimenters must be able to provision multiple (physical or virtual) data plane interfaces per experiment.

Acceptance Criteria:

I.22. An experimenter can create and access an experiment which configures a bare-metal compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)
I.23. An experimenter can create and access an experiment which configures a VM compute resource to have at least two logical dataplane interfaces with distinct MAC addresses. (G.2)

Requirement:

G.3 Experimenters must have direct layer 2 network access (receive and transmit), via Linux SOCK_RAW socket type or similar capability.

Acceptance Criteria:

I.24. An experimenter can create and access an experiment containing a compute resource with a dataplane interface, and can construct and send a non-IP ethernet packet over the dataplane interface. (G.3)
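Criterion I.24's non-IP Ethernet packet can be built by hand. The sketch below constructs a frame using made-up MAC addresses and the IEEE local-experimental EtherType 0x88B5; actually transmitting it via a Linux SOCK_RAW/AF_PACKET socket requires root and a real dataplane interface, so the send step is shown only in comments.

```python
import struct

# Hypothetical addresses for illustration only.
DST = bytes.fromhex("020000aabb02")
SRC = bytes.fromhex("020000aabb01")
ETHERTYPE = 0x88B5  # IEEE 802 local experimental EtherType (non-IP)

def build_frame(dst, src, ethertype, payload):
    """Construct a raw Ethernet II frame: 6B dst + 6B src + 2B type."""
    return dst + src + struct.pack("!H", ethertype) + payload

frame = build_frame(DST, SRC, ETHERTYPE, b"hello geni")

# Sending needs root and an actual interface, e.g. (Linux):
#   import socket
#   s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
#   s.bind(("eth1", 0))   # the sliver's dataplane interface
#   s.send(frame)
```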

Requirement:

G.4 Rack must provide a mechanism to virtualize data plane network interfaces so that multiple experiments running on multiple VMs may simultaneously and independently use a single physical interface (e.g. by providing separate IP and MAC addresses for each VM). [Example use case: An experimenter wishes to test her non-IP protocol across thirty network nodes (implemented as five VMs in each of six GENI racks at six different physical sites). The network connections among the sites are virtualized into five separate networks (called VLANs here, but some other virtualization approach is OK), VLAN1, …, VLAN5. Some VMs represent end hosts in the network. They will open connections to only one virtualized interface on a particular network, e.g. eth0.VLAN2. Other VMs represent routers and open multiple virtualized interfaces on multiple networks, e.g. eth0.VLAN1, eth1.VLAN3, eth2.VLAN5. Packets transmitted on a particular VLANx are visible on all other virtual interfaces on VLANx, but they are not visible on virtual interfaces on different VLANs, nor are they visible to interfaces in any other experiment.]

Acceptance Criteria:

I.18. If multiple VMs use the same physical interface to carry dataplane traffic, each VM has a distinct MAC address for that interface. (C.2.d, G.4)
II.01. If multiple VMs use the same physical interface to carry dataplane traffic, traffic between the VM dataplane interfaces can be OpenFlow-controlled. (C.2.d) (G.4)

Acceptance Criteria Not Verified

This section lists acceptance criteria that are not verified as part of the System Acceptance Test effort. The following acceptance criteria are not part of, or mapped to, any test case at this time:

  • I.19. An experimenter can request an unbound exclusive VLAN from a local aggregate manager in the rack, and dataplane traffic can be sent on that VLAN and seen on the campus's primary dataplane switch. (C.2.e) [Verified in InstaGENI test plan, not in ExoGENI test plan]
  • II.02. An experimenter can create a sliver including an unbound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
  • II.06. An experimenter can run a controller (for any type of OpenFlow sliver) on a compute resource which they're requesting at the same time as the OpenFlow sliver, by specifying in the request which compute resource they want to use. (The AM can define how unbound this can be, e.g. "one of the VMs", or "the VM with this client_id", or whatever.) (C.2.f)
  • I.08. Experimenters can create and access slivers within the rack containing at least 100 distinct dataplane VLANs (C.2.a)
  • I.09. Experimenters can create and access slivers on two racks which simultaneously use all available unbound exclusive VLANs which can connect those racks (C.2.a)
  • I.13. Two experimenters cannot create and access slivers containing a compute resource in the rack with a dataplane interface connected to the same bound exclusive VLAN. (C.2.b, C.2.e)
  • I.14. An experimenter can create and access a sliver containing a compute resource in the rack with dataplane interfaces connected to multiple bound VLANs. (C.2.b)
  • I.16. An experimenter can create and access a sliver containing multiple compute resources in the rack with dataplane interfaces on each connected to multiple bound VLANs. (C.2.b)
  • I.25. An experimenter can request a publicly routable IP address or public TCP/UDP port mapping for the control interface of any compute resource in their sliver (subject to availability of IPs at the site), and can access their resource at that IP address and/or port from the commodity internet, and send traffic outbound from their resource to the internet. (F.1) [Verified in InstaGENI test plan, not in ExoGENI test plan]
  • I.26. An experimenter can request a sliver which creates a layer 2 path, between two dataplane interfaces on the rack switch which connect to non-rack resources (e.g. a bound or unbound VLAN between the core and a campus compute resource), without requesting any rack compute resources. (use case 4)
  • I.27. If the rack's control network cannot reach the control network at the rack vendor's home site but is working otherwise, an experimenter can create and access an experimental sliver containing a compute resource and a VLAN (assuming the experimenter's slice authority is reachable). (E)
  • I.28. An experimenter can create an experimental sliver containing a compute resource and a VLAN, and can verify that the sliver is continuously accessible for one week. (E)
  • II.03. An experimenter can create a sliver including a bound exclusive VLAN whose traffic is OpenFlow-controlled by a controller that they run. (C.2.f)
  • IV.03. When changes to the Emergency Stop procedure are approved at a GEC, rack vendors implement any needed AM software and site administrator documentation modifications which are needed to support these changes within 3 months.
  • IV.04. When all aggregates and services are running on the primary rack server, the host's performance is good enough that OpenFlow monitoring does not lose data (due to an overloaded FOAM or FlowVisor) and does not report visible dataplane problems (due to an overloaded FlowVisor). (open questions ExoGENI B-6, InstaGENI B.4)
  • VI.01. The rack development team has documented a technical plan for handing off primary rack operations to site operators. (E)
  • VI.02. A public document contains a parts list for each rack. (F.1)
  • VI.03. A public document states the detailed power requirements of the rack, including how many PDUs are shipped with the rack, how many of the PDUs are required to power the minimal set of shipped equipment, the part numbers of the PDUs, and the NEMA input connector type needed by each PDU. (F.1)
  • VI.04. A public document states the physical network connectivity requirements between the rack and the site network, including number, allowable bandwidth range, and allowed type of physical connectors, for each of the control and dataplane networks. (F.1)
  • VI.05. A public document states the minimal public IP requirements for the rack, including: number of distinct IP ranges and size of each range, hostname-to-IP mappings which should be placed in site DNS, whether the last-hop routers for the public IP subnets sit within the rack or elsewhere on the site, and what firewall configuration is desired for the control network. (F.1)
  • VI.06. A public document states the dataplane network requirements and procedures for a rack, including necessary core backbone connectivity and documentation, any switch configuration options needed for compatibility with the L2 core, and the procedure for connecting non-rack-controlled VLANs and resources to the rack dataplane. (F.1)
  • VI.12. A public document describes all the GENI experimental resources within the rack, and explains what policy options exist for each, including: how to configure rack nodes as bare-metal vs. VM server, what options exist for configuring automated approval of compute and network resource requests and how to set them, how to configure rack aggregates to trust additional GENI slice authorities, whether it is possible to trust local users within the rack. (F.7)
  • V.07. If site operator privileges are defined, each type of operator access is documented, and the documented restrictions are testable. (C.3.a)
  • V.08. If experimenters have access to any rack infrastructure, the access is restricted to protect the infrastructure. The restrictions are documented, and the documented restrictions are testable. (C.3.a)
  • VI.08. A public document explains how to use the GMOC ticket system to report rack changes or problems, and under what circumstances a GMOC ticket needs to be opened. (F.4)
  • VII.05. The rack development team and GPO are able to use the GMOC ticket system to communicate with each other, and provide feedback to GMOC about any issues. (F.4)
  • VII.15. Some rack internal health checks are run on a regular basis, and a site administrator can get emailed alerts when these checks fail. (F.8)
  • VII.17. A site administrator can get information about the power utilization of rack PDUs. (D.6.c)

Glossary/Definitions

The following is a glossary of terminology used in this plan; for additional terminology definitions, see the GENI Glossary page.

  • Local Broker - An ORCA broker that provides the coordinating function needed to create slices. The rack's ORCA AM delegates a portion of the local resources to one or more brokers. Each rack has an ORCA AM that delegates resources to a local broker (for coordinating intra-rack resource allocations of compute resources and VLANs) and to the global broker.

  • ORCA Actors - ExoGENI Site Authorities and Brokers that can communicate with each other. An actor requires ExoGENI Operations staff approval in order to start communicating with other actors.

  • ORCA Actor Registry - A secure service that allows distributed ExoGENI ORCA Actors to recognize each other and create security associations so that they can communicate. The registry runs at the Actor Registry web page, which lists all active ORCA Actors.
  • ORCA Aggregate Manager (AM) - An ORCA resource provider that handles requests for resources via the ORCA SM and coordinates brokers to delegate resources. The ORCA Aggregate Manager is not the same as the GENI Aggregate Manager.
  • Site or Rack Service Manager (SM) - Exposes the ExoGENI rack's GENI AM API interface and the native XML-RPC interface to handle experimenter resource requests. The Site SM receives tickets from brokers and redeems the tickets with the ORCA AM. All acceptance tests defined in this plan interact with a Service Manager (Site SM or the global ExoSM) using the GENI AM API interface via the omni tool.
  • ExoSM - A global ExoGENI Service Manager that provides access to resources from multiple ExoGENI racks and intermediate network providers. The ExoSM supports GENI AM API interactions.
  • ORCA RSpec/NDL conversion service - A service running at RENCI that is used by all ORCA SMs to convert RSpec requests to NDL and NDL manifests to RSpec.
  • People:
    • Experimenter: A person accessing the rack using a GENI credential and the GENI AM API.
    • Administrator: A person who has fully-privileged access to, and responsibility for, the rack infrastructure (servers, network devices, etc) at a given location.
    • Operator: A person who has unprivileged/partially-privileged access to the rack infrastructure at a given location, and has responsibility for one or a few particular functions.
  • Baseline Monitoring: A set of monitoring functions that shows aggregate health for VMs and switches, including interface status and traffic counts for interfaces and VLANs. Includes resource availability and utilization.
  • Experimental compute resources:
    • VM: An experimental compute resource which is a virtual machine located on a physical machine in the rack.
    • Bare-metal Node: An experimental exclusive compute resource which is a physical machine usable by experimenters without virtualization.
    • Compute Resource: Either a VM or a bare-metal node.
  • Experimental compute resource components:
    • logical interface: A network interface seen by a compute resource (e.g. a distinct listing in ifconfig output). May be provided by a physical interface, or by virtualization of an interface.
  • Experimental network resources:
    • VLAN: A dataplane VLAN, which may or may not be OpenFlow-controlled.
    • Bound VLAN: A VLAN which an experimenter requests by specifying the desired VLAN ID. (If the aggregate is unable to provide access to that numbered VLAN or to another VLAN which is bridged to the numbered VLAN, the experimenter's request will fail.)
    • Unbound VLAN: A VLAN which an experimenter requests without specifying a VLAN ID. (The aggregate may provide any available VLAN to the experimenter.)
    • Exclusive VLAN: A VLAN which is provided for the exclusive use of one experimenter.
    • Shared VLAN: A VLAN which is shared among multiple experimenters.
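
The delegate/ticket/redeem workflow described in the Local Broker and Service Manager entries above can be sketched as a toy model (all class and method names are hypothetical; real ORCA is far richer): the rack's ORCA AM delegates capacity to a broker, the broker issues tickets against its delegated pool, and the SM redeems those tickets with the AM for actual resources.

```python
# Hypothetical sketch of the ORCA delegate -> ticket -> redeem flow.

class OrcaAM:
    """Rack resource provider: delegates capacity to brokers and redeems tickets."""
    def __init__(self, capacity):
        self.capacity = capacity   # undelegated units remaining
        self.allocated = 0         # units turned into slivers

    def delegate(self, broker, units):
        assert units <= self.capacity
        self.capacity -= units
        broker.pool += units

    def redeem(self, ticket):
        # Honor a ticket previously issued against delegated capacity.
        self.allocated += ticket["units"]
        return {"sliver": ticket["units"]}

class Broker:
    """Coordinates allocations against capacity delegated by one or more AMs."""
    def __init__(self):
        self.pool = 0

    def issue_ticket(self, units):
        if units > self.pool:
            raise ValueError("insufficient delegated capacity")
        self.pool -= units
        return {"units": units}

am = OrcaAM(capacity=10)
local_broker = Broker()
am.delegate(local_broker, 6)           # AM delegates part of the rack to the broker
ticket = local_broker.issue_ticket(2)  # SM obtains a ticket from the broker
sliver = am.redeem(ticket)             # SM redeems the ticket with the AM
print(sliver)                          # {'sliver': 2}
```

The point of the split is that the broker can arbitrate requests across the capacity it holds without the AM being involved until redemption time.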

We make the following assumptions about experimental network resources:

  • Unbound VLANs are always exclusive.
  • Bound VLANs may be either exclusive or shared, and this is determined on a per-VLAN basis and configured by operators.
  • Shared VLANs are always OpenFlow-controlled, with OpenFlow providing the slicing between experimenters who have access to the VLAN.
  • If a VLAN provides an end-to-end path between multiple aggregates or organizations, it is considered shared if it is shared anywhere along its L2 path, even if only one experimenter can access the VLAN at some particular aggregate or organization.
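
The last rule can be stated compactly: a multi-segment VLAN path is shared if any segment along it is shared. A minimal sketch (the segment representation is hypothetical):

```python
# A VLAN path is "shared" if it is shared anywhere along its L2 path.
def vlan_is_shared(segments):
    # segments: per-aggregate/organization sharing flags along the path
    return any(seg["shared"] for seg in segments)

path = [
    {"site": "rack-A", "shared": False},  # exclusive at this aggregate
    {"site": "core",   "shared": True},   # shared in the core
    {"site": "rack-B", "shared": False},
]
print(vlan_is_shared(path))  # True: sharing in the core makes the whole VLAN shared
```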


Email help@geni.net for GENI support.