Version 21 (modified by Josh Smift, 13 years ago)

Plastic Slices Baseline Plan

The GENI Plastic Slices baseline evaluations capture progress throughout the Plastic Slices project. Each planned baseline will capture information in the following major areas:

  • Environment: System configuration, software versions, and resources for compute, OF, VLAN, and Monitoring.
  • Evaluation details: Detailed description of the Topology, Traffic Profiles and tools used in the evaluation.
  • Evaluation criteria: Detailed description of the Criteria used to determine success, along with any assumptions made in defining success.
  • Baseline Evaluation Report (BER): Report capturing all results and analysis for each Baseline and an overall cumulative final version.

In addition to the above areas, the GPO will actively conduct project-level coordination, including requirements tracking, experiment design, and GMOC monitoring data reviews. The GPO will also provide campuses with support for OpenFlow resources, MyPLC resources, and campus GMOC monitoring data feeds. Additionally, the GPO will provide and support a GENI AM API compliant ProtoGENI aggregate with four hosts.

Baseline Areas

This section captures the implementation details for all major areas in each of the baselines planned.

Environment

Capturing the definition of the environment used to validate the baseline scenarios is a crucial part of this activity. A complete definition will be captured so that the baseline scenarios can be repeated in the Quality Assurance step to be performed by Georgia Tech. Environment captures will detail:

  • Configuration for all compute resources:
    • Prerequisite environment changes (e.g. ports to be opened in the firewall).
    • software versions
    • configuration settings (ports)
    • OS versions, hardware platforms
  • Configuration for all OF and VLAN resources:
    • Firmware versions
    • Hardware platform, device models.
    • FlowVisor and Expedient Controllers (OF)
  • Configuration for all Management and Monitoring resources
    • MyPLC (OS, SW version, Settings)
    • GMOC API version, Data monitoring Host
    • GMOC STOP procedure definition.
    • NCSA security plan versions

Evaluation details

  • Definition of each entity in test and of its role.
  • Detailed topology of each test.
  • Tools used for the evaluation (traffic generation, site monitoring, test scripts?)
  • Traffic Profiles (iperf?, traffic type, packet rates, packet sizes, duration, etc.)
  • Result gathering (collect logs (OF, Expedient, MyPLC), collect monitoring statistics, collect traffic generation results, other?)

Evaluation criteria

Each baseline run will be evaluated to determine success. The following criteria will be considered:

  • Traffic sent was successfully received.
  • All nodes are up and available throughout the baseline time frame.
  • Logs and statistics are in line with the expected results, and no FATAL or CRITICAL failures are found. Lower-priority issues, such as WARNINGS, are acceptable as long as they can be shown not to impact the ability to communicate between endpoints.
  • Runtime assumptions can be made as long as reviewed and deemed reasonable. For example, assuming that ?? (need relevant example).
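
As an illustration only, the first three criteria could be checked mechanically along the following lines. This is a minimal sketch, not the project's actual tooling: the log line format, the function names, and the rule that received bytes must equal sent bytes are all assumptions made for the example.

```python
import re
from collections import Counter

# Severity names follow the criteria above; the log line format is an assumption.
BLOCKING = {"FATAL", "CRITICAL"}
SEVERITY_RE = re.compile(r"\b(FATAL|CRITICAL|ERROR|WARNING|INFO|DEBUG)\b")


def summarize_log(lines):
    """Count the first severity keyword found on each log line."""
    counts = Counter()
    for line in lines:
        match = SEVERITY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def run_passes(lines, bytes_sent, bytes_received, nodes_up):
    """Apply the traffic, availability, and log criteria to one baseline run.

    nodes_up maps a node name to True if the node stayed available for the
    whole run. Requiring bytes_received == bytes_sent is an assumption; a
    real check might tolerate some loss for UDP traffic.
    """
    counts = summarize_log(lines)
    no_blocking = not any(counts[level] for level in BLOCKING)
    traffic_ok = bytes_sent > 0 and bytes_received == bytes_sent
    return traffic_ok and all(nodes_up.values()) and no_blocking
```

Warnings are counted but do not fail the run, which mirrors the criterion that lower-priority issues are acceptable when they do not affect communication between endpoints; deciding whether a given warning had such an impact would still need a manual review.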

Baseline Evaluation Report

As each baseline is completed, a Baseline Evaluation Report is generated that captures the results of the baseline, an analysis of the results, and any impact that the findings may have on the current Plastic Slices Plan.

Baseline Evaluation Reports will also capture progress for requirements being tracked, as well as progress towards the overall goals. A final version of the BER is generated to capture a cumulative representation of all Plastic Slices baseline assessments.

Question: Does it make sense to report by major areas (e.g. OF, compute resources, monitoring)?

Baseline Detailed Plans

This section provides a detailed description of each baseline and its respective Evaluation details (topology, tools, traffic profiles, etc.), Evaluation Criteria, and reporting.

Baseline 1

Due 2011-05-16
Completed 2011-05-19
Summary Descr Ten slices, each moving at least 1 GB of data per day, for 24 hours.

Detailed Description:

Cause the experiment running in each slice to move at least 1 GB of data over the course of a 24-hour period. Multiple slices should be moving data simultaneously, but it can be slow, or bursty, as long as it reaches 1 GB total over the course of the day.
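
Because the traffic may be slow or bursty, the requirement is about the daily total rather than the rate at any moment. The sketch below shows one way to check that, assuming transfer records are available as (timestamp, byte count) pairs and that 1 GB means 10^9 bytes; both the record format and the function names are hypothetical.

```python
GB = 10**9  # decimal gigabyte; an assumption, the plan does not say which unit is meant
SECONDS_PER_DAY = 24 * 3600


def daily_totals(transfers, start, days):
    """Sum bytes per 24-hour window.

    transfers is an iterable of (epoch_seconds, byte_count) pairs, and start
    is the epoch time at which the baseline began.
    """
    totals = [0] * days
    for when, nbytes in transfers:
        index = int((when - start) // SECONDS_PER_DAY)
        if 0 <= index < days:
            totals[index] += nbytes
    return totals


def meets_daily_minimum(transfers, start, days, minimum=GB):
    """Return True if every 24-hour window moved at least `minimum` bytes."""
    return all(total >= minimum for total in daily_totals(transfers, start, days))
```

For a 24-hour baseline, `days` would be 1; a multi-day run would pass a larger value so that each day is checked separately.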

The purpose of this baseline is to confirm basic functionality of the experiments, and stability of the aggregates.

Summary of overall results:

  • All slices had at least one client/server pair complete successfully, with results consistent with what we'd expect.
  • Most slices had all client/server pairs complete successfully.

For more details, see the Baseline 1 Details page. For an overview of slices, participants and experiments, see the Baseline 1 Slice Status.

Baseline 2

Due 2011-05-23
Completed
Summary Descr Ten slices, each moving at least 1 GB of data per day, for 72 hours.

Detailed Description:

Similar to the previous baseline, cause the experiment running in each slice to move at least 1 GB of data per day, but do so repeatedly for 72 hours.

The purpose of this baseline is to confirm longer-term stability of the aggregates.

Summary of overall results:

  • The results from most slices were as expected, with just a few oddities here and there.
  • We captured detailed logs of the entire session on each plnode in each experiment.

For more details, see the Baseline 2 Details page. For an overview of slices, participants and experiments, see the Baseline 2 Slice Status.

Baseline 3

Due 2011-05-31
Completed
Summary Descr Ten slices, each moving at least 1 GB of data per day, for 144 hours.

Detailed Description:

Similar to the previous baseline, cause the experiment running in each slice to move at least 1 GB of data per day, but do so repeatedly for 144 hours. The purpose of this baseline is to confirm even longer-term stability of the aggregates.

Summary of overall results:

  • This baseline included multiple rounds of experiments over the six days.
  • The first round was very rocky due to network instability.
  • We weren't able to run a round of experiments on the second day, due to network instability.
  • The remaining rounds ran very smoothly, generally producing results as expected.

For more details, see the Baseline 3 Details page. For an overview of slices, participants and experiments, see the Baseline 3 Slice Status.

Baseline 4

Due 2011-06-03
Completed
Summary Descr Ten slices, each moving at least 1 Mb/s continuously, for 24 hours.

Detailed Description:

Cause the experiment running in each slice to move at least 1 Mb/s continuously over the course of a 24-hour period (approximately 10 GB total).
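
The approximate total follows from the rate when it is read in megabits per second (Mb/s), as in the summary line. A small worked calculation, assuming decimal units (1 GB = 8000 Mb); the function name is illustrative only:

```python
def total_gigabytes(rate_mbps, hours):
    """Data moved at a constant rate, in decimal gigabytes (1 GB = 8000 Mb)."""
    seconds = hours * 3600
    megabits = rate_mbps * seconds
    return megabits / 8 / 1000


# 1 Mb/s for 24 hours: 86,400 Mb, or about 10.8 GB.
one_day_at_1_mbps = total_gigabytes(1, 24)
```

The same function gives about 108 GB for 10 Mb/s over 24 hours and about 648 GB for 10 Mb/s over 144 hours.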

The purpose of this baseline is to confirm that an experiment can send data continuously without interruption.

Summary of overall results:

  • We switched to a different logging method for this baseline, which had some advantages but proved to be too problematic in other ways.
  • We lost connections to some systems during the longer run.
  • We wrapped up after 21 hours, but gathered a good amount of data in that time.

For more details, see the Baseline 4 Details page. For an overview of slices, participants and experiments, see the Baseline 4 Slice Status.

Baseline 5

Due 2011-06-07
Completed
Summary Descr Ten slices, each moving at least 10 Mb/s continuously, for 24 hours.

Detailed Description:

Similar to the previous baseline, cause the experiment running in each slice to move at least 10 Mb/s continuously over the course of a 24-hour period (approximately 100 GB total).

The purpose of this baseline is to confirm that an experiment can send a higher volume of data continuously without interruption.

Summary of overall results:

  • We switched back to a variation of an earlier logging method, which worked better.
  • We switched to running 'screen' on each MyPLC plnode, to deal with the problem of losing connections to the plnodes.
  • We accidentally failed to actually send 10 Mb/s on the ping and UDP slices.
  • Other things went smoothly, providing a good basis for Baseline 6.

For more details, see the Baseline 5 Details page. For an overview of slices, participants and experiments, see the Baseline 5 Slice Status.

Baseline 6

Due 2011-06-13
Completed
Summary Descr Ten slices, each moving at least 10 Mb/s continuously, for 144 hours.

Detailed Description:

Similar to the previous baseline, cause the experiment running in each slice to move at least 10 Mb/s continuously over the course of a 144-hour period.

The purpose of this baseline is to confirm that an experiment can send a higher volume of data continuously without interruption, for several days running.

Summary of overall results:

  • We continued to use the same logging method, which didn't work as well, perhaps because of the huge size of the log files.
  • All but one of the slices were able to send 10 Mb/s when things were going smoothly.
  • A variety of outages occurred during this baseline, so things often weren't going smoothly.
  • Despite these challenges, many slices successfully transferred data continuously for most of the duration of the baseline.

For more details, see the Baseline 6 Details page. For an overview of slices, participants and experiments, see the Baseline 6 Slice Status.

Baseline 7

Due 2011-06-20
Completed
Summary Descr Perform an Emergency Stop test while running ten slices, each moving at least 10 Mb/s continuously, for 144 hours.

Detailed Description:

Repeat the previous baseline, but call an Emergency Stop while it's running, once per slice for each of the ten slices. Campuses will not be informed in advance about when each Emergency Stop will be called. There will be at least one instance of two simultaneous Emergency Stops, and at least one instance of a single campus being asked to respond to two simultaneous Emergency Stops. After each Emergency Stop, verify that all resources have been successfully restored to service.

GMOC will define precisely how each Emergency Stop test will be conducted, and what resources will be stopped, presumably selecting a combination of campus resources (e.g. disconnecting an on-campus network connection) and backbone resources (e.g. disabling a campus's connection to an inter-campus VLAN).

The purpose of this baseline is to test Emergency Stop procedures.

Summary of overall results:

  • We conducted two Emergency Stop tests, both involving resources at BBN.
  • We accidentally left the experiments running after the Emergency Stop tests completed, which led to some odd results in the raw logs.
  • We postponed the tests involving other sites, to incorporate lessons learned during these tests.

For more details, see the Baseline 7 Details page. For an overview of slices, participants and experiments, see the Baseline 7 Slice Status.

Baseline 8

Due 2011-06-20
Completed
Summary Descr Create one slice per second for 1000 seconds; then create and delete one slice per second for 24 hours.

Detailed Description:

This baseline creates new temporary slices, rather than using the existing ten slices, creating a thousand slices at the rate of one slice per second, and then continuing to delete and create a new slice every second, for 24 hours. Each slice will include resources at three campuses, selected randomly for each slice. Automated tools will confirm that the resources are available, e.g. by logging in to a host and running 'uname -a'.
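
The create-then-churn pattern described above can be sketched as a schedule of events. This is an illustration under stated assumptions, not the tooling used for the baseline: the slice naming scheme, the campus names in the usage note, and the choice to delete the oldest live slice first are all hypothetical, and the code only generates the schedule rather than calling any slice authority or aggregate.

```python
import random
from collections import deque


def churn_schedule(campuses, ramp_up=1000, churn_seconds=24 * 3600, seed=None):
    """Yield (second, action, slice_name, sites) tuples.

    During the first ramp_up seconds, one slice is created per second. After
    that, each second deletes the oldest live slice (an assumption; the plan
    does not say which slice is deleted) and creates a new one.
    """
    rng = random.Random(seed)
    live = deque()
    for second in range(ramp_up + churn_seconds):
        if second >= ramp_up:
            old_name, old_sites = live.popleft()
            yield second, "delete", old_name, old_sites
        name = f"tmpslice{second:06d}"  # hypothetical naming scheme
        sites = tuple(rng.sample(campuses, 3))  # three distinct campuses per slice
        live.append((name, sites))
        yield second, "create", name, sites
```

A driver could iterate over `churn_schedule(["campus-a", "campus-b", "campus-c", "campus-d"])` (placeholder campus names), pace itself to one step per second, and run the availability check, such as logging in and running 'uname -a', after each "create" event.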

The purpose of this baseline is to confirm that many users can create slices and allocate resources at the same time.

Summary of overall results:

  • We began by simply creating 10, 100, and 1000 slices in the pgeni.gpolab.bbn.com Slice Authority, without any resources (i.e. without creating any slivers).
  • The first test worked fine, but 100 slices (at one per second) started to overwhelm the resources of the system providing the SA service, and 1000 slices completely overwhelmed it.
  • We postponed the portions of this baseline that involved creating slivers.

For more details, see the Baseline 8 Details page.