
Version 4 (modified by hmussman@bbn.com, 12 years ago)

--

4.3) I&M Use Cases for Infrastructure Measurement, and Support for Operators

1) Goals

From Sec. 2 of the GENI I&M Architecture document:

In addition, the GENI operations staff require extensive and reliable instrumentation and measurement capabilities to monitor and troubleshoot the GENI suite and its constituent entities. Some of this data will be made available to experimenters, to help them conduct useful and repeatable experiments.

The GMOC, providing GENI-wide operator services, needs to monitor essentially all GENI infrastructure on a 24x7 basis. In this case, the GMOC Operator will gather, analyze and present MD that monitors hundreds of infrastructure elements.

2) Tasks

Provide a concise but complete definition of I&M Use Cases for Infrastructure Measurement

Identify the support that should be available to operators

Update the GENI I&M Architecture document:

Sec. 3.3. I&M Use cases for Central Operators (i.e., GMOC)
Sec. 3.4. I&M Use cases for Aggregate Providers and Operators
Sec. 4.2.2 Typical Arrangements of I&M Services: For Operator Gathering MD from GENI Infrastructure
Sec. 4.2.3 Typical Arrangements of I&M Services: For Experimenters Gathering MD from their Slice and from GENI Infrastructure
Sec. 4.3.3 Type 3 I&M Service: Common Service with MD for Multiple Slices

Use as guidance in the design of GENI I&M tools, particularly for the GEMINI and GIMI projects

3) Team

LEAD Martin Swany (Indiana U)
Guilherme Fernandes (?)
Eric Boyd (Internet2)
Jason Zurawski (Internet2)
Prasad Calyam (Ohio Supercomputer Center)
Chris Small, for NetKarma (Indiana U)
Ilia Baldine, for ExoGENI racks (RENCI)
Jonathan Mills (RENCI)
?, for InstaGENI racks (HP)
?, for GMOC
Sarah Edwards (GPO)
Chaos Golubitsky (GPO)
Harry Mussman (GPO)

4) Meetings

(organized calls or meetings before GEC13?)

Review conclusions in pre-meeting at GEC13

Review with working team at GEC13

Review with operators, monitoring team at GEC13

5) Issues

1) Identify which passive monitoring options are to be supported
2) Identify which event monitoring options are to be supported
3) Identify which active measurement options are to be supported

6) Definition

Definition of infrastructure monitoring:

1) Passive monitoring of clusters/racks, including transport switches, etc.
2) Event monitoring, which provides log entries
3) Active measurements of IP networks, and of Layer 2 and OpenFlow paths

4.2.7 Passive Monitoring Options

1) Aggregate operator establishes a Measurement Point (MP) to gather Measurement Data (MD) via SNMP, organizes it into time-series data, and formulates a Measurement Data Object Descriptor (MDOD)

1b) Directly from cluster/switch/etc.
1c) Via Ganglia
1d) Via Nagios
1e) Via Cacti

2) MD sinks:

2b) Local aggregate operator
2c) GMOC (when authorized)
2d) Experimenter (when authorized)

3) MD format and interface:

3b) Time-series data, presented at perfSONAR MA, MDOD registered at global UNIS, can be pulled by authorized user, and presented using perfSONAR service
3c) Time-series data, pushed using OML protocol, to OML server, and presented using GIMI service
3d) Time-series data, pushed using GMOC protocol, to GMOC server, and presented using ? service
3e) Time-series data, published to XML messaging service, can be subscribed by authorized user, and presented using ? service
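
As an illustration of option 1 above, the following minimal sketch shows how raw SNMP interface counters might be organized into time-series data, with a small descriptor alongside. It assumes the MP polls a 32-bit octet counter (such as ifInOctets) at intervals; the `Sample` type, the function names, and the descriptor fields are hypothetical and do not reflect the actual MDOD schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

COUNTER32_MODULUS = 2 ** 32  # SNMP Counter32 values wrap at 2^32


@dataclass
class Sample:
    timestamp: float  # seconds since epoch when the counter was polled
    octets: int       # raw counter value, e.g. read from ifInOctets


def counter_to_rates(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Convert successive counter readings into (timestamp, bits/s) points."""
    series = []
    for prev, curr in zip(samples, samples[1:]):
        elapsed = curr.timestamp - prev.timestamp
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order polls
        # The modulo handles a single counter wrap between two polls.
        delta = (curr.octets - prev.octets) % COUNTER32_MODULUS
        series.append((curr.timestamp, delta * 8 / elapsed))
    return series


def describe(series: List[Tuple[float, float]], source: str, metric: str) -> dict:
    """Build a hypothetical MDOD-like descriptor; field names are illustrative."""
    return {
        "source": source,
        "metric": metric,
        "units": "bits/s",
        "points": len(series),
        "start": series[0][0] if series else None,
        "end": series[-1][0] if series else None,
    }
```

The same series could then be handed to whichever interface in item 3 is chosen; the sketch deliberately stops before any perfSONAR, OML, or GMOC specific call.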

4.2.8 Event Monitoring Options

1) Aggregate operator establishes MP to issue Event Records (ERs)

2) ER sinks:

2b) Local aggregate operator
2c) GMOC (when authorized)
2d) Clearinghouse (when authorized)
2e) Experimenter (when authorized)

3) ER format and interface:

3b) Follows XML format defined by NetKarma, adapted from MDOD
3c) Published to XML messaging service, can be subscribed by authorized user, logged using ? service, presented using ? service
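
To make the idea of an XML Event Record concrete, here is a minimal sketch that serializes and parses one record using the Python standard library. The element names (`eventRecord`, `source`, `severity`, `timestamp`, `message`) are placeholders chosen for illustration; they are not the NetKarma format, which this page does not spell out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_event_record(source: str, severity: str, message: str,
                       when: datetime) -> str:
    """Serialize one event as XML. Tag names are hypothetical placeholders."""
    record = ET.Element("eventRecord")
    ET.SubElement(record, "source").text = source
    ET.SubElement(record, "severity").text = severity
    ET.SubElement(record, "timestamp").text = when.isoformat()
    ET.SubElement(record, "message").text = message
    return ET.tostring(record, encoding="unicode")


def parse_event_record(xml_text: str) -> dict:
    """Read a record back into a flat dict keyed by tag name."""
    record = ET.fromstring(xml_text)
    return {child.tag: child.text for child in record}


# Example: an aggregate's MP reports a link event.
example = build_event_record(
    "rack-1", "warning", "link down",
    datetime(2012, 3, 1, 12, 0, tzinfo=timezone.utc),
)
```

A subscriber on the messaging service would receive the serialized string and could call `parse_event_record` before logging or presenting it.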

4.2.9 Active Measurement Options

1) Owner:

1b) Aggregate operator
1c) GMOC
1d) Experimenter

2) Owner establishes slice, includes active measurements, and formulates MDOD

2b) Persistent
2c) On-demand

3) MD sinks:

3b) Owner
3c) Aggregate operator (when authorized)
3d) GMOC (when authorized)
3e) Experimenter (when authorized)

4) MD format and interface:

4b) Time-series data, presented at perfSONAR MA, MDOD registered at global UNIS, can be pulled by authorized user, and presented using perfSONAR service
4c) Time-series data, pushed using OML protocol, to OML server, and presented using GIMI service

5) Active measurements:

5b) For IP networks, e.g., ping and iperf
5c) Specialized for L2 networks
5d) Specialized for OF networks
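
For the IP case, one run of a ping-style active measurement can be reduced to a single time-series point before it is pushed or archived. The sketch below assumes the probe tool has already produced a list of round-trip times in milliseconds, with `None` marking a lost probe; the function name and the output keys are hypothetical.

```python
from statistics import mean
from typing import List, Optional


def summarize_rtts(rtts_ms: List[Optional[float]]) -> dict:
    """Summarize one probe run; None marks a probe that got no reply."""
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    summary = {
        "sent": sent,
        "received": len(replies),
        # Loss is undefined when nothing was sent.
        "loss_fraction": (sent - len(replies)) / sent if sent else None,
    }
    if replies:
        summary.update(
            min_ms=min(replies),
            avg_ms=mean(replies),
            max_ms=max(replies),
        )
    return summary
```

For example, `summarize_rtts([10.0, 20.0, None, 30.0])` reports four probes sent, three received, and a loss fraction of 0.25.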

4.2.10 Active Measurement Process

Baseline infrastructure measurement process:

1) Set up persistent or on-demand infrastructure measurement slice.
2) Make passive measurements or make active measurements.
3) Gather MD, and observe as it is gathered; formulate MDOD.
4) Store MD in collector, describe with MDOD, and register MDOD so that MD can be shared.
5) Typically share MD with Aggregate Operator, GMOC and/or Experimenters, per policy written into MDOD.
6) Pull MD out of collector, analyze and visualize.
7) Archive MD with MDOD.
8) Share archived MD with others, per policy included within MDOD.
9) Pull MD out of archive, to analyze and/or visualize.
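
Steps 4 through 6 above can be sketched as a small in-memory model: MD is stored with its MDOD, the MDOD is registered, and a pull succeeds only when the policy in the MDOD allows the requester. The `MDOD` fields and the `Collector` class are hypothetical stand-ins, not the interfaces of any GENI I&M service.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]  # (timestamp, value)


@dataclass
class MDOD:
    """Hypothetical descriptor: identifies the MD and who may read it."""
    object_id: str
    owner: str
    allowed_readers: Set[str] = field(default_factory=set)


class Collector:
    """In-memory stand-in for a collector plus its MDOD registry."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Point]] = {}
        self._registry: Dict[str, MDOD] = {}

    def store(self, mdod: MDOD, points: List[Point]) -> None:
        # Step 4: store the MD and register its descriptor.
        self._data[mdod.object_id] = list(points)
        self._registry[mdod.object_id] = mdod

    def pull(self, object_id: str, requester: str) -> List[Point]:
        # Steps 5-6: release MD only if the MDOD policy names the requester.
        mdod = self._registry[object_id]
        if requester != mdod.owner and requester not in mdod.allowed_readers:
            raise PermissionError(f"{requester} may not read {object_id}")
        return list(self._data[object_id])
```

Keeping the policy inside the descriptor means the same check could apply unchanged to the archive in steps 7 through 9.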

4.2.11 Support for Operators

What support must be provided for operators, and how?