2nd GENI Instrumentation and Measurement Workshop

Tuesday, June 8, 1pm - Wednesday, June 9, 2pm
Chicago O'Hare Hilton
NOTE: By invitation only

Announcement with goals, topics and reference material

The following announcement (with figures and references) was sent to the attendees to prepare for the workshop:

Announcement, v1.1 (060410)
Figures, v1.1 (060410)

Attendees at workshop

(Attended workshop: yes or no)

Paul Barford - University of Wisconsin – Madison (no)
Bruce Maggs – Duke University and Akamai (yes)
Harry Mussman – BBN/GPO (yes)
Vic Thomas - BBN/GPO (yes)
Evan Zhang – BBN/GPO (yes)

OML (ORBIT Measurement Library) OMF (ORBIT Management Framework)

Max Ott – NICTA (yes, by phone)
Ivan Seskar – Rutgers WINLAB (yes)

Instrumentation Tools

Jim Griffioen - Univ Kentucky (yes)

perfSONAR

Matt Zekauskas - Internet2 (no)
Jason Zurawski – Internet2 (yes)
Martin Swany - Univ Delaware (yes)
Guilherme Fernandes – Univ Delaware (yes)
Ezra Kissel – Univ Delaware (yes)

Scalable Sensing Service (S3)

Sonia Fahmy – Purdue (yes)
Puneet Sharma - HP Labs (yes)

OnTimeMeasure for network measurements

Prasad Calyam - Ohio Supercomputing Ctr (yes)

GENI Meta-Operations Center and NetKarma

Jon-Paul Herron - Indiana Univ
Camilo Viecco - Indiana Univ (yes)
Chris Small - Indiana Univ (yes)
Beth Plale - Indiana Univ (no)

Virtual Machine Introspection (VMI)

Brian Hay – Univ Alaska (yes)

Data-Intensive Cloud Control for GENI

Michael Zink - UMass Amherst (yes)

Experiment Management Service – Digital Object Registry

Jim French - CNRI (yes)
Giridhar Manepalli - CNRI (yes)
Larry Lannom – CNRI (no)

Announcement with notes

The following announcement now includes all notes from the workshop:

Announcement, v1.2 (062510) word version
Figures, v1.2 (062510)

Priority topics

The following priority topics were identified at the workshop, and teams of attendees (and some non-attendees) were identified for each topic to discuss it and write a text summary for review by the WG at GEC8.

Topic 1 GENI I&M Use Cases

Team members:

Paul Barford - University of Wisconsin – Madison (no)
Jim Griffioen - Univ Kentucky (yes)
Prasad Calyam* - Ohio Supercomputing Ctr (yes)
Camilo Viecco - Indiana Univ (yes)
Brian Hay – Univ Alaska (yes)

*agreed to organize first writing and discussion

Identify all user groups, and provide basic use cases:

  1. Experiment researchers
  2. Experiment (opt-in) users (see http://groups.geni.net/geni/attachment/wiki/041409NYCOptInWGAgenda/071509%20%20GENI-SE-OI-Overview-01.4.pdf for listing of opt-in issues, such as privacy)
  3. Central (i.e., GMOC) operators
  4. Cluster and aggregate providers and operators
  5. Archive service providers and operators
  6. Researchers that use archived measurement data, archived by other researchers (DatCat model)

Topic 2 GENI I&M Services

Team members:

Harry Mussman* – BBN/GPO (yes)
Evan Zhang – BBN/GPO
Giridhar Manepalli - CNRI (yes)
Chris Small - Indiana Univ (yes)
Beth Plale - Indiana Univ (yes)

*agreed to organize first writing and discussion

Summarize current view of:

Measurement Orchestration (MO) Service
Measurement Point (MP) Service
Measurement Information (MI) Service
Measurement Collection (MC) Service
Measurement Analysis and Presentation (MAP) Service
Measurement Data Archive (MDA) Service

Need: Basic definition of a Measurement Data Archive (MDA) Service

Identify different types of services:

Type 1: Dedicated service platform, with dedicated sliver, for customized information.
(Completely dedicated to an experiment)
Type 2: Common service platform, with dedicated slivers, for customized information.
(Common portion, plus parts associated with different experiments)
Type 3: Common service, for common or customized information.
(Common service, with data provided to multiple experiments)
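The three service types above differ along two axes: whether the platform is shared, and whether each experiment gets its own sliver. A minimal sketch of that distinction follows; the enum, the function, and both parameter names are hypothetical and not part of any GENI specification.

```python
from enum import Enum


class ServiceType(Enum):
    DEDICATED_PLATFORM = 1                 # Type 1: completely dedicated to one experiment
    COMMON_PLATFORM_DEDICATED_SLIVERS = 2  # Type 2: common portion plus per-experiment parts
    COMMON_SERVICE = 3                     # Type 3: one service, data shared by experiments


def classify(platform_shared, per_experiment_slivers):
    """Map the two (assumed) axes onto the three service types."""
    if not platform_shared:
        return ServiceType.DEDICATED_PLATFORM
    if per_experiment_slivers:
        return ServiceType.COMMON_PLATFORM_DEDICATED_SLIVERS
    return ServiceType.COMMON_SERVICE


# Example: a shared platform that still gives each experiment its own sliver.
example = classify(platform_shared=True, per_experiment_slivers=True)
```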

Topic 3 GENI I&M Resources

Team members:

Vic Thomas - BBN/GPO (yes)
Jim Griffioen* - Univ Kentucky (yes)
Martin Swany - Univ Delaware (yes)
Camilo Viecco - Indiana Univ (yes)
Brian Hay – Univ Alaska (yes)
Giridhar Manepalli - CNRI (yes)

*agreed to organize first writing and discussion

Significant question uncovered at workshop:
Jim on 6/25 via email: We should involve Rob Ricci in the discussion.

Consider these resources for I&M capabilities:

  1. Hosts, VMs, etc.
  2. Network connectivity
  3. Software, e.g., I&M software that can be included in an experiment
  4. I&M services
  5. I&M data flows and file transfers
  6. I&M data files stored in archives

For each item, consider how to:

Create
Name
Register and discover
Authorize and assign

Does each item have:

Unique and persistent name?
Unique and persistent identifier?
Need to carefully consider this for all of GENI
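To make the create, name, register and discover, and authorize and assign steps concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the class names, the use of a UUID as the persistent identifier, and the rule that only the owner may authorize other users. A real design would have to agree with the control framework.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    name: str    # human-readable name, assumed unique within the registry
    kind: str    # e.g. "host", "service", "data-file"
    owner: str
    # Assumed identifier scheme: a random UUID fixed at creation time.
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


class Registry:
    def __init__(self):
        self._by_id = {}
        self._grants = {}  # identifier -> set of authorized users

    def register(self, resource):
        if any(r.name == resource.name for r in self._by_id.values()):
            raise ValueError(f"name already registered: {resource.name}")
        self._by_id[resource.identifier] = resource
        self._grants[resource.identifier] = {resource.owner}

    def discover(self, kind):
        return [r for r in self._by_id.values() if r.kind == kind]

    def authorize(self, identifier, granting_user, user):
        # Assumed policy: only the owner may grant access.
        if self._by_id[identifier].owner != granting_user:
            raise PermissionError("only the owner may authorize users")
        self._grants[identifier].add(user)

    def assign(self, identifier, user):
        if user not in self._grants[identifier]:
            raise PermissionError(f"{user} is not authorized")
        return self._by_id[identifier]


reg = Registry()
res = Resource(name="archive/exp1/run3.dat", kind="data-file", owner="alice")
reg.register(res)
reg.authorize(res.identifier, "alice", "bob")
found = reg.discover("data-file")
```

Keeping the name and the identifier as separate fields lets the two questions above (persistent name, persistent identifier) be answered independently.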

For each item, consider:

Ownership
What sort of policies the owner may want to apply

How are each of these discovered, specified, authorized and assigned:

Always by mechanisms provided by the CF? With CF plus additional mechanisms?

Consider example of LS in perfSONAR
Consider example of data file stored in archive, owned by an experimenter
Need to define and then compare these options
Need to understand interop with CF for each option
Does CF set up secondary authorization mechanisms in some cases? If so, how?

Topic 4 GENI I&M Measurement Plane and Interfaces

Team members:

Harry Mussman* – BBN/GPO (yes)
Ezra Kissel – Univ Delaware (yes)
Chris Small - Indiana Univ (yes)

*agreed to organize first writing and discussion

Consider:

IP network
Layer 2 (VLAN) connections

Discuss:

Which protocols are active
Access to resources in aggregates, even when resources are in private address space, via GWs or proxies
How to provide authentication and authorization
How to provide QoS to protect measurement traffic
How to provide QoS to protect other traffic when measurement traffic is large.
Reserve bandwidth?

Martin on 6/28: Consider XSP (extensible session protocol) to provide transport layer GW functions.

Topic 5 GENI I&M Interfaces and Protocols (APIs): Manage Services

Team members:

Vic Thomas - BBN/GPO (yes)
Ivan Seskar – Rutgers WINLAB (yes)
Max Ott – NICTA (yes, by phone)
Sonia Fahmy* – Purdue (yes)
Giridhar Manepalli - CNRI (yes)

*agreed to organize first discussion and writing

Most instrumentation services follow these steps: (1) request, (2) measure, (3) process, (4) transport, and (5) retrieve. A user of the instrumentation service requests certain measurements, which results in the provisioning and configuration of measurement probes. The probes generate a set or a stream of measurement data, which may be further processed and potentially transported to a different location. Finally, the results of the requested measurements are stored in one or more repositories from which they can be retrieved and/or queried by authorized users, tools and services, either immediately (real-time) or at a later time (history).
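The five steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names, the in-memory repository, and the fake probe are all hypothetical and do not correspond to any particular GENI project.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Repository:
    """In-memory stand-in for a measurement repository (hypothetical)."""
    records: dict = field(default_factory=dict)

    def store(self, request_id, result):
        self.records[request_id] = result

    def retrieve(self, request_id):
        return self.records[request_id]


def request(metric, samples):
    # Step 1: the user asks for a measurement; this configures the probe.
    return {"id": f"{metric}-{samples}", "metric": metric, "samples": samples}


def measure(req, probe):
    # Step 2: the configured probe produces a set of raw values.
    return [probe(i) for i in range(req["samples"])]


def process(raw):
    # Step 3: optional processing, here a simple summary.
    return {"min": min(raw), "max": max(raw), "mean": mean(raw)}


def transport(req, result, repo):
    # Step 4: move the result to the place where it will be stored.
    repo.store(req["id"], result)


def run(metric, samples, probe, repo):
    req = request(metric, samples)
    transport(req, process(measure(req, probe)), repo)
    return req["id"]


repo = Repository()
# A fake probe standing in for a real latency measurement.
rid = run("latency_ms", 4, lambda i: 10.0 + i, repo)
# Step 5: an authorized user retrieves the stored result later.
summary = repo.retrieve(rid)
```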

Depending on usage scenarios and objectives, several different architectures for instrumentation and measurement can emerge. Current GENI instrumentation projects leverage a number of popular tools to construct their services. For example, the Scalable Sensing Service project (S3) implements "sensor pods" (measurement points) as CGI scripts accessible through any web server that supports CGI. Boa, a lightweight open-source web server, is currently used. Starting or stopping the sensor pods is therefore accomplished simply by starting or stopping the server. This framework enables convenient third-party measurements; that is, measurements between two nodes can be initiated by a third node. Periodic measurements are configured as "cron" jobs on the sensor pods. A sensing information manager ensures the liveness of the service using vxargs. The sensing information manager also collects the measurements from the sensor pods using rsync, and stores them in a data store (S3).
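A third-party measurement of this kind can be pictured as a request handler on the sensor pod that takes the measurement target from the query string. The sketch below is not the S3 interface: the parameter names `sensor` and `target`, the reply format, and the fixed-value probe are assumptions made for illustration.

```python
from urllib.parse import parse_qs


def fake_latency(target):
    # Placeholder for a real probe such as ping; returns a fixed value.
    return 12.5


SENSORS = {"latency": fake_latency}  # hypothetical sensor names


def handle_query(pod_name, query_string):
    """Handle one CGI-style request sent to a sensor pod."""
    params = parse_qs(query_string)
    sensor = params.get("sensor", [""])[0]
    target = params.get("target", [""])[0]
    if sensor not in SENSORS or not target:
        return {"status": 400, "error": "need a known sensor and a target"}
    return {
        "status": 200,
        "source": pod_name,   # the pod that runs the measurement
        "target": target,     # the node being measured
        "sensor": sensor,
        "value": SENSORS[sensor](target),
    }


# A third node asks pod "nodeA" to measure towards "nodeB".
reply = handle_query("nodeA", "sensor=latency&target=nodeB")
bad = handle_query("nodeA", "sensor=unknown")
```

Because the requester only supplies a query string, any node that can reach the pod's web server can initiate the measurement, which is what makes the third-party case convenient.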

In contrast, OML (OMLTridentCom,OMLwiki) takes a distributed stream-based approach. Probes are assumed to produce a stream of measurement tuples which are streamed through a configurable set of filters and caches to a repository. The repository, like S3, provides a web service interface allowing users and other services to retrieve the results (pull mode). However, users/services can also directly subscribe to a stream for operation in push mode. OML is integrated into the OMF control framework to simplify configuration of the entire distributed setup, as well as to provide support for "steerable" experiments, where the orchestration of an experiment can be influenced by the simultaneously collected measurements.
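The stream-based model, with both pull and push access, can be illustrated with a few lines of code. This is not the OML API: the class names, the averaging filter, and the callback signature are hypothetical and chosen only to show how one filtered stream can serve both modes.

```python
from collections import deque


class AverageFilter:
    """Emit the mean of every `window` consecutive samples."""

    def __init__(self, window):
        self.window = window
        self.buffer = []

    def feed(self, sample):
        self.buffer.append(sample)
        if len(self.buffer) < self.window:
            return None  # not enough samples yet
        value = sum(self.buffer) / len(self.buffer)
        self.buffer = []
        return value


class Stream:
    def __init__(self, name, filt, cache_size=100):
        self.name = name
        self.filt = filt
        self.cache = deque(maxlen=cache_size)  # serves pull-mode queries
        self.subscribers = []                  # serves push-mode delivery

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def inject(self, sample):
        out = self.filt.feed(sample)
        if out is None:
            return
        self.cache.append(out)
        for callback in self.subscribers:
            callback(self.name, out)

    def query(self):
        return list(self.cache)


pushed = []
stream = Stream("rssi", AverageFilter(window=2))
stream.subscribe(lambda name, value: pushed.append((name, value)))
for sample in [1.0, 3.0, 5.0, 7.0, 9.0]:
    stream.inject(sample)
pulled = stream.query()
```

In this sketch the last sample stays in the filter buffer until a second one arrives, so both the pushed and the pulled view contain two averaged values.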

Another instrumentation service currently in GENI is perfSONAR, which sits somewhere in the middle. Various distributed monitoring services are "wrapped" into individual web services called "Measurement Points" (MPs) which return their results using the Open Grid Forum's Network Measurement Working Group (NMWG) XML schema. A set of other services, such as the Measurement Archive, Lookup, Authentication, Topology, Transformation, and Resource Protector provide a comprehensive suite of services. perfSONAR is migrating to Representational State Transfer (REST), viewed as a lighter-weight service-oriented architecture over existing web service technologies, e.g., SOAP. XML-RPC and SNMP are also leveraged in several current measurement projects.

APIs

There are essentially two types of measurement APIs: one concerning itself with the configuration of the various elements and services participating in the measurement and monitoring activities, and the other facilitating the transfer of the measurement data itself.

The provisioning and configuration of instrumentation functionality in a large federated environment serving many experimenters and spanning many countries with their respective legal frameworks is clearly a complex undertaking. However, at a fundamental level, instrumentation requires discovering, provisioning, and configuring resources, within the constraints of policies and user privileges. This is essentially functionality that the GENI control framework (CF) is (or will be) providing. What is not covered by the CF is the protocol for encoding and transporting measurement data between entities. Here, we must differentiate between two modes of operation: push (streaming) and pull (requesting). Neither is unique to this domain, and there are several standards and widely used solutions for both.

OML will soon be using the Internet Protocol Flow Information Export (IPFIX) protocol standardized by the IETF for streaming measurements from an exporter to a collector. perfSONAR, as discussed above, is using SOAP and the Open Grid Forum's Network Measurement Working Group (NMWG) XML schema for returning measurements back to a requester. In OML, once an application has been modified to integrate OML measurements, two configurations are possible: passing general arguments on the command line of the enhanced application, or using an XML configuration file. The XML configuration file allows the user to multiplex the measurement stream and apply different filters inside the application and between Measurement Points (OMLTridentCom,OMLwiki).

References:

(S3) P. Yalagandula, P. Sharma, S. Banerjee, S. Basu, and S.-J. Lee. S3: A scalable sensing service for monitoring large networked systems. In Proceedings of the ACM SIGCOMM Workshop on Internet Network Management (INM), Sept. 2006.

(OMLTridentCom) J. White, G. Jourjon, T. Rakotoarivelo, and M. Ott. Measurement Architectures for Network Experiments with Disconnected Mobile Nodes. In Proceedings of TridentCom, 2009.

(OMLwiki) http://omf.mytestbed.net/wiki/oml/Documentation

Topic 6 GENI I&M Interfaces and Protocols (APIs): Data Flows and Data File Transfers

Team members:

Harry Mussman* – BBN/GPO (yes)
Ivan Seskar – Rutgers WINLAB (yes)
Max Ott – NICTA (yes, by phone)
Ezra Kissel – Univ Delaware (yes)
Prasad Calyam - Ohio Supercomputing Ctr (yes)
Michael Zink - UMass Amherst (yes)

*agreed to organize first writing and discussion

Consider data flows and data file transfers between all services

Define range of options:

What:

Data flows
Data files transfers

Type:

Pull
Push
Pub/Sub

Protocol:

SNMP
SCP
FTP and gridFTP
HTTP
XMPP
TCP
SCTP

Consider:

Naming
Discovery
Connectivity
Authentication and authorization mechanisms

Map to current projects, giving examples:
Consider: Minimum set required for GENI

Topic 7 GENI I&M Interfaces and Protocols (APIs): Registration and Discovery of Services with Available Measurement Data

Team members:

Jason Zurawski* – Internet2 (yes)
Prasad Calyam - Ohio Supercomputing Ctr (yes)
*agreed to organize first writing and discussion

Consider approach used in perfSONAR

Summarize for:

Services with data flows
Also sources of file transfers?
Also GUIs?

Topic 8 GENI I&M Interfaces and Protocols (APIs): GUIs

Team members:

Jeremy Reed - Univ Kentucky (no)
Zongming Fei - Univ Kentucky (no)
Guilherme Fernandes* – Univ Delaware (yes)
Puneet Sharma - HP Labs (yes)
*Agreed to organize team

This section will lay out requirements and recommendations for GUIs that allow GENI users to manage (control, configure, access), visualize and discover I&M services, including the GUIs themselves, and related data. The portals used by GENI experimenters to request the deployment of an I&M system or the instantiation of a sliver in an I&M component are not covered in the present draft. Requirements are defined to ensure GUIs and other services adhere to general GENI principles that must be enforced (e.g. privacy), and will tend to evolve from recommendations deemed essential by the GENI I&M community. Recommendations, on the other hand, try to capture best practices in the field and general principles that increase the usability and effectiveness of these GUIs.

Note: This draft does not currently define any requirements or recommendations, but rather describes current practices, possible use cases and general thoughts that can hopefully inform the future definition of these requirements and recommendations by the GENI I&M WG.

Many design principles must be taken into consideration when developing GUIs for I&M architectures. As usual in Computer Science, these principles can be in sharp contrast with each other and trade-offs are made depending on the overall objective of the GUI. Following the general methodology of GENI, we identify GUIs developed for other I&M frameworks and services in order to capitalize on their strengths. A discussion of some of these design principles follows.

  • Centralized/Remote vs Distributed/Local vs Hybrid - These principles can be seen as part of a spectrum, with GUIs that provide a visual interface to services and data residing locally on the same machine on one end (e.g. pS-Performance Toolkit web interface), and GUIs that centralize management and distributed access to remote data and services on the other (e.g. MRTG+Cacti, CACTISonar). A centralized interface is the preferred approach in many use cases, such as health and status monitoring, because it provides a single, unified location for network management, which makes access easier and more effective. However, collecting the data in a centralized location tends to increase the network overhead generated by network management. In contrast, GUIs with local scope tend to be faster in accessing data, create little or no network overhead and are generally simpler to build. Between these two extremes we can find hybrid GUIs. For example, a centralized visualization interface might cache only common queries and request others on demand (e.g. Periscope).
  • Flexibility vs Usability - GENI users should be able to have as much control as possible over the configuration of I&M services dedicated to their slice. In several instances it might be hard (or impossible) to identify a single set of parameters that satisfies all experimenters. GUIs can be designed to provide great flexibility (e.g. by permitting users to select what is to be displayed, how it should be displayed, or providing an interface to all sorts of configuration parameters). Expert users certainly appreciate having great flexibility, but even experts expect sensible defaults to be defined. On the other hand, non-expert users might be overwhelmed with too many options, reducing the GUI’s usability and users’ overall experience. Having different views (e.g. normal and advanced) with varying levels of complexity can be a conciliatory bridge between flexibility and usability.
  • Common graphical layouts and visualization aids - [There are established ways of presenting some types of data (e.g. utilization as line/area graphs with out and in directions overlaid, one- and two-way latency tests as scatter plots, measurement mesh results in 2D matrices). GUI developers should recognize de facto standards and try to follow them. Visualization aids are commonly provided through geographical maps, network topology diagrams, etc.]
  • Documentation?
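The hybrid approach from the first design principle, in which a centralized interface caches only common queries and requests the rest on demand, could look like the sketch below. The class name, the hit-count threshold, and the `fetch` callable are hypothetical; this is not how Periscope or any other named tool is implemented, and cache expiry is deliberately left out.

```python
from collections import Counter


class HybridView:
    """Cache results only for queries seen at least `threshold` times."""

    def __init__(self, fetch, threshold=2):
        self.fetch = fetch          # callable that queries the remote service
        self.threshold = threshold
        self.hits = Counter()
        self.cache = {}
        self.remote_calls = 0

    def get(self, query):
        self.hits[query] += 1
        if query in self.cache:
            return self.cache[query]      # common query: served locally
        self.remote_calls += 1
        result = self.fetch(query)        # rare query: fetched on demand
        if self.hits[query] >= self.threshold:
            self.cache[query] = result
        return result


view = HybridView(fetch=lambda q: f"data for {q}")
for q in ["util/link1", "util/link1", "util/link1", "loss/link9"]:
    view.get(q)
```

The threshold is the tuning knob between the two ends of the spectrum: a threshold of 1 caches everything centrally, while a very large one always goes back to the remote source.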

The GENI I&M Architecture draft includes a Measurement Analysis and Presentation (MAP) Service. Many of the GUIs discussed in this section fall into this category. In contrast, network monitoring frameworks that take a middleware approach (e.g. perfSONAR) tend to view this type of service as users of the framework, sitting in a higher (external) layer. There are clear benefits of including this type of service in the architecture definition (e.g. it enables the main purpose of this section, namely to identify requirements and guide user expectations regarding GUIs). However, careful attention must be paid to the place of these services within the framework and the issues that arise.

One possible concern regards the substitutability of API functionality of other components through GUIs. For example, the API for a Measurement Point Service might define Start/Stop methods through a Web Services interface. If a GENI I&M system provides a GUI that allows the user to Start/Stop the MP, must the MP implement the same functionality (through the Web Services interface) to be considered compliant? Consider now the more complex example of GUIs that are the sole interface to a given dataset. Data might be pushed or pulled into a local (or remote) database which is then accessed by the GUI to display the data to the user (e.g. MRTG+Cacti). The data is clearly available to the experimenter through the GUI, but must it also be available through the defined MDA interface? Would it suffice for the GUI to be able to export the raw data in a given format (e.g. through HTTP file download)?
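One way to picture the Start/Stop question is a GUI that does not replace the programmatic interface but delegates to it. The following sketch assumes a hypothetical `MeasurementPointAPI` with `start` and `stop` methods; no such interface has been defined by the WG, and the sketch takes no position on what compliance should require.

```python
from abc import ABC, abstractmethod


class MeasurementPointAPI(ABC):
    """Hypothetical programmatic interface of a Measurement Point."""

    @abstractmethod
    def start(self):
        ...

    @abstractmethod
    def stop(self):
        ...


class SimpleMP(MeasurementPointAPI):
    def __init__(self):
        self.running = False

    def start(self):
        self.running = True
        return "started"

    def stop(self):
        self.running = False
        return "stopped"


class GuiController:
    """GUI buttons that call the same API a programmatic client would use."""

    def __init__(self, mp):
        self.mp = mp

    def on_click(self, button):
        actions = {"Start": self.mp.start, "Stop": self.mp.stop}
        return actions[button]()


mp = SimpleMP()
gui = GuiController(mp)
first = gui.on_click("Start")
state_after_start = mp.running
second = gui.on_click("Stop")
```

With this layering the GUI adds no functionality of its own, so the same operations remain available to tools and scripts.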

The access of GUIs to data raises very important issues regarding authentication and authorization in GENI. All of the GENI facilities must employ security mechanisms to ensure privacy and policy constraints are met. If a GUI has direct access to data, the GUI must likely employ similar (the same?) mechanisms as required on an equivalent MDA. On the other hand, if the GUI accesses other I&M services on demand to retrieve the data to be displayed, the GUI should allow users to authenticate themselves. In this case, should users authenticate themselves with the GUI, which is then trusted by the other services [this might make sense when the GUI is deployed as part of the I&M system]? Or should the GUI relay the authentication to the services by asking users for their certificate (and password)? Both cases raise trust issues with the GUIs themselves.

Finally, we address the discovery of and access to the GUIs themselves. Some GUIs will likely act as services of the I&M systems (i.e. deployed within the system, maybe with direct access to data), and as such should likely be discoverable like any other service of the I&M system (e.g. by registering with a Lookup Service). Open issues include determining what information must be registered in order to meaningfully describe the GUI and its capabilities. Also, it is expected that many GUIs will be developed through the years by GENI users. GENI should likely provide an archive/repository to make these GUIs available and easily discoverable by the larger GENI community. Is this repository a centralized location (maintained by GENI) or just a Lookup Service pointing to the remote locations where the GUIs can be found?

Topic 9 GENI Measurement Data Schema

Team members:

Bruce Maggs – Duke University and Akamai (yes)
Max Ott – NICTA (yes, by phone)
Ivan Seskar – Rutgers WINLAB (yes)
Martin Swany* - Univ Delaware (yes)
Camilo Viecco - Indiana Univ (yes)
Michael Zink - UMass Amherst (yes)
Jim French - CNRI (yes)

*agreed to organize first discussion and writing

Consider:

Measurement data schema
Metadata schema
Metadata contents

Consider measurement data schema and/or metadata schema from:

perfSONAR
GMOC-provided
Current OML
Proposed using IPFIX
NetCDF (as used by DI Cloud)

Consider: Minimum set required for GENI

Provide overall template for GENI metadata, considering above.
Which items in GENI metadata template are:

Required?
Invariant?

Project Summaries

Instrumentation Tools

OMF/OML

perfSONAR

Scalable Sensing Service

On Time Measure

Data Intensive Cloud

Digital Object Registry

References

All references

Individual references

GIMS_Design_UseCases "Use-cases for GENI Instrumentation and Measurement Architecture Design"

MeasPlane-1 "RESTful Web Services vs. "Big" Web Services: Making the Right Architectural Decision"

OMF-OML-1: "XDR: External Data Representation Standard"

OMF-OML-2 "ORBIT Measurements Framework and Library (OML): Motivations, Design, Implementation, and Features"

OMF-OML-3 "OML Overview" Slides

OMF-OML-4 "Measurement Architectures for Network Experiments with Disconnected Mobile Nodes"

InsTools-1 "Architectural Design and Specification of the INSTOOLS Measurement System"

perfSONAR-1 "Scalable Framework for Representation and Exchange of Network Measurements"

perfSONAR-2 "An Extensible Schema for Network Measurement and Performance Data"

perfSONAR-3 "NM-WG/perfSONAR Topology Schema"

GMOC-1 “GMOC Topology-Entity Data Exchange Format Specification”

GMOC-2 "Proposal: Use of URN's as GENI Identifiers"

Figures

Figure 1-1: I&M Services for Researchers

Figure 1-2: I&M Services for Operators

Figure 1-3: I&M Services for both Researchers and Operators

Figure 2-1: OMF/OML I&M Srvcs

Figure 2-2: Inst Tools I&M Srvcs

Figure 2-3: perfSONAR I&M srvcs

Figure 2-4: Scal Sense Srvc I&M Srvcs

Figure 2-5: OnTimeMeas I&M Srvcs

Figure 2-6: DICLOUD config

Figure 2-7: DOR MDA Srvc and Mess

Figure 3-1: Meas Traffic Flows

Figure 3-2: Meas Traffic Proxies

Figure 4-1: OMF/OML Srvcs and Mess

Figure 4-2: OML Component Arch

Figure 4-3: OMF/OML Overview

Figure 4-4: ORBIT Network Diagram

Figure 5-1: Inst Tools Srvcs and Mess

Figure 5-2: Inst Tools Components

Figure 5-3: Inst Tools Topology

Figure 6-1: perfSONAR Srvcs and Messages

Figure 6-2: perfSONAR Meas Data Schema

Figure 7-1: Scal Sens Srvc Srvcs and Messages

Figure 8-1: OnTimeMeas Srvcs and Mess

Figure 9-1:

Figure 10-1: DOR MDA Srvc and Mess

Figure 10-2: DOR MDA Srvc File Org
