Version 43 (modified by hmussman@bbn.com, 14 years ago)

I&M WG meeting at GEC7: Agenda and Notes

Wednesday, March 17, 2010, 3:30pm - 5:30pm
Room: Presidents 2

Introductions

3:30pm All

Major WG issues and goals

3:35pm Paul Barford (University of Wisconsin)
slides (15min)

Goals:
+ An infrastructure for gathering, analyzing and archiving measurements in GENI

  • Support for broad range of experiments
  • Tight integration with control framework(s)
  • Efficient, easy to use, broadly deployed, diverse capabilities, secure, privacy-aware, etc.

+ WG facilitates communication and coordination between I & M projects

  • Toward reaching overall goals quickly and effectively
  • Includes architectures, specifications, partnerships

Issues:
+ Architectural (v0.1 on wiki)

  • Use cases
  • Schema
  • Sensors, components and protocols
  • UI and integration
  • Authentication and privacy
  • Storage and analysis

+ Practical

  • Test and evaluation
  • Deployment, configuration and support

Project Reviews

Each of the following speakers was asked to include in their talk a review of how they address the following GENI I&M architecture priority topics:
+ Common terminology; best granularity of functions
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data?
+ Measurement plane; options; expect nodes with 3 or 2 NICs

Instrumentation Tools Project (1642)

3:50pm Jim Griffioen (University of Kentucky)
slides

Goals:
+ Integrate Univ of Kentucky Emulab into ProtoGENI (completed in year 1)
+ Reimplement Univ of Kentucky Edulab instrumentation and measurement tools to work in the ProtoGENI environment.
+ Support automatic generation of instrumentation and measurement infrastructure on a per-slice basis.

Structure:
+ Measurement Points
+ Measurement Controller

Functional Components

  1. Setup: deploy and initialize topology-specific software and services
  2. Capture: capture measurement data
  3. Collection: move data to processing/storage environments
  4. Storage: store data on a temporary, short term, long term, and archival basis
  5. Processing: filter, convert, aggregate, summarize, etc., data
  6. Presentation: present data to users in meaningful ways
  7. Access Protection: protect resources and data
  8. Measurement Control: Dynamically control the above components

Note: Conventional network management solutions exist for 2. through 8.

Mapping functions to proposed services:

  1. Setup - MO?
  2. Capture - MP
  3. Control - MO?
  4. Collection - MC
  5. Storage - MDA
  6. Processing - MAP
  7. Access Control - MO?
  8. Presentation - MAP
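
The mapping above can be written down as a small lookup table. This is only a sketch: the `tentative` flag is a hypothetical way of recording the "?" marks in the notes, and the helper names are illustrative rather than part of any GENI software.

```python
from collections import defaultdict

# Function -> (proposed service, tentative?). A "?" in the notes becomes
# tentative=True; this encoding is an assumption made for illustration.
FUNCTION_TO_SERVICE = {
    "Setup": ("MO", True),
    "Capture": ("MP", False),
    "Control": ("MO", True),
    "Collection": ("MC", False),
    "Storage": ("MDA", False),
    "Processing": ("MAP", False),
    "Access Control": ("MO", True),
    "Presentation": ("MAP", False),
}


def functions_by_service(mapping):
    """Invert the table: service name -> sorted list of functions."""
    grouped = defaultdict(list)
    for function, (service, _tentative) in mapping.items():
        grouped[service].append(function)
    return {service: sorted(funcs) for service, funcs in grouped.items()}


def tentative_functions(mapping):
    """Return the functions whose service assignment is still open."""
    return sorted(f for f, (_service, tentative) in mapping.items() if tentative)
```

Inverting the table makes it easy to see that every open question in the mapping concerns the MO service.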

Measurement data schema:
+ Data-specific formats will exist at multiple times and places but should be invisible to the generic measurement infrastructure
+ Should leverage expertise of those with experience in this area (DatCat, PerfSONAR, etc).

Measurement plane:
+ Virtual interfaces abound in GENI.

  • Unknown how virtual interfaces map to physical NICs.

+ Shared paths are OK, but require QoS.

  • Setting up QoS planes is challenging; particularly if planes are per-slice.

OMF/OML Project (1660)

4:05pm Max Ott (NICTA)
slides

Goals:

  • All experiment output in one place
  • Capturing everything – most importantly meta data
  • Separation of concerns

– Instrumenting
– Collecting

  • Minimizing measurement collection overhead

– Time
– Traffic interference

  • Support for steerable experiments

– Access to data in different places

Concepts:
+ MPoints
+ Filters
– Stddev
– Average
– First
– Histogram
+ Processing/Caching
+ Steering or Feedback
+ MStreams
+ Dynamic Schema
+ Visualization
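
To make the filter concept concrete, the following sketch applies the four named filters to fixed-size windows of samples from a measurement point. It is not the OML API: the function names, the windowing scheme and the use of population standard deviation are all assumptions chosen for illustration.

```python
import statistics
from collections import Counter


def first(samples):
    """Keep only the first sample of the window."""
    return samples[0]


def average(samples):
    """Arithmetic mean of the window."""
    return statistics.fmean(samples)


def stddev(samples):
    """Population standard deviation of the window (an assumption)."""
    return statistics.pstdev(samples)


def histogram(samples, bin_width):
    """Count samples per bin; each bin is keyed by its lower edge."""
    counts = Counter(int(s // bin_width) * bin_width for s in samples)
    return dict(sorted(counts.items()))


def apply_filters(stream, window, filters):
    """Yield one dict of filter outputs per full window of samples."""
    buffer = []
    for sample in stream:
        buffer.append(sample)
        if len(buffer) == window:
            yield {name: f(buffer) for name, f in filters.items()}
            buffer = []
```

Filtering at the measurement point in this way is one route to the stated goal of minimizing collection overhead, since only one summary per window has to be transported.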

Supported applications:

  • Traffic Generation/Measurements

– OTG … Traffic Generator
– Iperf

  • Monitoring

– Libtrace
– Libsigar
– Spectrum Analyzer
– GPS
– (Weather)

  • Components

– TinyOS/Motes
– (GnuRadio)

LAMP using perfSONAR (1788)

4:20pm Guilherme Fernandes (for Martin Swany) (University of Delaware)
slides
+ perfSONAR is a multi-domain performance monitoring framework, which defines a set of protocol standards for sharing data between measurement and monitoring systems

Architecture:
+ Interoperable network measurement middleware designed as a Service Oriented Architecture (SOA):

  • Components are Web Services (WS) based

+ Several unique components and design considerations, all of which operate in a cooperative yet independent manner

  • Each functionality is separated into a specific function
  • Clients and servers interact through scripted, XML-based protocols
  • Measurement data is encoded in expressive XML formats

Components:
+ Infrastructure

  • Lookup Service
  • Topology Service
  • Authentication Service

+ Services

  • Measurement Point (MP) Service
  • Measurement Archive (MA) Service
  • Resource protector

+ Analysis and visualization

Open protocols and schema:
+ Base network measurement schema

  • OGF Network Measurement Working Group

+ Topology Schema

  • OGF Network Markup Language (NML-)WG
  • Includes Topology Network ID

+ perfSONAR Protocol Documents

  • OGF Network Measurement and Control (NMC-)WG

Base network measurement schema:
+ Measurement Data, a set of measurement events that have some value or values at a particular time
+ Measurement Metadata, the details about the set of measurement data

Measurement metadata:
+ Subject (Noun)

  • The measured/tested entity (who)
  • E.g. A pair of hosts (end-point-pair), or a Layer 3 interface

+ EventType (Verb)

  • What type of measurement, value, or event occurred
  • Characteristic, tool output, or generic event
  • E.g. latency, bandwidth, utilization, or simply iperf

+ Parameters (Adjectives and Adverbs)

  • How, or under what conditions, did this event occur?
  • E.g. buffer sizes used, TCP vs ICMP packets

+ Key

  • Shortcut substituted in place of previous three items
  • No predefined format

Measurement data:
+ Datum: The actual result (values) of measurement.

  • Can contain time (e.g. a Time element or attribute).
  • Existence of an event might point to the case where there is no additional value
  • As in “Link up/down” or threshold events

+ Time: Representation of a time stamp or time range in a specified format.

  • Must be extensible since even agreement about the right structure is not easy, e.g. UNIX timestamp vs NTP time
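
The metadata and data descriptions above can be sketched as two record types plus a key registry. This is a minimal model in Python, not the NM-WG schema itself: the class and field names, the `k1`, `k2`, ... key format and the choice of a UNIX timestamp for `time` are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Metadata:
    subject: str                      # the measured entity (noun)
    event_type: str                   # what was measured (verb)
    parameters: Dict[str, Any] = field(default_factory=dict)  # how it was measured


@dataclass
class Datum:
    key: str                          # shortcut standing in for the metadata
    time: float                       # UNIX timestamp here (an assumption)
    value: Optional[float] = None     # None for pure events such as "link up"


class KeyRegistry:
    """Issues opaque keys that stand in for subject, event type and parameters."""

    def __init__(self):
        self._by_key = {}

    def register(self, metadata):
        key = f"k{len(self._by_key) + 1}"   # hypothetical key format
        self._by_key[key] = metadata
        return key

    def resolve(self, key):
        return self._by_key[key]
```

A datum then carries only the short key, and a consumer resolves it back to the full metadata when needed.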

Schema namespaces and extensibility:
+ A namespace: http://ggf.org/ns/nmwg/base/2.0/

  • MAY NOT be a URL

+ We encode the measurement/event type in the namespace (and as a standalone element)
+ We use Data and Metadata elements and vary the namespaces of the specific elements
+ Extensibility achieved through hierarchy with delegation

  • Similar to OIDs in the IETF management world

+ The NM-WG has a hierarchy of network characteristics

+ However, not all tools are cleanly mapped onto the Characteristic space

  • Often a matter of some debate

+ Organization-rooted tools namespace addresses this
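
As a rough illustration of varying namespaces on Data and Metadata elements, the sketch below builds a small XML message with Python's standard `xml.etree.ElementTree`. Only the base namespace comes from the notes; the tool namespace URI, the element layout and the `value`/`timeValue` attributes are assumptions for illustration and should be checked against the NM-WG schema documents.

```python
import xml.etree.ElementTree as ET

NMWG = "http://ggf.org/ns/nmwg/base/2.0/"          # base namespace from the notes
IPERF = "http://ggf.org/ns/nmwg/tools/iperf/2.0/"  # illustrative tool namespace (assumption)


def build_message(subject_text, value, timestamp):
    """Build one metadata/data pair; the event type appears both as the
    namespace of the specific elements and as a standalone element."""
    root = ET.Element(f"{{{NMWG}}}message")

    meta = ET.SubElement(root, f"{{{NMWG}}}metadata", {"id": "meta1"})
    subject = ET.SubElement(meta, f"{{{IPERF}}}subject")
    subject.text = subject_text
    event = ET.SubElement(meta, f"{{{NMWG}}}eventType")
    event.text = IPERF

    data = ET.SubElement(root, f"{{{NMWG}}}data", {"metadataIdRef": "meta1"})
    ET.SubElement(
        data,
        f"{{{IPERF}}}datum",
        {"value": str(value), "timeValue": str(timestamp)},
    )
    return root
```

Calling `ET.tostring(build_message("hostA->hostB", 941.2, 1268839800))` serializes the message. A generic consumer can process the `metadata` and `data` wrappers without understanding the tool-specific namespace.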

Topology Schema:
+ Topology schema grew from network measurement description

  • Reusable “Subject” elements for common cases
  • Also reduces redundancy
  • Relationships between measurement Subjects

+ Structured by layers and the same elements recurring there (Base, L2, L3, L4)

  • networks as graphs

+ Elements:

  • Domain
  • Node
  • Port
  • Link
  • Network
  • Path
  • Service

+ Varied by namespaces (extensibility)

  • Reuse visualization logic, etc.
  • Validate layer- or technology-specific attributes

+ Used by perfSONAR, IDC Protocol (ION, OSCARS, AutoBAHN), Phoebus

  • Currently calling it the UNIS Topology Schema

+ OGF NML-WG to unify NDL and UNIS Topology schema

  • Happening as we speak at OGF28
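
The "networks as graphs" view of the Node, Port and Link elements can be sketched as follows. This is a toy model, not the UNIS schema: the class, its methods and the breadth-first path search are assumptions chosen to show how a Path can be derived from the other elements.

```python
from collections import deque


class Topology:
    """Nodes own ports; links join two ports."""

    def __init__(self):
        self.ports = {}   # port id -> owning node id
        self.links = []   # list of (port id, port id)

    def add_port(self, node, port):
        self.ports[port] = node

    def add_link(self, port_a, port_b):
        self.links.append((port_a, port_b))

    def neighbors(self, node):
        """Nodes reachable from `node` over a single link."""
        found = set()
        for a, b in self.links:
            node_a, node_b = self.ports[a], self.ports[b]
            if node_a == node:
                found.add(node_b)
            if node_b == node:
                found.add(node_a)
        return found

    def path(self, src, dst):
        """Breadth-first search; returns a list of nodes, or None."""
        queue = deque([[src]])
        seen = {src}
        while queue:
            current = queue.popleft()
            if current[-1] == dst:
                return current
            for nxt in sorted(self.neighbors(current[-1])):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(current + [nxt])
        return None
```

A measurement Subject such as an end-point pair could then refer to nodes or ports in this graph instead of repeating their descriptions.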

LAMP objectives:
+ Collaborate on defining a common but extensible format for data storage and exchange for GENI I&M systems

  • Use perfSONAR NM-WG schema as starting point
  • Identify new characteristics/tools namespaces

+ Develop a representation of GENI topology to be used to describe measurements and experiment configuration

  • UNIS topology schema can be easily extended

+ Collaborate with related GENI measurement and security projects on a common GENI I&M architecture

  • The new GENI I&M Arch. Draft defines very similar services (MP, MC, MDA, MAP), and new ones (MO)
  • perfSONAR is a good starting point, not currently a final solution (for GENI);
  • Use cases have been different, but much can be reused and the framework can be extended

DatCat Project

4:35pm Brad Huffaker (CAIDA, UC San Diego Supercomputing Center)
slides

Goal:
+ DatCat was designed to improve data sharing by providing a unified metadata database for Internet data.
+ Make it easy for users to do the following:

  • finding data sets of interest
  • adding new data sets to the catalog
  • annotating data sets in the catalog

+ DOES NOT store data

Database scheme:
+ Collection

  • logical group of files (paper, project, ...)

+ Data files

  • raw data files (traceroutes, log dumps, ...)

+ Packages

  • downloadable files (single file, tarball, ...)

+ Locations

  • how to get the packages (URL, contact address, ...)

Annotations:
+ Provide an extensible naming space for assigning domain specific values to files.
+ each user has their own hierarchical name space

  • passive.IPv4.packet_count
  • active.RTT_95th_percentile

+ both data contributors and general DatCat users may attach annotations
+ any user may assign “note” annotations to any object
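
A per-user hierarchical annotation space like the one described can be sketched with dotted names and prefix queries. This is not DatCat's implementation or its Perl API; the class, its methods and the matching rule (whole dotted components only) are assumptions.

```python
class AnnotationStore:
    """Annotations keyed by object, user and dotted name."""

    def __init__(self):
        self._store = {}  # object id -> {(user, dotted name): value}

    def annotate(self, object_id, user, name, value):
        self._store.setdefault(object_id, {})[(user, name)] = value

    def under(self, object_id, user, prefix):
        """Return one user's annotations at or below a dotted prefix.

        Matching is done on whole components, so "passive.IP" does not
        match "passive.IPv4.packet_count".
        """
        parts = prefix.split(".")
        return {
            name: value
            for (owner, name), value in self._store.get(object_id, {}).items()
            if owner == user and name.split(".")[: len(parts)] == parts
        }
```

Because the user is part of the key, two contributors can use the same dotted name without colliding.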

Metadata fields:
+ collection

  • fields: name, contents, summary, motivation, creators/primary contact/contributor, start/end time, keywords, short description/description/description URL
  • annotations: note

+ data

  • fields: name, creators/primary contact/contributor, keywords, format, file size, start/end time, duration, geographic/network location, time zone, MD5, description, creation process
  • annotations: passive.IPv4.packet_count, passive.IPv4.TCP.dst.port_count, cfg.passive.capture_len, AS_count, active.trace_count, active.RTT_10th_percentile, .....

+ location

  • fields: package, creators, primary contact, status, download procedure, download URL, geographic/logistic location, availability

Submission tools:
+ Perl API

  • useful for integrating into existing data management systems
  • flexible, but requires writing code

+ subcat

  • different approach (declarative)
  • describe metadata in human-friendly text files (YAML)
  • CAIDA provides tools to extract additional metadata (data-to-yaml)
  • subcat intuitively joins information together

DatCat web portal:
+ Browse collections
+ Search collections
+ Search data

Lessons learned:
+ file-level metadata hard

  • hard to fix errors across thousands of files
  • hard to display thousands of files
  • hard to generate

+ submission process too cumbersome for most users

  • majority of metadata is shared between files: creator, creation process, location, etc.
  • many researchers are not programmers
  • researchers have limited time and motivation

+ Lots of redundant information:

  • For a single contribution, most data objects carry identical metadata.
  • could be solved by pushing subcat-type categories into the database

+ Move to stand-alone collections

  • contributors will only need to fill in the collection information
  • shorten search path from collection to locations

+ better to have lots of collections, than lots of files
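
One way to picture the lesson about redundant file-level metadata is to hold shared fields once per collection and store only per-file differences. The sketch below is an assumption-laden illustration, not DatCat's actual database design; all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Collection:
    name: str
    shared: Dict[str, str] = field(default_factory=dict)            # stored once
    files: Dict[str, Dict[str, str]] = field(default_factory=dict)  # per-file overrides

    def add_file(self, filename, **overrides):
        """Register a file, recording only the fields that differ."""
        self.files[filename] = dict(overrides)

    def metadata_for(self, filename):
        """Effective metadata: shared fields overlaid with file overrides."""
        merged = dict(self.shared)
        merged.update(self.files[filename])
        return merged

    def fix_shared(self, field_name, value):
        """Correct a shared field once; every file sees the fix."""
        self.shared[field_name] = value
```

With this layout, correcting an error in a shared field is a single update instead of an edit across thousands of file records.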

GENI I&M Architecture

4:55pm Harry Mussman (GPO) slides
GENI I&M Architecture document (15min)
v0.1 DRAFT includes proposed I&M services and proposed configuration

Purpose:
+ Provide a comprehensive and ordered list of topics that must be addressed for a complete architecture
+ Identify the priority topics that the WG needs to address first
+ Pull together contributions by the WG through Spiral 2

Plan:
+ Now: v0.1 DRAFT completed, by GPO; see http://groups.geni.net/geni/wiki/GeniInstrumentationandMeasurementsArchitecture
+ By GEC8: v0.5 draft, by GPO, with contributions from WG
+ By GEC9: v1.0 draft, reviewed by WG

Document outline:

  1. Document Scope
  2. Introduction
  3. Definition and configuration of I&M services
  4. Interfaces, protocols and schema for Measurement Data (MD)
  5. Ownership of MD and privacy of owners
  6. Interfaces, protocols and APIs for using I&M services
  7. Basic GENI I&M use cases
  8. MD transport via the GENI Measurement Plane
  9. Discovery, authorization, assignment and binding of GENI I&M services
  10. Measurement Orchestration (MO) service
  11. Measurement Point (MP)
  12. Time-stamping MD
  13. Measurement Collection (MC) service
  14. Measurement Analysis and Presentation (MAP) service
  15. Measurement Data Archive (MDA) service
  16. Additional GENI I&M use cases

Based on GENI I&M Capabilities Catalog (v0.1), these GENI projects have comprehensive, end-to-end capabilities:
+ OML (ORBIT Measurement Library) in OMF (ORBIT Mgmt Framework) (Ott, NICTA and Gruteser, WINLAB/Rutgers, 1660)
+ Instrumentation Tools (Griffioen, Univ Kentucky, 1642)
+ perfSONAR for network measurements (Zekauskas, I2 and Swany, Univ Delaware, 1788)
+ Scalable Sensing Service (Fahmy, Purdue and Sharma, HP Labs, 1723)
+ OnTimeMeasure (Calyam, Ohio Super Ctr, 1764)

After considering projects with comprehensive, end-to-end capabilities, here are five services they have in common:
+ Measurement Orchestration (MO) service (part of Experiment Control service; uses a language to orchestrate I&M services)
+ Measurement Point (MP) service (instrumentation that taps into a network and/or systems, links and/or nodes, to capture measurement data and format it using a standardized schema)
+ Measurement Collection (MC) service (programmable systems that collect, combine, transform and cache measurement data)
+ Measurement Analysis and Presentation (MAP) service (programmable systems that analyze and then present measurement data)
+ Measurement Data Archive (MDA) service (measurement data repository, index and portal)

Expected range of implementations:
+ Small-scale implementations might put all I&M services within one aggregate, or even in one server; interfaces between services would then be internal to the aggregate, or even internal to the server.
+ Large-scale implementations might have I&M services distributed over many aggregates, with measurement data flowing between services and orchestration mechanisms based upon message exchanges.
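
The small-scale case, with all five services in one server and internal interfaces, can be sketched as plain in-process calls. This is only an illustration of how the services hand data to each other: the class names, method names and record layout are assumptions, and a large-scale deployment would replace the direct calls with message exchanges.

```python
class MeasurementPoint:
    """MP: captures raw values and formats them as records."""

    def __init__(self, name, samples):
        self.name = name
        self.samples = samples

    def capture(self):
        return [{"source": self.name, "value": v} for v in self.samples]


class MeasurementCollection:
    """MC: collects and combines records from several MPs."""

    def collect(self, points):
        combined = []
        for point in points:
            combined.extend(point.capture())
        return combined


class MeasurementDataArchive:
    """MDA: stores records for later retrieval."""

    def __init__(self):
        self.records = []

    def store(self, records):
        self.records.extend(records)


class MeasurementAnalysisPresentation:
    """MAP: analyzes records and returns a summary for presentation."""

    def summarize(self, records):
        values = [r["value"] for r in records]
        return {"count": len(values), "mean": sum(values) / len(values)}


def orchestrate(points):
    """MO: drives the other four services in order."""
    mc = MeasurementCollection()
    mda = MeasurementDataArchive()
    map_service = MeasurementAnalysisPresentation()
    records = mc.collect(points)
    mda.store(records)
    return map_service.summarize(records), mda
```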

Discussion topics:
+ Are these five services a complete group of I&M services?
+ Are these good names for the five I&M services?
+ Is five the right granularity for I&M services?
+ Is this a complete and flexible configuration for I&M services?
+ Can this configuration accommodate the range from small-scale to large-scale implementations?
+ How can we obtain a consensus, so that we can set a firm foundation for the other topics?

Interfaces, protocols and schema for measurement data:

Issues:
+ This topic suggested at GEC6 meeting: Common schema for MD
+ Can we identify a common set of interfaces, protocols and schema for MD, or at least a limited number of types?
+ What needs to be included in the MD schema?

Approach:
+ Assume all MD after MPs follows this common set of interfaces, protocols and schema
+ Start with definition of MD schema
+ Next, understand [8. MD Transport via GENI Measurement Plane]
+ Then, complete first set of interfaces and protocols

From GENI I&M Capabilities Catalog (v0.1), these GENI projects (and others) are working on data schema and/or data archives:
+ perfSONAR for network measurements (Swany, Univ Delaware, 1788)
+ IMF project (Dutta, NC State, 1718)
+ Embedded Real-Time Measurements (Bergman, Columbia, 1631)
+ GENI Meta-Operations Center (Herron, Indiana Univ, 1604)
+ netKarma: GENI Provenance Registry (Plale and Small, Indiana Univ, 1706)
+ DatCat project at http://www.datcat.org/ (claffy, CAIDA)
+ CRAWDAD project at http://crawdad.cs.dartmouth.edu/ (Kotz, Dartmouth)
+ Amazon Simple Storage Service
+ Data-Intensive Cloud Control (Zink and Cecchet, UMass Amherst, 1709)
+ Experiment Mgmt System (Lannom and Manepalli, CNRI, 1663)
+ others?

What can we learn from these projects?

Discussion topics:
+ Standardized interfaces between measurement services

  • Pt-to-pt vs pt-to-multipoint (e.g., pub/sub)
  • Stream vs bulk transfer
  • Disconnected operation expected, or not

+ Protocols for moving measurement data

  • Streaming data
  • Bulk transfer of data

+ Schema for measurement data

  • Data record identifier
  • Annotation, or meta data
  • Data types and values, with timestamps

+ How can we obtain a consensus on a first set of interfaces/protocols/schema for MD?
+ What is the process for extending the set?
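
To ground the discussion of schema contents and of stream versus bulk transfer, here is a minimal sketch of a record carrying an identifier, annotations, a typed value and a timestamp, serialized both ways. The field names and the use of JSON are assumptions for illustration; the WG had not agreed on a format.

```python
import json


def make_record(record_id, annotations, value, timestamp):
    """Build one MD record: identifier, annotations, typed value, timestamp."""
    return {
        "id": record_id,
        "annotations": annotations,
        "type": type(value).__name__,
        "value": value,
        "timestamp": timestamp,
    }


def stream(records):
    """Streaming: emit one JSON line per record as it becomes available."""
    for record in records:
        yield json.dumps(record, sort_keys=True)


def bulk(records):
    """Bulk transfer: serialize the whole batch as a single JSON document."""
    return json.dumps(list(records), sort_keys=True)
```

Both paths carry the same records; they differ only in whether the receiver can begin processing before the sender has finished.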

GENI measurement plane:

Issue: Need to understand how MD traffic flows are transported by the GENI Measurement Plane before the interfaces and protocols for MD can be fully defined

Approach:
+ Understand current view of GENI Control Plane and Experiment Plane
+ Consider options for GENI Measurement Plane to transport MD flows, using networks that implement GENI Control and Experiment Planes

Priority topics:
+ Common terminology; best granularity of functions
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data?
+ Measurement Plane; options; expect nodes with 3 or 2 NICs

Next Steps for WG

5:10pm Bruce Maggs (Duke University)

Wrap-up, review of action items and issues for plenary

5:25pm Harry Mussman (GPO)
slides

5:30pm Adjourn

6:30pm BoF dinner, organized by Harry Mussman, location Parizade Restaurant
