Version 41 (modified by hmussman@bbn.com, 14 years ago) (diff)

I&M WG meeting at GEC7: Agenda and Notes

Wednesday, March 17, 2010, 3:30pm - 5:30pm
Room: Presidents 2

Introductions

3:30pm All

Major WG issues and goals

3:35pm Paul Barford (University of Wisconsin)
slides (15min)

Goals:
+ An infrastructure for gathering, analyzing and archiving measurements in GENI

  • Support for broad range of experiments
  • Tight integration with control framework(s)
  • Efficient, easy to use, broadly deployed, diverse capabilities, secure, privacy-aware, etc.

+ WG facilitates communication and coordination between I & M projects

  • Toward reaching overall goals quickly and effectively
  • Includes architectures, specifications, partnerships

Issues:
+ Architectural (v0.1 on wiki)

  • Use cases
  • Schema
  • Sensors, components and protocols
  • UI and integration
  • Authentication and privacy
  • Storage and analysis

+ Practical

  • Test and evaluation
  • Deployment, configuration and support

Project Reviews

Each of the following speakers was asked to include in their talk a review of how they address the following GENI I&M architecture priority topics:
+ Common terminology; best granularity of functions
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data?
+ Measurement plane; options; expect nodes with 3 or 2 NICs

Instrumentation Tools Project (1642)

3:50pm Jim Griffioen (University of Kentucky)
slides

Goals:
+ Integrate Univ of Kentucky Emulab into ProtoGENI (completed in year 1)
+ Reimplement Univ of Kentucky Emulab instrumentation and measurement tools to work in the ProtoGENI environment.
+ Support automatic generation of instrumentation and measurement infrastructure on a per-slice basis.

Structure:
+ Measurement Points
+ Measurement Controller

Functional Components

  1. Setup: deploy and initialize topology-specific software and services
  2. Capture: capture measurement data
  3. Collection: move data to processing/storage environments
  4. Storage: store data on a temporary, short term, long term, and archival basis
  5. Processing: filter, convert, aggregate, summarize, etc., data
  6. Presentation: present data to users in meaningful ways
  7. Access Protection: protect resources and data
  8. Measurement Control: Dynamically control the above components

Note: Conventional network management solutions exist for 2. through 8.

Mapping functions to proposed services:

  1. Setup – MO?
  2. Capture - MP
  3. Control – MO?
  4. Collection - MC
  5. Storage - MDA
  6. Processing - MAP
  7. Access Control – MO?
  8. Presentation - MAP
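
As a reading aid, the mapping above can be written as a small lookup table. The following Python sketch is illustrative only: the dictionary name and the two helper functions are assumptions, and the boolean flag simply records the "?" that the notes attach to the MO entries.

```python
# Functional components mapped to the proposed GENI I&M services listed above.
# The second tuple element is True where the notes mark the mapping with "?"
# (that is, the assignment to MO was still tentative).
FUNCTION_TO_SERVICE = {
    "Setup": ("MO", True),
    "Capture": ("MP", False),
    "Control": ("MO", True),
    "Collection": ("MC", False),
    "Storage": ("MDA", False),
    "Processing": ("MAP", False),
    "Access Control": ("MO", True),
    "Presentation": ("MAP", False),
}


def functions_for(service):
    """Return the functions assigned to a service, in list order."""
    return [f for f, (s, _) in FUNCTION_TO_SERVICE.items() if s == service]


def tentative_functions():
    """Return the functions whose service assignment is marked with '?'."""
    return [f for f, (_, tentative) in FUNCTION_TO_SERVICE.items() if tentative]
```

For example, `functions_for("MAP")` lists both Processing and Presentation, which shows that one proposed service covers two of the eight functions.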

Measurement data schema:
+ Data-specific formats will exist at multiple times and places but should be invisible to the generic measurement infrastructure
+ Should leverage expertise of those with experience in this area (DatCat, PerfSONAR, etc).

Measurement plane:
+ Virtual interfaces abound in GENI.

  • Unknown how virtual interfaces map to physical NICs.

+ Shared paths are OK, but require QoS.

  • Setting up QoS planes is challenging; particularly if planes are per-slice.

OMF/OML Project (1660)

4:05pm Max Ott (NICTA)
slides

Goals:

  • All experiment output in one place
  • Capturing everything – most importantly meta data
  • Separation of concerns

– Instrumenting
– Collecting

  • Minimizing measurement collection overhead

– Time
– Traffic interference

  • Support for steerable experiments

– Access to data in different places

Concepts:
+ MPoints
+ Filters
– Stddev
– Average
– First
– Histogram
+ Processing/Caching
+ Steering or Feedback
+ MStreams
+ Dynamic Schema
+ Visualization
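
The filter concepts listed above can be pictured as functions applied to a window of samples taken from an MPoint. The sketch below is not the OML API: the function names, the windowing, the `bin_width` parameter, and the choice of population standard deviation are all assumptions made for illustration.

```python
import statistics
from collections import Counter


def first(samples):
    """Return the first sample in the window."""
    return samples[0]


def average(samples):
    """Return the arithmetic mean of the window."""
    return statistics.mean(samples)


def stddev(samples):
    """Return the population standard deviation (an assumed choice)."""
    return statistics.pstdev(samples)


def histogram(samples, bin_width):
    """Count samples per bin; keys are bin indices (sample // bin_width)."""
    counts = Counter(int(s // bin_width) for s in samples)
    return dict(sorted(counts.items()))


# Hypothetical registry of single-argument filters.
FILTERS = {"first": first, "average": average, "stddev": stddev}


def apply_filters(samples, names):
    """Apply the named filters to one window of samples."""
    return {name: FILTERS[name](samples) for name in names}
```

Filtering at the measurement point in this way reduces what has to be sent to the collector, which is in line with the stated goal of minimizing collection overhead.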

Supported applications:

  • Traffic Generation/Measurements

– OTG … Traffic Generator
– Iperf

  • Monitoring

– Libtrace
– Libsigar
– Spectrum Analyzer
– GPS
– (Weather)

  • Components

– TinyOS/Motes
– (GnuRadio)

LAMP using perfSONAR (1788)

4:20pm Guilherme Fernandes (for Martin Swany) (University of Delaware)
slides
+ perfSONAR is a multi-domain performance monitoring framework, which defines a set of protocol standards for sharing data between measurement and monitoring systems

Architecture:
+ Interoperable network measurement middleware designed as a Service Oriented Architecture (SOA):

  • Components are Web Services (WS) based

+ Several unique components and design considerations, all of which operate in a cooperative yet independent manner

  • Each functionality is separated into a specific function
  • Clients and servers interact through scripted, XML-based protocols
  • Measurement data is encoded in expressive XML formats

Components:
+ Infrastructure

  • Lookup Service
  • Topology Service
  • Authentication Service

+ Services

  • Measurement Point (MP) Service
  • Measurement Archive (MA) Service
  • Resource protector

+ Analysis and visualization

Open protocols and schema:
+ Base network measurement schema

  • OGF Network Measurement Working Group

+ Topology Schema

  • OGF Network Markup Language (NML-)WG
  • Includes Topology Network ID

+ perfSONAR Protocol Documents

  • OGF Network Measurement and Control (NMC-)WG

Base network measurement schema:
+ Measurement Data, a set of measurement events that have some value or values at a particular time
+ Measurement Metadata, the details about the set of measurement data

Measurement metadata:
+ Subject (Noun)

  • The measured/tested entity (who)
  • E.g. A pair of hosts (end-point-pair), or a Layer 3 interface

+ EventType (Verb)

  • What type of measurement, value, or event occurred
  • Characteristic, tool output, or generic event
  • E.g. latency, bandwidth, utilization, or simply iperf

+ Parameters (Adjectives and Adverbs)

  • How, or under what conditions, did this event occur?
  • E.g. buffer sizes used, TCP vs ICMP packets

+ Key

  • Shortcut substituted in place of previous three items
  • No predefined format

Measurement data:
+ Datum: The actual result (values) of measurement.

  • Can contain time (e.g. a Time element or attribute).
  • Existence of an event might point to the case where there is no additional value
  • As in “Link up/down” or threshold events

+ Time: Representation of a time stamp or time range in a specified format.

  • Must be extensible since even agreement about the right structure is not easy, e.g. UNIX timestamp vs NTP time
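
The metadata/data split described above can be sketched with a few Python data classes. This is a minimal illustration, not the NM-WG schema itself: the class names, field names, the `KeyRegistry` helper, and the use of a UNIX timestamp are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metadata:
    subject: str                      # the measured entity (who)
    event_type: str                   # what was measured, e.g. "latency"
    parameters: dict = field(default_factory=dict)  # how / under what conditions


@dataclass
class Datum:
    time: float                       # assumed here to be a UNIX timestamp
    value: Optional[float] = None     # None for pure events such as link up/down


class KeyRegistry:
    """Hypothetical store that hands out a Key as a shortcut for
    a (subject, event type, parameters) triple."""

    def __init__(self):
        self._by_key = {}

    def register(self, metadata):
        key = f"k{len(self._by_key) + 1}"   # no predefined key format
        self._by_key[key] = metadata
        return key

    def resolve(self, key):
        return self._by_key[key]
```

A client that has registered a `Metadata` object once can later refer to it by key alone, which is the shortcut role the notes give to the Key element.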

Schema namespaces and extensibility:
+ A namespace: http://ggf.org/ns/nmwg/base/2.0/

  • MAY NOT be a URL

+ We encode the measurement/event type in the namespace (and as a standalone element)
+ We use Data and Metadata elements and vary the namespaces of the specific elements
+ Extensibility achieved through hierarchy with delegation

  • Similar to OIDs in the IETF management world

+ The NM-WG has a hierarchy of network characteristics

+ However, not all tools are cleanly mapped onto the Characteristic space

  • Often a matter of some debate

+ Organization-rooted tools namespace addresses this
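
To make the namespace idea concrete, the following Python sketch builds a small XML fragment in which generic Metadata and Data elements live in the base namespace while the tool-specific elements live in a separate one. Only the base namespace string comes from the notes. The `message` wrapper, the attribute names (`id`, `metadataIdRef`, `value`), the sample value, and the `example.org` tool namespace are assumptions, and the output has not been validated against the NM-WG schema.

```python
import xml.etree.ElementTree as ET

NMWG = "http://ggf.org/ns/nmwg/base/2.0/"         # base namespace from the notes
IPERF = "http://example.org/ns/tools/iperf/1.0/"  # hypothetical tools namespace


def build_message():
    """Build a metadata/data pair whose specific elements vary by namespace."""
    root = ET.Element(f"{{{NMWG}}}message")       # assumed wrapper element

    meta = ET.SubElement(root, f"{{{NMWG}}}metadata", {"id": "meta1"})
    ET.SubElement(meta, f"{{{IPERF}}}subject")
    # The event type appears as a standalone element as well as in the namespace.
    event = ET.SubElement(meta, f"{{{NMWG}}}eventType")
    event.text = IPERF

    data = ET.SubElement(root, f"{{{NMWG}}}data", {"metadataIdRef": "meta1"})
    ET.SubElement(data, f"{{{IPERF}}}datum", {"value": "941.2"})  # made-up value
    return root


if __name__ == "__main__":
    print(ET.tostring(build_message(), encoding="unicode"))
```

A consumer that only understands the base namespace can still pair each data element with its metadata, while tool-aware consumers can interpret the namespaced subject and datum.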

Topology Schema:
+ Topology schema grew from network measurement description

  • Reusable “Subject” elements for common cases
  • Also reduces redundancy
  • Relationships between measurement Subjects

+ Structured by layers and the same elements recurring there (Base, L2, L3, L4)

  • networks as graphs

+ Elements:

  • Domain
  • Node
  • Port
  • Link
  • Network
  • Path
  • Service

+ Varied by namespaces (extensibility)

  • Reuse visualization logic, etc.
  • Validate layer- or technology-specific attributes

+ Used by perfSONAR, IDC Protocol (ION, OSCARS, AutoBAHN), Phoebus

  • Currently calling it the UNIS Topology Schema

+ OGF NML-WG to unify NDL and UNIS Topology schema

  • Happening as we speak at OGF28
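
The "networks as graphs" view can be sketched with two of the listed elements, Port and Link, where each port belongs to a node. The class and field names below are assumptions for illustration and do not reproduce the UNIS topology schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    node: str   # name of the Node that owns this port
    name: str   # port name on that node, e.g. "eth0"


@dataclass(frozen=True)
class Link:
    a: Port
    b: Port


def adjacency(links):
    """Derive an undirected node-level graph from port-to-port links."""
    graph = {}
    for link in links:
        graph.setdefault(link.a.node, set()).add(link.b.node)
        graph.setdefault(link.b.node, set()).add(link.a.node)
    return graph
```

Because a measurement Subject such as an end-point pair can point at nodes or ports in this graph, the same topology description can be reused across many measurements.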

LAMP objectives:
+ Collaborate on defining a common but extensible format for data storage and exchange for GENI I&M systems

  • Use perfSONAR NM-WG schema as starting point
  • Identify new characteristics/tools namespaces

+ Develop a representation of GENI topology to be used to describe measurements and experiment configuration

  • UNIS topology schema can be easily extended

+ Collaborate with related GENI measurement and security projects on a common GENI I&M architecture

  • The new GENI I&M Arch. Draft defines very similar services (MP, MC, MDA, MAP), and new ones (MO)
  • perfSONAR is a good starting point, not currently a final solution (for GENI)
  • Use cases have been different, but much can be reused and the framework can be extended

DatCat Project

4:35pm Brad Huffaker (CAIDA, UC San Diego Supercomputing Center)
slides

Goal:
+ DatCat was designed to improve data sharing by providing a unified metadata database for Internet data.
+ Make the following easy for users:

  • finding data sets of interest
  • adding new data sets to the catalog
  • annotating data sets in the catalog

+ DOES NOT store data

Database scheme:
+ Collection

  • logical group of files (paper, project, ...)

+ Data files

  • raw data files (traceroutes, log dumps, ...)

+ Packages

  • downloadable files (single file, tarball, ...)

+ Locations

  • how to get the packages (URL, contact address, ...)
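
The four object types above can be pictured as a chain from a collection down to the places where its packages can be obtained. The Python sketch below assumes that data files are grouped into packages and that packages have locations. The notes do not spell out that nesting, and all class and field names are illustrative rather than DatCat's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    how: str                                  # URL, contact address, ...


@dataclass
class Package:
    name: str                                 # single file, tarball, ...
    locations: list = field(default_factory=list)


@dataclass
class DataFile:
    name: str                                 # raw data file
    packages: list = field(default_factory=list)


@dataclass
class Collection:
    name: str                                 # logical group (paper, project, ...)
    data_files: list = field(default_factory=list)


def locations_for(collection):
    """Walk collection -> data files -> packages -> locations, without repeats."""
    found = []
    for data_file in collection.data_files:
        for package in data_file.packages:
            for location in package.locations:
                if location.how not in found:
                    found.append(location.how)
    return found
```

Note that the catalog holds only this metadata; the data itself stays wherever the locations point.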

Annotations:
+ Provide an extensible naming space for assigning domain specific values to files.
+ each user has their own hierarchical name space

  • passive.IPv4.packet_count
  • active.RTT_95th_percentile

+ both data contributors and general DatCat users may attach annotations
+ any user may assign “note” annotations to any object
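
The dotted annotation names above imply a tree per user. As a minimal sketch, the following Python function turns dotted names into a nested dictionary. The function name is an assumption, and any values passed in are made up for illustration.

```python
def build_namespace(annotations):
    """Turn {"a.b.c": value} style annotations into a nested dictionary."""
    tree = {}
    for dotted_name, value in annotations.items():
        parts = dotted_name.split(".")
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})   # descend, creating levels as needed
        node[parts[-1]] = value
    return tree
```

With this layout, everything a user has recorded under `passive.IPv4` can be read from a single subtree.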

Metadata fields:
+ collection

  • fields: name, contents, summary, motivation, creators/primary contact/contributor, start/end time, keywords, short description/description/description URL
  • annotations: note

+ data

  • fields: name, creators/primary contact/contributor, keywords, format, file size, start/end time, duration, geographic/network location, time zone, MD5, description, creation process
  • annotations: passive.IPv4.packet_count, passive.IPv4.TCP.dst.port_count, cfg.passive.capture_len, AS_count, active.trace_count, active.RTT_10th_percentile, .....

+ location

  • fields: package, creators, primary contact, status, download procedure, download URL, geographic/logistic location, availability

Submission tools:
+ Perl API

  • useful for integrating into existing data management systems
  • flexible, but requires writing code

+ subcat

  • different approach (declarative)
  • describe metadata in human-friendly text files (YAML)
  • CAIDA provides tools to extract additional metadata (data-to-yaml)
  • subcat intuitively joins information together

DatCat web portal:
+ Browse collections
+ Search collections
+ Search data

Lessons learned:
+ file-level metadata is hard

  • hard to fix errors across thousands of files
  • hard to display thousands of files
  • hard to generate

+ submission process too cumbersome for most users

  • the majority of metadata is shared between files: creator, creation process, location, etc.
  • many researchers are not programmers
  • researchers have limited time and motivation

+ Lots of redundant information:

  • For a single contribution, most data objects have identical metadata.
  • could be solved by pushing subcat-type categories into the database

+ Move to stand-alone collections

  • contributors will only need to fill in the collection information
  • shorten search path from collection to locations

+ better to have lots of collections, than lots of files
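
The redundancy lesson can be illustrated by factoring the fields that are identical across all files of a contribution up to the collection level. This Python sketch is an assumption about how such factoring might look and is not a description of subcat or of the DatCat database.

```python
def factor_shared(records):
    """Split per-file metadata into (shared fields, per-file remainder)."""
    if not records:
        return {}, []
    first = records[0]
    # A field is shared only if every record has it with the same value.
    shared = {
        key: value
        for key, value in first.items()
        if all(key in record and record[key] == value for record in records)
    }
    per_file = [
        {key: value for key, value in record.items() if key not in shared}
        for record in records
    ]
    return shared, per_file
```

After factoring, an error in a shared field such as the creator needs to be fixed once instead of across thousands of files.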

GENI I&M Architecture

4:55pm Harry Mussman (GPO) slides
GENI I&M Architecture document (15min)
v0.1 DRAFT includes proposed I&M services and proposed configuration

Priority topics:
+ Common terminology; best granularity of functions
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data?
+ Measurement Plane; options; expect nodes with 3 or 2 NICs

Next Steps for WG

5:10pm Bruce Maggs (Duke University)

Wrap-up, review of action items and issues for plenary

5:25pm Harry Mussman (GPO)
slides

5:30pm Adjourn

6:30pm BoF dinner, organized by Harry Mussman, location Parizade Restaurant
