I&M WG meeting at GEC7: Agenda and Notes
Wednesday, March 17, 2010, 3:30pm - 5:30pm
Room: Presidents 2
Introductions
3:30pm All
Major WG issues and goals
3:35pm Paul Barford (University of Wisconsin)
slides (15min)
Goals:
+ An infrastructure for gathering, analyzing and archiving measurements in GENI
- Support for broad range of experiments
- Tight integration with control framework(s)
- Efficient, easy to use, broadly deployed, diverse capabilities, secure, privacy-aware, etc.
+ WG facilitates communication and coordination between I & M projects
- Toward reaching overall goals quickly and effectively
- Includes architectures, specifications, partnerships
Issues:
+ Architectural (v0.1 on wiki)
- Use cases
- Schema
- Sensors, components and protocols
- UI and integration
- Authentication and privacy
- Storage and analysis
+ Practical
- Test and evaluation
- Deployment, configuration and support
Project Reviews
Each of the following speakers was asked to include in their talk a review of how they address the following GENI I&M architecture priority topics:
+ Common terminology; best granularity of functions
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data?
+ Measurement plane; options; expect nodes with 3 or 2 NICs
Instrumentation Tools Project (1642)
3:50pm Jim Griffioen (University of Kentucky)
slides
Goals:
+ Integrate Univ of Kentucky Emulab into ProtoGENI (completed in year 1)
+ Reimplement Univ of Kentucky Edulab instrumentation and measurement tools to work in the ProtoGENI environment.
+ Support automatic generation of instrumentation and measurement infrastructure on a per-slice basis.
Structure:
+ Measurement Points
+ Measurement Controller
Functional Components
1. Setup: deploy and initialize topology-specific software and services
2. Capture: capture measurement data
3. Collection: move data to processing/storage environments
4. Storage: store data on a temporary, short term, long term, and archival basis
5. Processing: filter, convert, aggregate, summarize, etc., data
6. Presentation: present data to users in meaningful ways
7. Access Protection: protect resources and data
8. Measurement Control: dynamically control the above components
Note: Conventional network management solutions exist for items 2 through 8 (Capture through Measurement Control).
Mapping functions to proposed services:
- Setup – MO?
- Capture - MP
- Control – MO?
- Collection - MC
- Storage - MDA
- Processing - MAP
- Access Control – MO?
- Presentation - MAP
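The mapping above can be written down as a small lookup table, which makes it easy to see which functions each proposed service would cover. This is only an illustrative Python sketch: the names `FUNCTION_TO_SERVICE` and `functions_by_service` are hypothetical, and the entries recorded as "MO?" in the notes are kept as tentative.

```python
from collections import defaultdict

# Function -> (proposed service, tentative?) as listed in the notes.
# Entries written as "MO?" in the notes are flagged tentative=True.
FUNCTION_TO_SERVICE = {
    "Setup": ("MO", True),
    "Capture": ("MP", False),
    "Control": ("MO", True),
    "Collection": ("MC", False),
    "Storage": ("MDA", False),
    "Processing": ("MAP", False),
    "Access Control": ("MO", True),
    "Presentation": ("MAP", False),
}


def functions_by_service(mapping):
    """Invert the table: service -> list of functions it would cover."""
    inverted = defaultdict(list)
    for function, (service, _tentative) in mapping.items():
        inverted[service].append(function)
    return dict(inverted)
```

Inverting the table shows, for example, that MAP would carry both Processing and Presentation, while all three tentative assignments fall on MO.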
Measurement data schema:
+ Data-specific formats will exist at multiple times and places but should be invisible to the generic measurement infrastructure
+ Should leverage expertise of those with experience in this area (DatCat, PerfSONAR, etc).
Measurement plane:
+ Virtual interfaces abound in GENI.
- Unknown how virtual interfaces map to physical NICs.
+ Shared paths are OK, but require QoS.
- Setting up QoS planes is challenging; particularly if planes are per-slice.
OMF/OML Project (1660)
4:05pm Max Ott (NICTA)
slides
Goals:
- All experiment output in one place
- Capturing everything – most importantly meta data
- Separation of concerns
– Instrumenting
– Collecting
- Minimizing measurement collection overhead
– Time
– Traffic interference
- Support for steerable experiments
– Access to data in different places
Concepts:
+ MPoints
+ Filters
– Stddev
– Average
– First
– Histogram
+ Processing/Caching
+ Steering or Feedback
+ MStreams
+ Dynamic Schema
+ Visualization
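To make the filter concept concrete, here is a minimal Python sketch of the four filters named above, applied to a window of samples taken from one measurement point. It does not use the OML API; the function names, the `apply_filters` helper, and the `bin_width` parameter are assumptions made for illustration.

```python
import math
from collections import Counter


def first(samples):
    """Keep only the first sample of the window."""
    return samples[0]


def average(samples):
    """Arithmetic mean of the window."""
    return sum(samples) / len(samples)


def stddev(samples):
    """Population standard deviation of the window."""
    mean = average(samples)
    return math.sqrt(sum((x - mean) ** 2 for x in samples) / len(samples))


def histogram(samples, bin_width=5):
    """Count samples per bin; each bin is keyed by its lower edge."""
    counts = Counter((x // bin_width) * bin_width for x in samples)
    return dict(sorted(counts.items()))


# Hypothetical registry so a filter can be selected by name.
FILTERS = {
    "first": first,
    "average": average,
    "stddev": stddev,
    "histogram": histogram,
}


def apply_filters(samples, names):
    """Reduce one window of samples with each of the named filters."""
    return {name: FILTERS[name](samples) for name in names}
```

Reducing a window at the source in this way is one route to the goal of minimizing collection overhead: only the summary leaves the node, not every raw sample.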
Supported applications:
- Traffic Generation/Measurements
– OTG … Traffic Generator
– Iperf
- Monitoring
– Libtrace
– Libsigar
– Spectrum Analyzer
– GPS
– (Weather)
- Components
– TinyOS/Motes
– (GnuRadio)
LAMP using perfSONAR (1788)
4:20pm Guilherme Fernandes (for Martin Swany) (University of Delaware)
slides
+ perfSONAR is a multi-domain performance monitoring framework, which defines a set of protocol standards for sharing data between measurement and monitoring systems
Architecture:
+ Interoperable network measurement middleware designed as a Service Oriented Architecture (SOA):
- Components are Web Services (WS) based
+ Several unique components and design considerations, all of which operate in a cooperative yet independent manner
- Each functionality is separated into a specific function
- Clients and servers interact through scripted, XML-based protocols
- Measurement data is encoded in expressive XML formats
Components:
+ Infrastructure
- Lookup Service
- Topology Service
- Authentication Service
+ Services
- Measurement Point (MP) Service
- Measurement Archive (MA) Service
- Resource protector
+ Analysis and visualization
Open protocols and schema:
+ Base network measurement schema
- OGF Network Measurement Working Group
+ Topology Schema
- OGF Network Markup Language (NML-)WG
- Includes Topology Network ID
+ perfSONAR Protocol Documents
- OGF Network Measurement and Control (NMC-)WG
Base network measurement schema:
+ Measurement Data, a set of measurement events that have some value or values at a particular time
+ Measurement Metadata, the details about the set of measurement data
Measurement metadata:
+ Subject (Noun)
- The measured/tested entity (who)
- E.g. A pair of hosts (end-point-pair), or a Layer 3 interface
+ EventType (Verb)
- What type of measurement, value, or event occurred
- Characteristic, tool output, or generic event
- E.g. latency, bandwidth, utilization, or simply iperf
+ Parameters (Adjectives and Adverbs)
- How, or under what conditions, did this event occur?
- E.g. buffer sizes used, TCP vs ICMP packets
+ Key
- Shortcut substituted in place of previous three items
- No predefined format
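The subject / eventType / parameters / key structure can be sketched as a small data model. The Python below is not the NM-WG XML schema or any perfSONAR API; the class names, field names, and the `k1`, `k2`, ... key format are hypothetical (the notes state that keys have no predefined format).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Metadata:
    subject: str      # the measured/tested entity (noun)
    event_type: str   # what was measured (verb)
    # Conditions under which the measurement ran (adjectives and adverbs).
    parameters: Dict[str, str] = field(default_factory=dict)


class KeyRegistry:
    """Hands out opaque keys that stand in for a full Metadata description."""

    def __init__(self):
        self._by_key = {}

    def register(self, metadata):
        # Keys have no predefined format; this counter-based one is arbitrary.
        key = f"k{len(self._by_key) + 1}"
        self._by_key[key] = metadata
        return key

    def resolve(self, key):
        """Return the Metadata that a key was issued for."""
        return self._by_key[key]
```

A client that has registered a description once can then send only the key with later requests, which is the shortcut role the notes describe.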
Measurement data:
+ Datum: The actual result (values) of measurement.
- Can contain time (e.g. a Time element or attribute).
- Existence of an event might point to the case where there is no additional value
- As in “Link up/down” or threshold events
+ Time: Representation of a time stamp or time range in a specified format.
- Must be extensible since even agreement about the right structure is not easy, e.g. UNIX timestamp vs NTP time
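As a sketch of why the time representation needs to be extensible, the Python below tags each time value with its format and normalizes on demand, and it allows a datum that is a pure event with no value. The class and field names are assumptions for illustration; only the 2208988800-second offset between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01) is a fixed fact.

```python
from dataclasses import dataclass
from typing import Optional

# Seconds between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


@dataclass
class Time:
    fmt: str      # "unix" or "ntp"; further formats could be added later
    value: float  # seconds since the epoch implied by fmt

    def to_unix(self):
        """Normalize to seconds since the UNIX epoch."""
        if self.fmt == "unix":
            return self.value
        if self.fmt == "ntp":
            return self.value - NTP_UNIX_OFFSET
        raise ValueError(f"unknown time format: {self.fmt}")


@dataclass
class Datum:
    time: Time
    value: Optional[float] = None  # None for pure events
    event: Optional[str] = None    # e.g. a link up/down or threshold event
```

A link-down notification would then be `Datum(time=Time("unix", t), event="link-down")`, with `value` left unset.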
Schema namespaces and extensibility:
+ A namespace: http://ggf.org/ns/nmwg/base/2.0/
- MAY NOT be a URL
+ We encode the measurement/event type in the namespace (and as a standalone element)
+ We use Data and Metadata elements and vary the namespaces of the specific elements
+ Extensibility achieved through hierarchy with delegation
- Similar to OIDs in the IETF management world
+ The NM-WG has a hierarchy of network characteristics
- Good starting point
- E.g. http://ggf.org/ns/nmwg/characteristic/utilization/2.0
- E.g. http://ggf.org/ns/nmwg/characteristic/bandwidth/achievable/2.0
+ However, not all tools are cleanly mapped onto the Characteristic space
- Often a matter of some debate
+ Organization-rooted tools namespace addresses this
- Easy to add new tools in organization-specific namespaces
- E.g. http://ggf.org/ns/nmwg/tools/nuttcp/2.0
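The hierarchical namespaces listed above all follow a base / path / version pattern. The Python sketch below builds and takes apart such identifiers; the helper names `namespace` and `event_path` are hypothetical, and the code treats the namespace purely as a string (as noted above, it may not be a resolvable URL).

```python
BASE = "http://ggf.org/ns/nmwg"
VERSION = "2.0"


def namespace(*path, version=VERSION):
    """Join hierarchy components into a namespace identifier."""
    return "/".join([BASE, *path, version])


def event_path(ns):
    """Return the hierarchy components between the base and the version."""
    prefix = BASE + "/"
    if not ns.startswith(prefix):
        raise ValueError(f"not under {BASE}: {ns}")
    # Drop the prefix and any trailing slash, then split off the version.
    parts = ns[len(prefix):].strip("/").split("/")
    return parts[:-1]
```

For example, `namespace("tools", "nuttcp")` reproduces the nuttcp tool namespace shown above, and `event_path` recovers `["characteristic", "bandwidth", "achievable"]` from the achievable-bandwidth namespace.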
Topology Schema:
+ Topology schema grew from network measurement description
- Reusable “Subject” elements for common cases
- Also reduces redundancy
- Relationships between measurement Subjects
+ Structured by layers and the same elements recurring there (Base, L2, L3, L4)
- networks as graphs
+ Elements:
- Domain
- Node
- Port
- Link
- Network
- Path
- Service
+ Varied by namespaces (extensibility)
- Reuse visualization logic, etc.
- Validate layer- or technology-specific attributes
+ Used by perfSONAR, IDC Protocol (ION, OSCARS, AutoBAHN), Phoebus
- Currently calling it the UNIS Topology Schema
+ OGF NML-WG to unify NDL and UNIS Topology schema
- Happening as we speak at OGF28
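The "networks as graphs" idea can be illustrated with a toy model in which nodes own ports and links connect pairs of ports. The Python below is not the UNIS or NML schema; the class names, identifiers, and the `neighbors` helper are assumptions made only to show how the Node, Port, and Link elements relate.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Port:
    id: str
    node_id: str  # the node this port belongs to


@dataclass
class Link:
    id: str
    ports: Tuple[str, str]  # identifiers of the two ports it connects


@dataclass
class Topology:
    ports: Dict[str, Port] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_port(self, port_id, node_id):
        self.ports[port_id] = Port(port_id, node_id)

    def add_link(self, link_id, port_a, port_b):
        self.links.append(Link(link_id, (port_a, port_b)))

    def neighbors(self, node_id):
        """Nodes reachable from node_id over a single link."""
        found = set()
        for link in self.links:
            a, b = (self.ports[p].node_id for p in link.ports)
            if a == node_id and b != node_id:
                found.add(b)
            elif b == node_id and a != node_id:
                found.add(a)
        return sorted(found)
```

A measurement Subject such as an end-point pair can then refer to node or port identifiers in the topology instead of repeating their descriptions.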
LAMP objectives:
+ Collaborate on defining a common but extensible format for data storage and exchange for GENI I&M systems
- Use perfSONAR NM-WG schema as starting point
- Identify new characteristics/tools namespaces
+ Develop a representation of GENI topology to be used to describe measurements and experiment configuration
- UNIS topology schema can be easily extended
+ Collaborate with related GENI measurement and security projects on a common GENI I&M architecture
- The new GENI I&M Arch. Draft defines very similar services (MP, MC, MDA, MAP), and new ones (MO)
- perfSONAR is a good starting point, not currently a final solution (for GENI);
- Use cases have been different, but much can be reused and the framework can be extended
DatCat Project
4:35pm Brad Huffaker (CAIDA, UC San Diego Supercomputing Center)
slides
Goal:
+ DatCat was designed to improve data sharing by providing a unified metadata database for Internet data.
+ Make the following easy for users:
- finding data sets of interest
- adding new data sets to the catalog
- annotating data sets in the catalog
+ DOES NOT store data
Database scheme:
+ Collection
- logical group of files (paper, project, ...)
+ Data files
- raw data files (traceroutes, logs dumps, ...)
+ Packages
- downloadable files (single file, tarball, ...)
+ Locations
- how to get the packages (URL, contact address, ...)
Annotations:
+ Provide an extensible naming space for assigning domain specific values to files.
+ each user has their own hierarchical name space
- passive.IPv4.packet_count
- active.RTT_95th_percentile
+ both data contributors and general DatCat users may attach annotations
+ any user may assign “note” annotations to any object
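Dotted annotation names such as `passive.IPv4.packet_count` describe a path through a hierarchical name space. The Python sketch below turns a flat set of annotations into a nested tree; it is not DatCat code, the helper name `nest` is hypothetical, and the annotation values used in any example are made up.

```python
def nest(annotations):
    """Convert {"a.b.c": value} style annotations into nested dictionaries.

    Assumes no name is both a leaf and a parent of another name.
    """
    tree = {}
    for dotted_name, value in annotations.items():
        *parents, leaf = dotted_name.split(".")
        node = tree
        for part in parents:
            # Walk down the hierarchy, creating levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree
```

Since each user has their own name space, one such tree per user would keep different users' annotation names from colliding.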
Metadata fields:
+ collection
- fields: name, contents, summary, motivation, creators/primary contact/contributor, start/end time, keywords, short description/description/description URL
- annotations: note
+ data
- fields: name, creators/primary contact/contributor, keywords, format, file size, start/end time, duration, geographic/network location, time zone, MD5, description, creation process
- annotations: passive.IPv4.packet_count, passive.IPv4.TCP.dst.port_count, cfg.passive.capture_len, AS_count, active.trace_count, active.RTT_10th_percentile, ...
+ location
- fields: package, creators, primary contact, status, download procedure, download URL, geographic/logistic location, availability
Submission tools:
+ Perl API
- useful for integrating into existing data management systems
- flexible, but requires writing code
+ subcat
- different approach (declarative)
- describe metadata in human-friendly text files (YAML)
- CAIDA provides tools to extract additional metadata (data-to-yaml)
- subcat intuitively joins information together
DatCat web portal:
+ Browse collections
+ Search collections
+ Search data
Lessons learned:
+ file-level metadata hard
- hard to fix errors across thousands of files
- hard to display thousands of files
- hard to generate
+ submission process too cumbersome for most users
- the majority of metadata is shared between files (creator, creation process, location, etc.)
- many researchers are not programmers
- researchers have limited time and motivation
+ Lots of redundant information:
- Within a single contribution, most data objects carry identical metadata.
- could be solved by pushing subcat-type categories into the database
+ Move to stand-alone collections
- contributors will only need to fill in the collection information
- shorten search path from collection to locations
+ better to have lots of collections, than lots of files
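The redundancy lesson suggests recording shared metadata once at the collection level. As a rough illustration, the Python sketch below splits per-file records into the fields every file shares and the per-file remainder. It is not part of DatCat or subcat; the function name `factor_shared` and the field names in any example are hypothetical.

```python
def factor_shared(file_records):
    """Split per-file metadata into (shared_fields, per_file_remainder).

    A field is shared only if every record has it with the same value.
    """
    if not file_records:
        return {}, []
    first, *rest = file_records
    shared = {
        key: value
        for key, value in first.items()
        if all(key in record and record[key] == value for record in rest)
    }
    remainder = [
        {key: value for key, value in record.items() if key not in shared}
        for record in file_records
    ]
    return shared, remainder
```

With thousands of files that differ only in name and size, a contributor would fill in the shared fields once, and an error in them could be fixed in one place.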
GENI I&M Architecture
4:55pm Harry Mussman (GPO)
slides
GENI I&M Architecture document (15min)
v0.1 DRAFT includes proposed I&M services and proposed configuration
Purpose:
+ Provide a comprehensive and ordered list of topics that must be addressed for a complete architecture
+ Identify the priority topics that the WG needs to address first
+ Pull together contributions by the WG through Spiral 2
Plan:
+ Now: v0.1 DRAFT completed, by GPO; see http://groups.geni.net/geni/wiki/GeniInstrumentationandMeasurementsArchitecture
+ By GEC8: v0.5 draft, by GPO, with contributions from WG
+ By GEC9: v1.0 draft, reviewed by WG
Document outline:
1. Document Scope
2. Introduction
3. Definition and configuration of I&M services
4. Interfaces, protocols and schema for Measurement Data (MD)
5. Ownership of MD and privacy of owners
6. Interfaces, protocols and APIs for using I&M services
7. Basic GENI I&M use cases
8. MD transport via the GENI Measurement Plane
9. Discovery, authorization, assignment and binding of GENI I&M services
10. Measurement Orchestration (MO) service
11. Measurement Point (MP)
12. Time-stamping MD
13. Measurement Collection (MC) service
14. Measurement Analysis and Presentation (MAP) service
15. Measurement Data Archive (MDA) service
16. Additional GENI I&M use cases
Based on GENI I&M Capabilities Catalog (v0.1), these GENI projects have comprehensive, end-to-end capabilities:
+ OML (ORBIT Measurement Library) in OMF (ORBIT Management Framework)
- (Ott, NICTA and Gruteser, WINLAB/Rutgers, 1660)
+ Instrumentation Tools
- (Griffioen, Univ Kentucky, 1642)
+ perfSONAR for network measurements
- (Zekauskas, I2 and Swany, Univ Delaware, 1788)
+ Scalable Sensing Service
- (Fahmy, Purdue and Sharma, HP Labs, 1723)
- (Calyam, Ohio Super Ctr, 1764)
After considering projects with comprehensive, end-to-end capabilities, here are five services they have in common:
+ Measurement Orchestration (MO) service
- (part of Experiment Control service, uses a language to orchestrate I&M services)
+ Measurement Point (MP) service
- (instrumentation that taps into a network and/or systems, links and/or nodes, to capture measurement data and format it using a standardized schema)
+ Measurement Collection (MC) service
- (programmable systems that collect, combine, transform and cache measurement data)
+ Measurement Analysis and Presentation (MAP) service
- (programmable systems that analyze and then present measurement data)
+ Measurement Data Archive (MDA) service
- (measurement data repository, index and portal)
Expected range of implementations:
+ Small-scale implementations might put all I&M services within one aggregate, and even in one server
- interfaces between services would be internal to the aggregate, or even internal to the server
+ Large-scale implementations might have I&M services distributed over many aggregates
- with measurement data flowing between services
- with orchestration mechanisms based upon message exchanges
Discussion topics:
+ Are these five services a complete group of I&M services?
+ Are these good names for the five I&M services?
+ Are these five services at the right granularity for I&M services?
+ Is this a complete and flexible configuration for I&M services?
+ Can this configuration accommodate the range from small-scale to large-scale implementations?
+ How can we obtain a consensus, so that we can set a firm foundation for the other topics?
Interfaces, protocols and schema for measurement data:
+ Issues:
- This topic suggested at GEC6 meeting: Common schema for MD
- Can we identify a common set of interfaces, protocols and schema for MD, or at least a limited number of types?
- What needs to be included in the MD schema?
+ Approach:
- Assume all MD after MPs follows this common set of interfaces, protocols and schema
- Start with definition of MD schema
- Next, understand [8. MD Transport via GENI Measurement Plane]
- Then, complete first set of interfaces and protocols
From GENI I&M Capabilities Catalog (v0.1), these GENI projects (and others) are working on data schema and/or data archives:
+ perfSONAR for network measurements (Swany, Univ Delaware, 1788)
+ IMF project (Dutta, NC State, 1718)
+ Embedded Real-Time Measurements (Bergman, Columbia, 1631)
+ GENI Meta-Operations Center (Herron, Indiana Univ, 1604)
+ netKarma: GENI Provenance Registry (Plale and Small, Indiana Univ, 1706)
+ DatCat project at http://www.datcat.org/ (Claffy, CAIDA)
+ Crawdad project at http://crawdad.cs.dartmouth.edu/ (Kotz, Dartmouth)
+ Amazon Simple Storage Service
+ Data-Intensive Cloud Control (Zink and Cecchet, UMass Amherst, 1709)
+ Experiment Mgmt System (Lannom and Manepalli, CNRI, 1663)
+ others?
- What can we learn from these projects?
Discussion topics:
+ Standardized interfaces between measurement services
- Pt-to-pt vs pt-to-multipoint (e.g., pub/sub)
- Stream vs bulk transfer
- Disconnected operation expected, or not
+ Protocols for moving measurement data
- Streaming data
- Bulk-transfer of data
+ Schema for measurement data
- Data record identifier
- Annotation, or meta data
- Data types and values, with timestamps
+ How can we obtain a consensus on the first set of interfaces/protocols/schema for MD?
+ What is the process for extending the set?
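To ground the point-to-multipoint option and the schema items listed above, here is a minimal in-process publish/subscribe sketch in Python. It is not a proposed GENI interface: the `Record` fields simply mirror the three schema items (record identifier, annotation, timestamped values), and the `Broker` class, topic strings, and callback style are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Record:
    record_id: str          # data record identifier
    timestamp: float        # when the values were observed
    values: Dict[str, Any]  # typed measurement values
    # Annotation, or meta data, attached to the record.
    annotations: Dict[str, str] = field(default_factory=dict)


class Broker:
    """Delivers each published record to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, record):
        """Deliver a record and return the number of subscribers reached."""
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(record)
        return len(callbacks)
```

A point-to-point interface is the special case of one subscriber per topic. Streaming versus bulk transfer and disconnected operation are not modeled here.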
GENI measurement plane:
+ Issue:
- Need to understand how MD traffic flows are transported by the GENI Measurement Plane before the interfaces and protocols for MD can be fully defined
+ Approach:
- Understand current view of GENI Control Plane and Experiment Plane
- Consider options for GENI Measurement Plane to transport MD flows, using networks that implement GENI Control and Experiment Planes
Next Steps for WG
5:10pm Bruce Maggs (Duke University)
Wrap-up, review of action items and issues for plenary
5:25pm Harry Mussman (GPO)
slides
5:30pm Adjourn
6:30pm BoF dinner, organized by Harry Mussman, location Parizade Restaurant
Attachments (7)
- 031210 IM-ARCH-GEC7Slides.pdf (825.3 KB)
- barford gec7_imwg.pdf (32.1 KB)
- griffioen instools_imwg_gec7.pdf (213.8 KB)
- huffaker dat cat genie-gec07-201003.pdf (2.1 MB)
- ott OML - GEC7 - march.pdf (456.9 KB)
- fernandes swany GEC7-LAMP.ppt (601.0 KB)
- 031710 WGInstAndMeas_Report.ppt (228.5 KB)