[[PageOutline]]

= I&M WG meeting at GEC7: Agenda and Notes =

Wednesday, March 17, 2010, 3:30pm - 5:30pm [[BR]]
Room: Presidents 2 [[BR]]

== Introductions ==

3:30pm All [[BR]]

== Major WG issues and goals ==

3:35pm Paul Barford (University of Wisconsin) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/barford%20%20gec7_imwg.pdf slides] (15min) [[BR]]
Goals: [[BR]]
+ An infrastructure for gathering, analyzing and archiving measurements in GENI [[BR]]
- Support for a broad range of experiments [[BR]]
- Tight integration with control framework(s) [[BR]]
- Efficient, easy to use, broadly deployed, diverse capabilities, secure, privacy-aware, etc. [[BR]]
+ WG facilitates communication and coordination between I&M projects [[BR]]
- Toward reaching overall goals quickly and effectively [[BR]]
- Includes architectures, specifications, partnerships [[BR]]
Issues: [[BR]]
+ Architectural (v0.1 on wiki) [[BR]]
- Use cases [[BR]]
- Schema [[BR]]
- Sensors, components and protocols [[BR]]
- UI and integration [[BR]]
- Authentication and privacy [[BR]]
- Storage and analysis [[BR]]
+ Practical [[BR]]
- Test and evaluation [[BR]]
- Deployment, configuration and support [[BR]]

== Project Reviews ==

Each of the following speakers was asked to include in their talk a review of how they address the following GENI I&M architecture priority topics: [[BR]]
+ Common terminology; best granularity of functions [[BR]]
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data? 
[[BR]] + Measurement plane; options; expect nodes with 3 or 2 NICs [[BR]]

=== Instrumentation Tools Project (1642) ===

3:50pm Jim Griffioen (University of Kentucky) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/griffioen%20%20instools_imwg_gec7.pdf slides] [[BR]]
Goals: [[BR]]
+ Integrate Univ of Kentucky Emulab into ProtoGENI (completed in year 1) [[BR]]
+ Reimplement Univ of Kentucky Edulab instrumentation and measurement tools to work in the ProtoGENI environment. [[BR]]
+ Support automatic generation of instrumentation and measurement infrastructure on a per-slice basis. [[BR]]
Structure: [[BR]]
+ Measurement Points [[BR]]
+ Measurement Controller [[BR]]
Functional components: [[BR]]
1. Setup: deploy and initialize topology-specific software and services [[BR]]
2. Capture: capture measurement data [[BR]]
3. Collection: move data to processing/storage environments [[BR]]
4. Storage: store data on a temporary, short term, long term, and archival basis [[BR]]
5. Processing: filter, convert, aggregate, summarize, etc., data [[BR]]
6. Presentation: present data to users in meaningful ways [[BR]]
7. Access Protection: protect resources and data [[BR]]
8. Measurement Control: dynamically control the above components [[BR]]
Note: Conventional network management solutions exist for 2. through 8. [[BR]]
Mapping functions to proposed services: [[BR]]
1. Setup – MO? [[BR]]
2. Capture - MP [[BR]]
3. Control – MO? [[BR]]
4. Collection - MC [[BR]]
5. Storage - MDA [[BR]]
6. Processing - MAP [[BR]]
7. Access Control – MO? [[BR]]
8. Presentation - MAP [[BR]]
Measurement data schema: [[BR]]
+ Data-specific formats will exist at multiple times and places but should be invisible to the generic measurement infrastructure [[BR]]
+ Should leverage expertise of those with experience in this area (DatCat, PerfSONAR, etc). [[BR]]
Measurement plane: [[BR]]
+ Virtual interfaces abound in GENI. 
[[BR]] - Unknown how virtual interfaces map to physical NICs. [[BR]]
+ Shared paths are OK, but require QoS. [[BR]]
- Setting up QoS planes is challenging, particularly if planes are per-slice. [[BR]]

=== OMF/OML Project (1660) ===

4:05pm Max Ott (NICTA) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/ott%20%20OML%20-%20GEC7%20-%20march.pdf slides] [[BR]]
Goals: [[BR]]
• All experiment output in one place [[BR]]
• Capturing everything – most importantly meta data [[BR]]
• Separation of concerns [[BR]]
– Instrumenting [[BR]]
– Collecting [[BR]]
• Minimizing measurement collection overhead [[BR]]
– Time [[BR]]
– Traffic interference [[BR]]
• Support for steerable experiments [[BR]]
– Access to data in different places [[BR]]
Concepts: [[BR]]
+ MPoints [[BR]]
+ Filters [[BR]]
– Stddev [[BR]]
– Average [[BR]]
– First [[BR]]
– Histogram [[BR]]
+ Processing/Caching [[BR]]
+ Steering or Feedback [[BR]]
+ MStreams [[BR]]
+ Dynamic Schema [[BR]]
+ Visualization [[BR]]
Supported applications: [[BR]]
• Traffic Generation/Measurements [[BR]]
– OTG … Traffic Generator [[BR]]
– Iperf [[BR]]
• Monitoring [[BR]]
– Libtrace [[BR]]
– Libsigar [[BR]]
– Spectrum Analyzer [[BR]]
– GPS [[BR]]
– (Weather) [[BR]]
• Components [[BR]]
– TinyOS/Motes [[BR]]
– (GnuRadio) [[BR]]

=== LAMP using perfSONAR (1788) ===

4:20pm Guilherme Fernandes (for Martin Swany) (University of Delaware) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/fernandes%20%20swany%20%20GEC7-LAMP.ppt slides] [[BR]]
+ perfSONAR is a multi-domain performance monitoring framework, which defines a set of protocol standards for sharing data between measurement and monitoring systems [[BR]]
Architecture: [[BR]]
+ Interoperable network measurement middleware designed as a Service Oriented Architecture (SOA): [[BR]]
- Components are Web Services (WS) based [[BR]]
+ Several unique components and design considerations, all of which operate in a cooperative yet independent manner [[BR]]
- Each functionality is separated 
into a specific function [[BR]]
- Clients and servers interact through scripted, XML-based protocols [[BR]]
- Measurement data is encoded in expressive XML formats [[BR]]
Components: [[BR]]
+ Infrastructure [[BR]]
- Lookup Service [[BR]]
- Topology Service [[BR]]
- Authentication Service [[BR]]
+ Services [[BR]]
- Measurement Point (MP) Service [[BR]]
- Measurement Archive (MA) Service [[BR]]
- Resource protector [[BR]]
+ Analysis and visualization [[BR]]
Open protocols and schema: [[BR]]
+ Base network measurement schema [[BR]]
- OGF Network Measurement Working Group [[BR]]
+ Topology Schema [[BR]]
- OGF Network Markup Language (NML-)WG [[BR]]
- Includes Topology Network ID [[BR]]
+ perfSONAR Protocol Documents [[BR]]
- OGF Network Measurement and Control (NMC-)WG [[BR]]
Base network measurement schema: [[BR]]
+ Measurement Data, a set of measurement events that have some value or values at a particular time [[BR]]
+ Measurement Metadata, the details about the set of measurement data [[BR]]
Measurement metadata: [[BR]]
+ Subject (Noun) [[BR]]
- The measured/tested entity (who) [[BR]]
- E.g. a pair of hosts (end-point-pair), or a Layer 3 interface [[BR]]
+ EventType (Verb) [[BR]]
- What type of measurement, value, or event occurred [[BR]]
- Characteristic, tool output, or generic event [[BR]]
- E.g. latency, bandwidth, utilization, or simply iperf [[BR]]
+ Parameters (Adjectives and Adverbs) [[BR]]
- How, or under what conditions, did this event occur? [[BR]]
- E.g. buffer sizes used, TCP vs ICMP packets [[BR]]
+ Key [[BR]]
- Shortcut substituted in place of previous three items [[BR]]
- No predefined format [[BR]]
Measurement data: [[BR]]
+ Datum: The actual result (values) of measurement. [[BR]]
- Can contain time (e.g. 
a Time element or attribute). [[BR]]
- Existence of an event might point to the case where there is no additional value [[BR]]
- As in “Link up/down” or threshold events [[BR]]
+ Time: Representation of a time stamp or time range in a specified format. [[BR]]
- Must be extensible since even agreement about the right structure is not easy, e.g. UNIX timestamp vs NTP time [[BR]]
Schema namespaces and extensibility: [[BR]]
+ A namespace: http://ggf.org/ns/nmwg/base/2.0/ [[BR]]
- MAY NOT be a URL [[BR]]
+ We encode the measurement/event type in the namespace (and as a standalone element) [[BR]]
+ We use Data and Metadata elements and vary the namespaces of the specific elements [[BR]]
+ Extensibility achieved through hierarchy with delegation [[BR]]
- Similar to OIDs in the IETF management world [[BR]]
+ The NM-WG has a hierarchy of network characteristics [[BR]]
- Good starting point [[BR]]
- E.g. http://ggf.org/ns/nmwg/characteristic/utilization/2.0 [[BR]]
- E.g. http://ggf.org/ns/nmwg/characteristic/bandwidth/achievable/2.0 [[BR]]
+ However, not all tools are cleanly mapped onto the Characteristic space [[BR]]
- Often a matter of some debate [[BR]]
+ Organization-rooted tools namespace addresses this [[BR]]
- Easy to add new tools in organization-specific namespaces [[BR]]
- E.g. 
http://ggf.org/ns/nmwg/tools/nuttcp/2.0[[BR]] Topology Schema: [[BR]] + Topology schema grew from network measurement description [[BR]] - Reusable “Subject” elements for common cases [[BR]] - Also reduces redundancy [[BR]] - Relationships between measurement Subjects [[BR]] + Structured by layers and the same elements recurring there (Base, L2, L3, L4) [[BR]] - networks as graphs[[BR]] + Elements:[[BR]] - Domain[[BR]] - Node[[BR]] - Port[[BR]] - Link[[BR]] - Network[[BR]] - Path[[BR]] - Service[[BR]] + Varied by namespaces (extensibility)[[BR]] - Reuse visualization logic, etc.[[BR]] - Validate layer- or technology-specific attributes[[BR]] + Used by perfSONAR, IDC Protocol (ION, OSCARS, AutoBAHN), Phoebus[[BR]] - Currently calling it the UNIS Topology Schema [[BR]] + OGF NML-WG to unify NDL and UNIS Topology schema[[BR]] - Happening as we speak at OGF28[[BR]] LAMP objectives:[[BR]] + Collaborate on defining a common but extensible format for data storage and exchange for GENI I&M systems[[BR]] - Use perfSONAR NM-WG schema as starting point[[BR]] - Identify new characteristics/tools namespaces[[BR]] + Develop a representation of GENI topology to be used to describe measurements and experiment configuration[[BR]] - UNIS topology schema can be easily extended [[BR]] + Collaborate with related GENI measurement and security projects on a common GENI I&M architecture[[BR]] - The new GENI I&M Arch. 
Draft defines very similar services (MP, MC, MDA, MAP), and new ones (MO) [[BR]]
- perfSONAR is a good starting point, not currently a final solution (for GENI) [[BR]]
- Use cases have been different, but much can be reused and the framework can be extended [[BR]]

=== DatCat Project ===

4:35pm Brad Huffaker (CAIDA, UC San Diego Supercomputing Center) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/huffaker%20%20dat%20cat%20%20%20genie-gec07-201003.pdf slides] [[BR]]
Goal: [[BR]]
+ DatCat was designed to improve data sharing by providing a unified metadata database for Internet data. [[BR]]
+ Make the following easy for users: [[BR]]
- finding data sets of interest [[BR]]
- adding new data sets to the catalog [[BR]]
- annotating data sets in the catalog [[BR]]
+ DOES NOT store data [[BR]]
Database schema: [[BR]]
+ Collection [[BR]]
- logical group of files (paper, project, ...) [[BR]]
+ Data files [[BR]]
- raw data files (traceroutes, log dumps, ...) [[BR]]
+ Packages [[BR]]
- downloadable files (single file, tarball, ...) [[BR]]
+ Locations [[BR]]
- how to get the packages (URL, contact address, ...) [[BR]]
Annotations: [[BR]]
+ Provide an extensible name space for assigning domain-specific values to files. 
[[BR]] + each user has their own hierarchical name space [[BR]] - passive.IPv4.packet_count [[BR]] - active.RTT_95th_percentile [[BR]] + both data contributors and general DatCat users may attach annotations [[BR]] + any user may assign “note” annotations to any object [[BR]] Metadata fields: [[BR]] + collection [[BR]] - fields: name, contents, summary, motivation, creators/primary contact/contributor, start/end time, keywords, short description/description/description URL [[BR]] - annotations: note [[BR]] + data [[BR]] - fields: name, creators/primary contact/contributor, keywords,format, file size, start/end time, duration, geographic/network location, time zone, MD5, description, creation process [[BR]] - annotations: passive.IPv4.packet_count, passive.IPv4.TCP.dst.port_count, cfg.passive.capture_len, AS_count, active.trace_count, active.RTT_10th_percentile, ..... [[BR]] + location [[BR]] - fields: package, creators, primary contact, status, download procedure, download URL, geographic/logistic location, availability [[BR]] Submission tools: [[BR]] + Perl API [[BR]] - useful for integrating into existing data management systems [[BR]] - flexible, but need to write code: [[BR]] + subcat [[BR]] - different approach (declarative) [[BR]] - describe metadata in human-friendly text files (YAML) [[BR]] - CAIDA provides tools to extract additional metadata (data-to-yaml) [[BR]] - subcat intuitively joins information together [[BR]] DatCat web portal: [[BR]] + Browse collections [[BR]] + Search collections [[BR]] + Search data [[BR]] Lessons learned: [[BR]] + file-level metadata hard [[BR]] - hard to fix errors across thousands of files [[BR]] - hard to display thousands of files [[BR]] - hard to generate [[BR]] + submission process too cumbersome for most users [[BR]] - majority of metadata is shared between files, creator, creation process, location, etc [[BR]] - many researchers are not programmers [[BR]] - researchers have limited time and motivation [[BR]] + Lots of 
redundant information: [[BR]]
- For a single contribution, most data objects share identical metadata. [[BR]]
- could be solved by pushing subcat-type categories into the database [[BR]]
+ Move to stand-alone collections [[BR]]
- contributors will only need to fill in the collection information [[BR]]
- shorten search path from collection to locations [[BR]]
+ better to have lots of collections than lots of files [[BR]]

== GENI I&M Architecture ==

4:55pm Harry Mussman (GPO) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/031210%20%20IM-ARCH-GEC7Slides.pdf slides] [[BR]]
[http://groups.geni.net/geni/wiki/GeniInstrumentationandMeasurementsArchitecture GENI I&M Architecture document] (15min) [[BR]]
v0.1 DRAFT includes proposed I&M services and proposed configuration [[BR]]
Purpose: [[BR]]
+ Provide a comprehensive and ordered list of topics that must be addressed for a complete architecture [[BR]]
+ Identify the priority topics that the WG needs to address first [[BR]]
+ Pull together contributions by the WG through Spiral 2 [[BR]]
Plan: [[BR]]
+ Now: v0.1 DRAFT completed, by GPO; see http://groups.geni.net/geni/wiki/GeniInstrumentationandMeasurementsArchitecture [[BR]]
+ By GEC8: v0.5 draft, by GPO, with contributions from WG [[BR]]
+ By GEC9: v1.0 draft, reviewed by WG [[BR]]
Document outline: [[BR]]
1. Document Scope [[BR]]
2. Introduction [[BR]]
3. Definition and configuration of I&M services [[BR]]
4. Interfaces, protocols and schema for Measurement Data (MD) [[BR]]
5. Ownership of MD and privacy of owners [[BR]]
6. Interfaces, protocols and APIs for using I&M services [[BR]]
7. Basic GENI I&M use cases [[BR]]
8. MD transport via the GENI Measurement Plane [[BR]]
9. Discovery, authorization, assignment and binding of GENI I&M services [[BR]]
10. Measurement Orchestration (MO) service [[BR]]
11. Measurement Point (MP) [[BR]]
12. Time-stamping MD [[BR]]
13. Measurement Collection (MC) service [[BR]]
14. Measurement Analysis and Presentation (MAP) service [[BR]]
15. Measurement Data Archive (MDA) service [[BR]]
16. 
Additional GENI I&M use cases [[BR]]
Based on GENI I&M Capabilities Catalog (v0.1), these GENI projects have comprehensive, end-to-end capabilities: [[BR]]
+ OML (ORBIT Measurement Library) in OMF (ORBIT Mgmt Framework) (Ott, NICTA and Gruteser, WINLAB/Rutgers, 1660) [[BR]]
+ Instrumentation Tools (Griffioen, Univ Kentucky, 1642) [[BR]]
+ perfSONAR for network measurements (Zekauskas, I2 and Swany, Univ Delaware, 1788) [[BR]]
+ Scalable Sensing Service (Fahmy, Purdue and Sharma, HP Labs, 1723) [[BR]]
+ OnTimeMeasure (Calyam, Ohio Super Ctr, 1764) [[BR]]
After considering projects with comprehensive, end-to-end capabilities, here are five services they have in common: [[BR]]
+ Measurement Orchestration (MO) service (part of Experiment Control service, uses a language to orchestrate I&M services) [[BR]]
+ Measurement Point (MP) service (instrumentation that taps into a network and/or systems, links and/or nodes, to capture measurement data and format it using a standardized schema) [[BR]]
+ Measurement Collection (MC) service (programmable systems that collect, combine, transform and cache measurement data) [[BR]]
+ Measurement Analysis and Presentation (MAP) service (programmable systems that analyze and then present measurement data) [[BR]]
+ Measurement Data Archive (MDA) service (measurement data repository, index and portal) [[BR]]
Expected range of implementations: [[BR]]
+ Small-scale implementations might put all I&M services within one aggregate, and even in one server [[BR]]
- Interfaces between services would be internal to the aggregate, or even internal to the server [[BR]]
+ Large-scale implementations might have I&M services distributed over many aggregates [[BR]]
- Measurement data flowing between services [[BR]]
- Orchestration mechanisms based upon message exchanges [[BR]]
Discussion topics: [[BR]]
+ Are these five services a complete group of I&M services? [[BR]]
+ Are these good names for the five I&M services? [[BR]]
+ Is five the right granularity for I&M services? [[BR]]
+ Is this a complete and flexible configuration for I&M services? [[BR]]
+ Can this configuration accommodate the range from small-scale to large-scale implementations? 
[[BR]] + How can we obtain a consensus, so that we can set a firm foundation for the other topics? [[BR]]
Interfaces, protocols and schema for measurement data: [[BR]]
Issues: [[BR]]
+ This topic suggested at GEC6 meeting: Common schema for MD [[BR]]
+ Can we identify a common set of interfaces, protocols and schema for MD, or at least a limited number of types? [[BR]]
+ What needs to be included in the MD schema? [[BR]]
Approach: [[BR]]
+ Assume all MD after MPs follows this common set of interfaces, protocols and schema [[BR]]
+ Start with definition of MD schema [[BR]]
+ Next, understand [8. MD Transport via GENI Measurement Plane] [[BR]]
+ Then, complete first set of interfaces and protocols [[BR]]
From GENI I&M Capabilities Catalog (v0.1), these GENI projects (and others) are working on data schema and/or data archives: [[BR]]
+ perfSONAR for network measurements (Swany, Univ Delaware, 1788) [[BR]]
+ IMF project (Dutta, NC State, 1718) [[BR]]
+ Embedded Real-Time Measurements (Bergman, Columbia, 1631) [[BR]]
+ GENI Meta-Operations Center (Herron, Indiana Univ, 1604) [[BR]]
+ netKarma: GENI Provenance Registry (Plale and Small, Indiana Univ, 1706) [[BR]]
+ DatCat project at http://www.datcat.org/ (claffy, CAIDA) [[BR]]
+ Crawdad project at http://crawdad.cs.dartmouth.edu/ (Kotz, Dartmouth) [[BR]]
+ Amazon Simple Storage Service [[BR]]
+ Data-Intensive Cloud Control (Zink and Cecchet, UMass Amherst, 1709) [[BR]]
+ Experiment Mgmt System (Lannom and Manepalli, CNRI, 1663) [[BR]]
+ others? [[BR]]
What can we learn from these projects? [[BR]]
Discussion topics: [[BR]]
+ Standardized interfaces between measurement services [[BR]]
- Pt-to-pt vs pt-to-multipoint (e.g., pub/sub) [[BR]]
- Stream vs bulk transfer [[BR]]
- Disconnected operation expected, or not [[BR]]
+ Protocols for moving measurement data [[BR]]
- Streaming data [[BR]]
- Bulk transfer of data [[BR]]
+ Schema for measurement data [[BR]]
- Data record identifier [[BR]]
- Annotation, or meta data [[BR]]
- Data types and values, with timestamps [[BR]]
+ How can we obtain a consensus on a first set of interfaces/protocols/schema for MD? [[BR]]
+ What is the process for extending the set? [[BR]]
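
As a reference point for the schema discussion topics above (data record identifier; annotation or meta data; data types and values, with timestamps), here is a minimal Python sketch of what one MD record could look like. It is an illustration only, not a WG proposal or any project's format: the names `MeasurementRecord`, `Sample`, `record_id`, `metadata` and `samples`, the JSON encoding, and the use of Unix-epoch seconds for timestamps are all assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Union

# Hypothetical value type: the notes only say "data types and values".
Value = Union[int, float, str, bool]


@dataclass
class Sample:
    """One value with its timestamp (assumed: seconds since the Unix epoch)."""
    timestamp: float
    value: Value


@dataclass
class MeasurementRecord:
    """Hypothetical MD record: identifier, annotations (meta data), samples."""
    record_id: str
    metadata: Dict[str, str] = field(default_factory=dict)
    samples: List[Sample] = field(default_factory=list)

    def add_sample(self, timestamp: float, value: Value) -> None:
        """Append one timestamped value to the record."""
        self.samples.append(Sample(timestamp, value))

    def to_json(self) -> str:
        """Serialize with sorted keys so the text form is stable."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text: str) -> "MeasurementRecord":
        """Rebuild a record from the text produced by to_json()."""
        raw = json.loads(text)
        return MeasurementRecord(
            record_id=raw["record_id"],
            metadata=raw["metadata"],
            samples=[Sample(**s) for s in raw["samples"]],
        )
```

A record built with `MeasurementRecord("rec-1", {"tool": "iperf"})` and extended with `add_sample(...)` survives a round trip through `to_json()` and `from_json()`. The same text form could be sent one record at a time (streaming) or in batches (bulk transfer), which is one of the trade-offs the discussion topics raise.
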
GENI measurement plane: [[BR]]
Issue: [[BR]]
+ Need to understand how MD traffic flows are transported by the GENI Measurement Plane before the interfaces and protocols for MD can be fully defined [[BR]]
Approach: [[BR]]
+ Understand current view of GENI Control Plane and Experiment Plane [[BR]]
+ Consider options for GENI Measurement Plane to transport MD flows, using networks that implement GENI Control and Experiment Planes [[BR]]
Priority topics: [[BR]]
+ Common terminology; best granularity of functions [[BR]]
+ Measurement data schema; common after MPs, before and within MCs, MDAs; what is included in meta-data? [[BR]]
+ Measurement Plane; options; expect nodes with 3 or 2 NICs [[BR]]

== Next Steps for WG ==

5:10pm Bruce Maggs (Duke University) [[BR]]

== Wrap-up, review of action items and issues for plenary ==

5:25pm Harry Mussman (GPO) [[BR]]
[http://groups.geni.net/geni/attachment/wiki/Gec7InstMeasWGAgenda/031710%20WGInstAndMeas_Report.ppt slides] [[BR]]
5:30pm Adjourn [[BR]]
6:30pm BoF dinner, organized by Harry Mussman, location [http://www.ghgrestaurants.com/parizade/parizademaster.html Parizade Restaurant] [[BR]]