This draft of an MDOD is an interpretation based on the initial draft MDOD presented by Harry Mussmann at GEC11, plus modifications suggested by IU. This draft version has also been influenced by comments on the MDOD from the Instrumentation and Measurement group at GEC11, draft MDODs by Jason Zurawski, and documentation from the IF-MAP group. Version 0.2 of this schema has also been influenced by discussions at GEC13 with the Instrumentation and Measurement group, as well as general feedback and a summary of MDOD status and issues by Giridhar Manepalli (CNRI) at GEC13, and by the NetKarma provenance repository and schema.

author: Scott Jensen, Indiana University

Should the MDOD be a single measurement stream (e.g., from a single MP), a collection of measurements, or measurements plus derived products such as analyses or presentations prepared from transformations of the measurement data? Initial drafts described the MDOD as all of the measurements for an experiment. This draft assumes that the MDOD is a collection of measurements, provenance, and derived products or transformations of the measurement data that describe an experiment or a related set of experiments. There could be multiple data objects in the MDOD that are derived from subsets of the measurement data, where some subsets overlap, but the relationships between data objects are not strictly a tree (i.e., an MDO could have multiple results derived from it within the MDOD).

This schema currently has 5 top-level elements, but only minimal identification is required:
identification: This is the identification for the MDOD as a collection.
provenance: This is an OPM provenance graph, such as the graph generated by NetKarma for an experiment. For an aggregate this can represent how the MDOD itself is created.
security: Optional element for setting policies. Policies can also be set at the underlying dataDescriptors or inherited there from the MDOD level.
dataDescriptor: There would be a separate instance of this element for each MDO or derived data product described within the MDOD.
mdodReference: An MDOD could include other MDODs by reference, either locally or based on a URL.

When an MDOD represents a bundle that is archived and shared, it would be assigned a DOI identifier. Otherwise it can have an internal ID. For an internal ID, the format could be required to be consistent with the draft at GEC11, where an ID follows the format: domain:subdomain+object_type+object_name

The dataDescriptor is for measurement or other data objects local to the MDOD that are not stored or accessible from their own MDOD. The data descriptor could represent data stored in another location. The dataDescriptor maps to the descriptor in the draft MDOD version 0.2.1. All of the content of the descriptorSecurity element is optional. Should there instead be only nested MDODs?

Both the MDOD itself and the descriptor have identification elements. The identification section within the data descriptor pertains to a single data object, whereas the MDOD identification section relates to the MDOD as a whole, which can represent a set of measurements and derived data products. Since the dataDescriptor's identification should be populated automatically if possible (users do not enter metadata), the abstract and subject are moved to the MDOD level and eliminated here.

The original MDOD kept the type and value separate, with path, url, and other as the types and text for the value. Here they are a choice, and the type is indicated by which element is used.

Contact was made optional. If the data being described is local, the contact could be redundant.

Is the scope necessary? Would a path be local, and any other value external? Are locators other than paths or URLs needed?

The original MDOD schema had three options for the scope of the locator: global, per_association, and within_holder. Are these needed?
If the locator is path based it is local, and if it is URL based it would be global. Is there a need for other alternatives such as association?

In this draft the policy and method elements are sourced strings. This approach would accommodate standardized policies within GENI that could be specified based on a controlled vocabulary, but the ability to express more complex policies may be desirable.

The MDOD can describe both measurements and transformations such as the analysis or a presentation generated from the measurement data, or even an external publication of the results. If these different types are described by fundamentally different metadata, the dataDescription should contain additional alternatives other than the measurement event or analysis event.

A prior version of the measurement event was based on the MDOD version 0.2.1 draft discussed at GEC11. There was discussion as to whether it should be extended to capture different measurement tools and vendor extensions that allow for future development. Initial versions included more detailed elements such as flowrate and size. The MDOD and dataDescriptor should describe what was captured, not the measurements themselves, so do we need that level of detail?

The interpretation method is included as a string based on a controlled vocabulary, which is the source. Do we need to extend this further to be machine readable? Would the interpretation method need to be able to specify configuration parameters?

The analysis event would capture derived products such as an analysis based on measurement data or a presentation.

The path-based reference should include the ID of the referenced MDOD.

Is there an ID available to identify users in GENI? Should there be a project ID or other association? Do we need an enum for the type of contact (e.g., user, operator, aggregate provider)?

Policies in the draft MDOD from GEC11 had an enum as to whether there is a policy and an accompanying optional description.
Instead of a description, should users be able to provide a URL where the policy is located?

A URL to the policy is used instead of the policy itself. A version element is included in case a policy URL only reflects the current policy, or contains multiple versions of a policy.

The sourced string type allows keywords, data types, and other elements in the schema to be specified based on controlled vocabularies created by communities internal or external to GENI. To the extent that a value is based on a controlled vocabulary, the defining source, preferably a resolvable URI, would be included. If the value is not based on a controlled vocabulary, should the source attribute instead be optional, or should the value "NONE" be used as the source?
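
The two alternatives raised for the sourced string type can be compared with a minimal sketch. It is not part of the schema. The class name SourcedString, its methods, and the example vocabulary URL are hypothetical and chosen only for illustration. The sketch treats an omitted source and the sentinel "NONE" as equivalent, and its URI check is purely syntactic: it does not test whether the URI actually resolves.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# Sentinel taken from the open question in the text; whether the schema
# adopts it is undecided.
NO_SOURCE = "NONE"


@dataclass(frozen=True)
class SourcedString:
    """Hypothetical model of a value plus the vocabulary that defines it."""

    value: str
    # None models the "optional attribute" alternative.
    source: Optional[str] = None

    def has_vocabulary(self) -> bool:
        # True only when a source is given and it is not the sentinel.
        return self.source is not None and self.source != NO_SOURCE

    def source_is_uri(self) -> bool:
        # Syntactic check only: the source has a scheme and a host part.
        if not self.has_vocabulary():
            return False
        parts = urlparse(self.source)
        return bool(parts.scheme and parts.netloc)


# Hypothetical values for illustration.
keyword = SourcedString("throughput", "http://example.org/vocab/measurement")
free_text = SourcedString("ad hoc note", NO_SOURCE)
omitted = SourcedString("ad hoc note")
```

Under either alternative a consumer ends up making the same test, so the choice mostly affects validation: a required attribute with "NONE" keeps every instance uniform, while an optional attribute avoids reserving a magic value.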