wiki:GENIMetaOps/DraftMonitoringMetrics

Version 3 (modified by chaos@bbn.com, 12 years ago)

Note: this is a draft recommendation of relations and metrics for operational monitoring.

Metrics to report as of GEC14 monitoring demo

This page gives examples of all the new metrics we want reported as of the GEC14 monitoring demo, and identifies who will report them.

Note: where parameters are reported with empty values, this is intended to show that SAs/AMs will eventually be required to provide those values, although I believe current SAs/AMs may not be able to provide them.

General principles and naming

Relations reported by various entities

Relations for aggregates

  • Primary element: an aggregate's relational data should be contained in an <aggregate> element, with the following mandatory attributes:
    • name: the name of the aggregate host (e.g. an FQDN)
    • type: the type of this aggregate. GMOC will probably need to be aware of all valid values of this attribute, though it may be able to do some useful things by default with values it does not yet know. In this example, we use the aggregate type values: myplc, foam.
    • location: the physical location containing this aggregate. Metadata about this location should be submitted to GMOC some other way (not within the <aggregate> element).
    • organization: the organization which primarily maintains this aggregate. Again, metadata about this organization should be submitted to GMOC some other way (not within the <aggregate> element).
  • Contents:
    • Each sliver which is defined on this aggregate should be named in a <sliver> element, which has mandatory attributes:
      • name: The aggregate's name for this sliver; this is unique at a particular point in time, but need not be unique over time.
      • uuid: The aggregate's UUID for this sliver; this is unique over time, but may also be "" if the aggregate does not yet define UUIDs for slivers.
      • created: The timestamp at which this sliver was created.
      • expires: The timestamp at which this sliver will expire.
      • slice_urn: The URN of the slice to which this sliver belongs; URNs are unique at a particular point in time, but need not be unique over time.
      • slice_uuid: The UUID of the slice to which this sliver belongs; UUIDs are unique over time, and are provided to the aggregate by the slice as part of the slice credential. This value may be "" if the aggregate does not yet support storing and reporting the slice UUID.
    • In addition, the <sliver> element can contain zero or more of the following elements:
      • A mapping between a sliver and a resource is defined by a <resource_mapping> element, which has the following mandatory attributes:
        • resource: The name of the resource onto which this sliver is mapped. This resource must be defined in a <resource> element, and should list this aggregate in its aggregate attribute.
        • name: The name of this sliver on that resource. This may be the same as, or different from, the name of the sliver on the aggregate. The purpose of this field is to allow the GMOC UI to map time series information reported by the resource back to the sliver, even if the resource is not GENI-aware and does not know the sliver's name.
        • type: The type of thing the sliver has been allocated on this resource. Again, GMOC should probably be aware of all valid values of this attribute. In this example, we use the resource_mapping types: vm.
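
The mandatory attributes listed above can be checked mechanically before a report is submitted. The following is a minimal Python sketch, not a GMOC tool: the MANDATORY table and the missing_attributes helper are hypothetical names introduced here, and the sample document is adapted from the hegen example on this page (attribute values quoted, and slice_uuid deliberately left out so the check has something to report).

```python
import xml.etree.ElementTree as ET

# Mandatory attributes per element, copied from the lists in this section.
MANDATORY = {
    "aggregate": ("name", "type", "location", "organization"),
    "sliver": ("name", "uuid", "created", "expires", "slice_urn", "slice_uuid"),
    "resource_mapping": ("resource", "name", "type"),
}

def missing_attributes(xml_text):
    """Return (tag, attribute) pairs for every mandatory attribute that is absent.

    An attribute that is present but empty (e.g. uuid="") counts as present,
    because the draft allows empty values for uuid and slice_uuid.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():  # iter() visits the root element as well
        for attr in MANDATORY.get(elem.tag, ()):
            if attr not in elem.attrib:
                problems.append((elem.tag, attr))
    return problems

# Sample report; slice_uuid is intentionally missing from the <sliver>.
EXAMPLE = """
<aggregate type="myplc" name="hegen.gpolab.bbn.com" location="gpolab" organization="BBN">
  <sliver name="pgenigpolabbbncom_plastic104" uuid="" created="1305301475"
          expires="1337137200"
          slice_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104">
    <resource_mapping resource="bain.gpolab.bbn.com" type="vm"
                      name="pgenigpolabbbncom_plastic104"/>
  </sliver>
</aggregate>
"""

print(missing_attributes(EXAMPLE))
```

The check only looks at attribute presence; it does not validate timestamps, URNs, or whether type values are known to GMOC.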

Relations for resources

  • Primary element: a resource's relational data should be contained in a <resource> element, with the following mandatory attributes:
    • name: the name of the resource (e.g. an FQDN)
    • type: the type of this resource. GMOC will probably need to be aware of all valid values of this attribute, though it may be able to do some useful things by default with values it does not yet know. In this example, we use the resource types: vmserver.
    • aggregate: what aggregate manager controls this resource? This aggregate must be defined by an <aggregate> element somewhere.
    • location: the physical location containing this resource. Metadata about this location should be submitted to GMOC some other way (not within the <resource> element).
    • organization: the organization which primarily maintains this resource. Again, metadata about this organization should be submitted to GMOC some other way (not within the <resource> element).
  • Each interface which is defined on this resource should be named in an <interface> element, which has the mandatory attribute:
    • name: the name of the interface on the resource. This must be unique on this resource.
  • The <interface> element also accepts the following optional attributes:
    • macaddr: the Ethernet address of this interface (if it has a unique Ethernet address)
    • vlan: the VLAN tag, or comma-separated list of tags, which the resource adds to traffic sent out this interface (if this is a VLAN subinterface or trunk interface)
    • parent: the parent interface name (if this is a virtual subinterface on the resource)
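
The interface rules above (unique names, and a parent attribute that names another interface) lend themselves to a simple consistency check. The sketch below is only an illustration: check_interfaces is a hypothetical helper, and the rule that a parent must itself be listed on the same resource is an assumption, since the draft does not state it explicitly. The sample data repeats the bain.gpolab.bbn.com interfaces from this page.

```python
import xml.etree.ElementTree as ET

def check_interfaces(resource_xml):
    """Return a list of human-readable problems found in a <resource> element."""
    resource = ET.fromstring(resource_xml)
    interfaces = resource.findall("interface")
    problems = []
    seen = set()
    # Rule from the draft: each interface name must be unique on the resource.
    for iface in interfaces:
        name = iface.get("name")
        if name is None:
            problems.append("interface without mandatory name attribute")
        elif name in seen:
            problems.append(f"duplicate interface name: {name}")
        else:
            seen.add(name)
    # Assumption (not stated in the draft): a parent must itself be listed.
    for iface in interfaces:
        parent = iface.get("parent")
        if parent is not None and parent not in seen:
            problems.append(f"{iface.get('name')}: unknown parent {parent}")
    return problems

RESOURCE_XML = """
<resource type="vmserver" name="bain.gpolab.bbn.com" aggregate="hegen.gpolab.bbn.com"
          location="gpolab" organization="BBN">
  <interface name="eth0" macaddr="00:1B:21:5F:8F:E0" />
  <interface name="eth1" macaddr="00:1B:21:5F:8F:E1" />
  <interface name="eth1.1750" vlan="1750" parent="eth1" />
  <interface name="eth1.1734" vlan="1734" parent="eth1" />
</resource>
"""

print(check_interfaces(RESOURCE_XML))
```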

Metrics reported for PlanetLab AMs

Host: hegen.gpolab.bbn.com (GPO production MyPLC):

  • Relations:
    <aggregate type="myplc" name="hegen.gpolab.bbn.com" location="gpolab" organization="BBN">
      <sliver name="pgenigpolabbbncom_plastic104" uuid="" created="1305301475" expires="1337137200" slice_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104" slice_uuid="">
        <resource_mapping resource="bain.gpolab.bbn.com" type="vm" name="pgenigpolabbbncom_plastic104"/>
        <resource_mapping resource="navis.gpolab.bbn.com" type="vm" name="pgenigpolabbbncom_plastic104"/>
      </sliver>
    </aggregate>
    
  • Time series values:
    Type Name | Tags   | Columns  | Units
    cpu_info  | (none) | cpu_idle | percent

Host: bain.gpolab.bbn.com (GPO production plnode):

  • Relations:
    <resource type="vmserver" name="bain.gpolab.bbn.com" aggregate="hegen.gpolab.bbn.com" location="gpolab" organization="BBN">
      <interface name="eth0" macaddr="00:1B:21:5F:8F:E0" />
      <interface name="eth1" macaddr="00:1B:21:5F:8F:E1" />
      <interface name="eth1.1750" vlan="1750" parent="eth1" />
      <interface name="eth1.1734" vlan="1734" parent="eth1" />
    </resource>
    
  • Time series values:
    Type Name    | Tags                                | Columns                                               | Units
    cpu_info     | (none)                              | cpu_idle                                              | percent
    network_info | interface:eth0                      | rx_packets_sec,rx_bits_sec,tx_packets_sec,tx_bits_sec | pps,bps,pps,bps
    network_info | sliver:pgenigpolabbbncom_plastic104 | rx_packets_sec,rx_bits_sec,tx_packets_sec,tx_bits_sec | pps,bps,pps,bps
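
The name attribute of <resource_mapping> exists so that time series reported by a resource can be mapped back to a sliver. A minimal Python sketch of that lookup follows. It assumes tags have the form kind:value, as in the sliver: and interface: tags on this page; build_sliver_index and resolve_tag are hypothetical helper names, not part of any GMOC interface.

```python
import xml.etree.ElementTree as ET

AGGREGATE_XML = """
<aggregate type="myplc" name="hegen.gpolab.bbn.com" location="gpolab" organization="BBN">
  <sliver name="pgenigpolabbbncom_plastic104" uuid="" created="1305301475"
          expires="1337137200"
          slice_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104"
          slice_uuid="">
    <resource_mapping resource="bain.gpolab.bbn.com" type="vm"
                      name="pgenigpolabbbncom_plastic104"/>
    <resource_mapping resource="navis.gpolab.bbn.com" type="vm"
                      name="pgenigpolabbbncom_plastic104"/>
  </sliver>
</aggregate>
"""

def build_sliver_index(aggregate_xml):
    """Map (resource name, sliver's name on that resource) to sliver details."""
    aggregate = ET.fromstring(aggregate_xml)
    index = {}
    for sliver in aggregate.findall("sliver"):
        info = {
            "aggregate": aggregate.get("name"),
            "sliver": sliver.get("name"),
            "slice_urn": sliver.get("slice_urn"),
        }
        for mapping in sliver.findall("resource_mapping"):
            index[(mapping.get("resource"), mapping.get("name"))] = info
    return index

def resolve_tag(index, reporting_host, tag):
    """Return sliver details for a 'sliver:<name>' tag, or None if it does not resolve."""
    kind, _, value = tag.partition(":")
    if kind != "sliver":
        return None  # e.g. 'interface:eth0' is not a sliver tag
    return index.get((reporting_host, value))

index = build_sliver_index(AGGREGATE_XML)
print(resolve_tag(index, "bain.gpolab.bbn.com", "sliver:pgenigpolabbbncom_plastic104"))
print(resolve_tag(index, "bain.gpolab.bbn.com", "interface:eth0"))
```

Keying the index on the pair (resource, name) rather than on the name alone reflects the statement above that a sliver's name on a resource may differ from its name on the aggregate.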