wiki:GENIMetaOps/DraftMonitoringMetrics

Version 16 (modified by chaos@bbn.com, 8 years ago)

add URN field for SAs and AMs

Note: this is a draft recommendation of relations and metrics for operational monitoring.

Metrics to report as of GEC14 monitoring demo

This page gives examples of all the new metrics we want reported as of the GEC14 monitoring demo, and names who will report each of them.

Note: where attributes are reported with empty values, the intent is to show that SAs/AMs will eventually be required to provide those values, though I believe current SAs/AMs may not yet be able to provide the information.

General principles and naming

Relations reported by various entities

Relations for slice authorities

  • Primary element: an SA's relational data should be contained in an <sa> element, with the following mandatory attributes:
    • name: the name of the SA host (e.g. an FQDN)
    • urn: the GENI URN of this SA
    • type: what type of SA is this? SA functionality is the same regardless of what type of software implements the SA, so this piece of information is just for reference while debugging.
    • version: the version of the SA software this SA is running. The meaning of this value depends on the SA.
    • location: the physical location containing this SA. Metadata about this location should be submitted to GMOC some other way (not within the <sa> element).
    • organization: the organization which primarily maintains this SA. Again, metadata about this organization should be submitted to GMOC some other way (not within the <sa> element).
  • Contents:
    • Each slice which is defined on this SA should be named in a <slice> element, which has mandatory attributes:
      • name: the URN of the slice, which should be unique in GENI at a given time
      • uuid: the UUID of the slice, which should be unique for this particular slice authority across time
      • created: the time when this slice was created
      • expires: the time when this slice will expire
      • creator_urn: the URN of the user who created the slice
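
The mandatory attributes listed above can be checked mechanically before a report is submitted. The following is a minimal sketch in Python, not part of any GMOC tooling: the names MANDATORY, missing_attributes and EXAMPLE_SA, and the sa.example.net values, are all made up for illustration. It treats an empty value such as version="" as present, in line with the note about empty values near the top of this page.

```python
import xml.etree.ElementTree as ET

# Mandatory attributes as listed on this page. The dictionary and function
# names are illustrative assumptions, not part of any GMOC tool.
MANDATORY = {
    "sa": ("name", "urn", "type", "version", "location", "organization"),
    "slice": ("name", "uuid", "created", "expires", "creator_urn"),
}


def missing_attributes(xml_text):
    """Return (tag, element name, missing attribute) triples for an <sa> report."""
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():
        for attr in MANDATORY.get(elem.tag, ()):
            # An empty value such as version="" still counts as present.
            if attr not in elem.attrib:
                problems.append((elem.tag, elem.get("name", ""), attr))
    return problems


# Hypothetical SA used only for illustration; the second slice omits "expires".
EXAMPLE_SA = """
<sa name="sa.example.net" urn="urn:publicid:IDN+sa.example.net"
    type="protogeni" version="" location="examplelab" organization="Example">
  <slice name="urn:publicid:IDN+sa.example.net+slice+demo1"
         uuid="00000000-0000-0000-0000-000000000001"
         created="1305261529" expires="1337083200"
         creator_urn="urn:publicid:IDN+sa.example.net+user+alice" />
  <slice name="urn:publicid:IDN+sa.example.net+slice+demo2"
         uuid="00000000-0000-0000-0000-000000000002"
         created="1305261529"
         creator_urn="urn:publicid:IDN+sa.example.net+user+alice" />
</sa>
"""

for problem in missing_attributes(EXAMPLE_SA):
    print(problem)
```

The same table-driven check could be extended to <aggregate>, <sliver> and <resource> by adding their attribute lists to MANDATORY.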

Relations for aggregates

  • Primary element: an aggregate's relational data should be contained in an <aggregate> element, with the following mandatory attributes:
    • name: the name of the aggregate host (e.g. an FQDN)
    • urn: the GENI URN of this AM
    • type: what type of aggregate is this? GMOC will probably need to be aware of all valid values of this attribute, though it may be able to do some useful things by default with values it does not recognize. In the examples on this page, we use the aggregate type values: myplc, foam.
    • version: the version of the AM software this AM is running. The meaning of this value depends on the AM type.
    • location: the physical location containing this aggregate. Metadata about this location should be submitted to GMOC some other way (not within the <aggregate> element).
    • organization: the organization which primarily maintains this aggregate. Again, metadata about this organization should be submitted to GMOC some other way (not within the <aggregate> element).
  • Contents:
    • Each sliver which is defined on this aggregate should be named in a <sliver> element, which has mandatory attributes:
      • name: The aggregate's name for this sliver; this is unique at a particular point in time, but need not be unique over time.
      • uuid: The aggregate's UUID for this sliver; this is unique over time, but may also be "" if the aggregate does not yet define UUIDs for slivers.
      • created: The timestamp at which this sliver was created.
      • expires: The timestamp at which this sliver will expire.
      • approved: has this sliver been administratively approved? (This only makes sense for aggregates which have a concept of administrative approval of resource requests.)
      • state: What is the state of the sliver? (use a few "GENI AM API v3" options as valid values)
      • slice_urn: The URN of the slice to which this sliver belongs; URNs are unique at a particular point in time, but need not be unique over time.
      • slice_uuid: The UUID of the slice to which this sliver belongs; UUIDs are unique over time, and are provided to the aggregate by the slice as part of the slice credential. This value may be "" if the aggregate does not yet support storing and reporting the slice UUID.
      • creator_urn: the URN of the user who created the sliver. If this is empty, it can probably be assumed to be the same as the slice creator.
    • In addition, the <sliver> element can contain zero or more of the following elements:
      • A mapping between a sliver and a resource is defined by a <resource_mapping> element, which has the following mandatory attributes:
        • resource: The name of the resource to which this sliver is mapped. The resource must be defined in a <resource> element, and its aggregate attribute should name this aggregate.
        • name: The name of this sliver on that resource. This may be the same as or different from the name of the sliver on the aggregate. The purpose of this field is to allow the GMOC UI to map time series information reported by the resource back to the sliver, even if the resource is not GENI-aware and does not know the sliver's name.
        • type: The type of thing the sliver has been allocated on this resource. Again, GMOC should probably be aware of all valid values of this attribute. In the examples on this page, we use the resource_mapping types: vm, flowspace
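
The cross-reference rule for <resource_mapping> (the named resource must be defined in a <resource> element whose aggregate attribute names this aggregate) can be sketched as a small consistency check. This is only an illustration under stated assumptions: the function name unresolved_mappings and all example.net names are hypothetical, and the sketch assumes the aggregate report and the resource reports are available together as XML strings.

```python
import xml.etree.ElementTree as ET


def unresolved_mappings(aggregate_xml, resource_xmls):
    """Return (sliver name, resource name) pairs whose mapping cannot be resolved.

    A mapping is resolved when the named resource is defined and its
    aggregate attribute equals the name of the reporting aggregate.
    """
    aggregate = ET.fromstring(aggregate_xml)
    # Map each reported resource name to the aggregate it claims to belong to.
    owners = {}
    for text in resource_xmls:
        resource = ET.fromstring(text)
        owners[resource.get("name")] = resource.get("aggregate")
    problems = []
    for sliver in aggregate.iter("sliver"):
        for mapping in sliver.iter("resource_mapping"):
            target = mapping.get("resource")
            if target not in owners or owners[target] != aggregate.get("name"):
                problems.append((sliver.get("name"), target))
    return problems


# Hypothetical reports used only for illustration; node2 is never defined.
AGGREGATE = """
<aggregate name="am.example.net" urn="urn:publicid:IDN+am.example.net" type="myplc"
           version="" location="examplelab" organization="Example">
  <sliver name="demo_sliver" uuid="" created="1305301475" expires="1337137200"
          approved="true" state="running"
          slice_urn="urn:publicid:IDN+sa.example.net+slice+demo1" slice_uuid=""
          creator_urn="">
    <resource_mapping resource="node1.example.net" type="vm" name="demo_sliver" />
    <resource_mapping resource="node2.example.net" type="vm" name="demo_sliver" />
  </sliver>
</aggregate>
"""
RESOURCES = [
    '<resource name="node1.example.net" type="vmserver" aggregate="am.example.net" '
    'location="examplelab" organization="Example" description="" />',
]

print(unresolved_mappings(AGGREGATE, RESOURCES))
```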

Relations for resources

  • Primary element: a resource's relational data should be contained in a <resource> element, with the following mandatory attributes:
    • name: the name of the resource (e.g. an FQDN)
    • description: a text description of the resource (primarily to be used if the resource's name is something other than an FQDN, to describe what kind of thing it is)
    • type: what type of resource is this? GMOC will probably need to be aware of all valid values of this attribute, though it may be able to do some useful things by default with values it does not recognize. In the examples on this page, we use the resource types: vmserver, datapath, host
    • aggregate: what aggregate manager controls this resource? This aggregate must be defined by an <aggregate> element somewhere.
    • location: the physical location containing this resource. Metadata about this location should be submitted to GMOC some other way (not within the <resource> element).
    • organization: the organization which primarily maintains this resource. Again, metadata about this organization should be submitted to GMOC some other way (not within the <resource> element).
  • Each interface which is defined on this resource should be named in an <interface> element, which has mandatory attributes:
    • name: the name of the interface on the resource. Must be unique on this resource.
  • The <interface> element also has the following optional attributes:
    • macaddr: the ethernet address of this interface (if it has a unique ethernet address)
    • vlan: the VLAN tag, or comma-separated list of tags, which the resource adds to traffic sent out this interface (if this is a VLAN subinterface or trunk interface)
    • parent: the parent interface name (if this is a virtual subinterface on the resource)
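
The interface rules above (names unique per resource, parent naming another interface on the same resource, vlan holding one tag or a comma-separated list) lend themselves to a quick sanity check. The sketch below is illustrative only: interface_problems, vlan_tags and the example.net resource are assumed names, and treating a parent that is not reported on the same resource as a problem is an interpretation of the text, not something this page mandates.

```python
import xml.etree.ElementTree as ET


def vlan_tags(value):
    """Split a vlan attribute ("1750" or "10,11") into a list of integers."""
    return [int(tag) for tag in value.split(",")] if value else []


def interface_problems(resource_xml):
    """Report duplicate interface names and parents that are not defined."""
    resource = ET.fromstring(resource_xml)
    interfaces = list(resource.iter("interface"))
    problems = []
    seen = set()
    for iface in interfaces:
        name = iface.get("name")
        if name in seen:
            problems.append(f"duplicate interface name: {name}")
        seen.add(name)
    for iface in interfaces:
        parent = iface.get("parent")
        # Assumption: a parent should itself be reported on this resource.
        if parent is not None and parent not in seen:
            problems.append(f"{iface.get('name')}: unknown parent {parent}")
    return problems


# Hypothetical resource used only for illustration; eth2 is never defined.
RESOURCE = """
<resource name="node1.example.net" type="vmserver" aggregate="am.example.net"
          location="examplelab" organization="Example" description="">
  <interface name="eth0" macaddr="00:00:5E:00:53:01" />
  <interface name="eth1" macaddr="00:00:5E:00:53:02" />
  <interface name="eth1.1750" vlan="1750" parent="eth1" />
  <interface name="eth2.10" vlan="10,11" parent="eth2" />
</resource>
"""

print(interface_problems(RESOURCE))
print(vlan_tags("10,11"))
```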

Metrics reported for slice authorities

Reporter: boss.pgeni.gpolab.bbn.com (GPO production ProtoGENI SA):

  • Relations:
    <sa name="pgeni.gpolab.bbn.com" urn="urn:publicid:IDN+pgeni.gpolab.bbn.com" type="protogeni" version="" location="gpolab" organization="BBN">
      <slice name="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104" uuid="18aaf8f4-7d77-11e0-b70c-000c29f89f7b" created="1305261529" expires="1337083200" creator_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+user+jbs" />
    </sa>
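
The created and expires values in this example look like Unix epoch seconds. This page does not say so explicitly, so treat that as an assumption. Under that assumption, a reader can convert them to UTC timestamps with a short Python sketch (the helper name epoch_to_utc is made up for illustration):

```python
from datetime import datetime, timezone


def epoch_to_utc(value):
    """Convert an epoch-seconds attribute value to an ISO 8601 UTC string.

    Assumption: created/expires hold Unix epoch seconds; this page does not
    state the timestamp format explicitly.
    """
    return datetime.fromtimestamp(int(value), tz=timezone.utc).isoformat()


print(epoch_to_utc("1305261529"))  # created value from the example slice
print(epoch_to_utc("1337083200"))  # expires value from the example slice
```

Read this way, the example slice was created on 13 May 2011 and expires on 15 May 2012.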
    

Metrics reported for PlanetLab AMs

Reporter: hegen.gpolab.bbn.com (GPO production MyPLC):

  • Relations:
    <aggregate type="myplc" name="hegen.gpolab.bbn.com" version="" location="gpolab" organization="BBN">
      <sliver name="pgenigpolabbbncom_plastic104" uuid="" created="1305301475" expires="1337137200" creator_urn="" slice_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104" slice_uuid="" state="running">
        <resource_mapping resource="bain.gpolab.bbn.com" type="vm" name="pgenigpolabbbncom_plastic104"/>
        <resource_mapping resource="navis.gpolab.bbn.com" type="vm" name="pgenigpolabbbncom_plastic104"/>
      </sliver>
    </aggregate>
    
  • Time series values:
Type Name | Tags | Columns | Units
cpu_info | (none) | cpu_idle | percent

Reporter: bain.gpolab.bbn.com (GPO production plnode):

  • Relations:
    <resource type="vmserver" name="bain.gpolab.bbn.com" aggregate="hegen.gpolab.bbn.com" location="gpolab" organization="BBN" description="">
      <interface name="eth0" macaddr="00:1B:21:5F:8F:E0" />
      <interface name="eth1" macaddr="00:1B:21:5F:8F:E1" />
      <interface name="eth1.1750" vlan="1750" parent="eth1" />
      <interface name="eth1.1734" vlan="1734" parent="eth1" />
    </resource>
    
  • Time series values:
Type Name | Tags | Columns | Units
cpu_info | (none) | cpu_idle | percent
network_info | interface:eth0 | rx_packets_sec,rx_bits_sec,tx_packets_sec,tx_bits_sec | pps,bps,pps,bps
network_info | sliver:pgenigpolabbbncom_plastic104 | rx_packets_sec,rx_bits_sec,tx_packets_sec,tx_bits_sec | pps,bps,pps,bps
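
In the time series rows on this page, the Columns and Units cells are parallel comma-separated lists, and each tag appears to have the form kind:name (for example interface:eth0). A minimal Python sketch of how a consumer might pair them up follows. The function names are hypothetical, and the kind:name reading of tags is inferred from the rows here, not defined anywhere on this page.

```python
def split_tag(tag):
    """Split a tag such as "interface:eth0" at the first colon."""
    kind, _, name = tag.partition(":")
    return kind, name


def pair_columns_with_units(columns, units):
    """Pair each column name with the unit in the same position."""
    names = columns.split(",")
    unit_list = units.split(",")
    if len(names) != len(unit_list):
        raise ValueError(f"{len(names)} columns but {len(unit_list)} units")
    return dict(zip(names, unit_list))


print(split_tag("interface:eth0"))
print(pair_columns_with_units(
    "rx_packets_sec,rx_bits_sec,tx_packets_sec,tx_bits_sec",
    "pps,bps,pps,bps",
))
```

Splitting at the first colon only keeps names that themselves contain colons intact.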

Metrics reported for FOAM AMs

Reporter: tulum.gpolab.bbn.com (GPO production FOAM):

  • Relations:
    <aggregate type="foam" name="tulum.gpolab.bbn.com" location="gpolab" organization="BBN">
      <sliver name="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104:37133631-1210-4787-91d3-7e4dfba7cce1" uuid="37133631-1210-4787-91d3-7e4dfba7cce1" created="" expires="1337137200" slice_urn="urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+plastic-104" slice_uuid="" creator_urn="" approved="true" state="enabled">
        <resource_mapping resource="06:d6:00:24:a8:c4:b9:00" type="flowspace" name="37133631-1210-4787-91d3-7e4dfba7cce1"/>
      </sliver>
    </aggregate>
    
  • Time series values:
Type Name | Tags | Columns | Units
cpu_info | (none) | cpu_idle | percent

Reporter: 06:d6:00:24:a8:c4:b9:00 (GPO OpenFlow-controlled datapath):

  • Relations:
    <resource type="datapath" name="06:d6:00:24:a8:c4:b9:00" aggregate="tulum.gpolab.bbn.com" location="gpolab" organization="BBN" description="habanero.gpolab.bbn.com[vl1750]">
    </resource>
    
  • Time series values:
Type Name | Tags | Columns | Units
of_ctrl_network_info | (none) | rx_messages_sec,tx_messages_sec | mps,mps
of_ctrl_network_info | sliver:37133631-1210-4787-91d3-7e4dfba7cce1 | rx_messages_sec,tx_messages_sec | mps,mps

Metrics reported for a non-GENI test resource

Reporter: iolkos.gpolab.bbn.com (GPO dedicated network pingtest host):

  • Relations:
    <resource type="host" name="iolkos.gpolab.bbn.com" aggregate="" location="gpolab" organization="BBN" description="">
      <interface name="eth0" macaddr="BC:30:5B:D1:A3:E9" />
      <interface name="eth1" macaddr="BC:30:5B:D1:A3:EA" />
      <interface name="eth1.3745" vlan="3745" parent="eth1" />
      <interface name="eth1.3746" vlan="3746" parent="eth1" />
    </resource>
    

Metrics reported by an end-to-end test

Reporter: ashur.gpolab.bbn.com (AM test host):

  • Relations: none
  • Time series values:
Type Name | Tags | Columns | Units
myplc_am_status | aggregate:hegen.gpolab.bbn.com | getversion,listresources | nagios_health,nagios_health