Data needed to meet operational monitoring use cases
Data to meet use cases
The details of which data types are needed to meet which use cases, why that data was proposed, and any open questions about the data, are now listed at OperationalMonitoring/DataForUseCases.
Proposed data schema
I propose a schema based on, and partially compatible with, http://unis.incntre.iu.edu/schema/20120709/, for measurement data and metadata. In particular, I suggest using:
- http://www.gpolab.bbn.com/monitoring/schema/20140131/opsconfig: (inherits from http://unis.incntre.iu.edu/schema/20120709/domain): operational configuration metadata, including locations of and information about other datastores
- http://www.gpolab.bbn.com/monitoring/schema/20140131/aggregate: (inherits from http://unis.incntre.iu.edu/schema/20120709/domain): metadata about an aggregate (including lists of resources and slivers at that aggregate, and where to look for measurements about the aggregate)
- http://unis.incntre.iu.edu/schema/20120709/node: metadata about a resource (including resource properties termed "config" above, and a list of relevant ports (or of all ports, whichever is easier) on that resource)
- http://unis.incntre.iu.edu/schema/20120709/port: metadata about a network interface (including resource properties termed "config" above)
- http://www.gpolab.bbn.com/monitoring/schema/20140131/sliver: (inherits from http://unis.incntre.iu.edu/schema/20120709/networkresource): metadata about a sliver at an aggregate, including resources to which that sliver is mapped
- http://www.gpolab.bbn.com/monitoring/schema/20140131/authority: (inherits from http://unis.incntre.iu.edu/schema/20120709/domain): metadata about a GENI authority (including lists of GENI users and slivers at that authority)
- http://www.gpolab.bbn.com/monitoring/schema/20140131/slice: (inherits from http://unis.incntre.iu.edu/schema/20120709/networkresource): metadata about a slice at an authority, including GENI users with roles on that slice
- http://www.gpolab.bbn.com/monitoring/schema/20140131/user: (inherits from http://unis.incntre.iu.edu/schema/20120709/networkresource): metadata about a GENI user at an authority, including contact information
- http://www.gpolab.bbn.com/monitoring/schema/20140131/data: a novel schema based on a combination of http://unis.incntre.iu.edu/schema/20120709/metadata and http://unis.incntre.iu.edu/schema/20120709/tsdatum, containing both metadata and data about measurements
- For now, we would use the ops_monitoring namespace for operations monitoring, meaning:
  - When we add monitoring-relevant optional properties to objects, we'll put them in an ops_monitoring dictionary
  - When we set up operations monitoring measurements, we'll give them the eventType ops_monitoring:<something>
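The namespace convention above can be illustrated with a short Python sketch. The helper names `add_ops_property` and `ops_event_type` are hypothetical (they are not part of any schema or library); the sketch only shows where the ops_monitoring dictionary and the eventType prefix would sit in an object.

```python
OPS_NAMESPACE = "ops_monitoring"


def add_ops_property(obj, key, value):
    """Store a monitoring-relevant optional property under properties.ops_monitoring."""
    props = obj.setdefault("properties", {})
    props.setdefault(OPS_NAMESPACE, {})[key] = value
    return obj


def ops_event_type(name):
    """Build an eventType string in the ops_monitoring namespace."""
    return f"{OPS_NAMESPACE}:{name}"


# Hypothetical usage on a node object (ID borrowed from the examples on this page).
node = {"id": "instageni.gpolab.bbn.com_node_pc1"}
add_ops_property(node, "mem_total_kb", 50331648)
print(node["properties"])
print(ops_event_type("cpu_utilization"))
```

Keeping all monitoring-specific keys in one sub-dictionary means objects should remain readable by consumers that only know the base schema.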
Data schema usage example
Some example usages of the above schemas to encode metadata and data needed for use cases 3 and 6 follow.
These examples assume the following (fictitious) local datastore URLs. These are arbitrary, and any place they appear, they can be replaced by whatever name the deployers prefer. For simplicity, I've shown one local datastore per aggregate here, but the architecture does not require that: it would be perfectly fine to have one datastore for relational metadata and one for the data itself, or one for certain types of relational metadata and one for others.
- https://datastore.geni.net/: datastore containing configuration data for operational monitoring
- https://datastore.instageni.gpolab.bbn.com/: datastore for gpo-ig aggregate
- https://datastore.ch.geni.net/: datastore for ch.geni.net authority
Data about operational monitoring configuration
Operational monitoring configuration data tells aggregators where to find local datastores, and includes relevant metadata (like URNs and metadata about datastore or aggregate types) that aggregators can use to decide which datastores to query. It is described using the opsconfig schema. Examples:
- geni-prod: a hypothetical config datastore listing production aggregates and authorities:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/opsconfig#", "id": "geni-prod", "selfRef": "https://datastore.geni.net/opsconfigs/geni-prod", "ts": 1391192685740849, "aggregates": [ { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+authority+cm", "amtype": "protogeni", "href": "https://datastore.instageni.gpolab.bbn.com/aggregates/gpo-ig" } ], "authorities": [ { "urn": "urn:publicid:IDN+ch.geni.net+authority+ch", "href": "https://datastore.ch.geni.net/authorities/ch.geni.net" } ] }
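As a rough illustration of how an aggregator might use this configuration to decide which datastores to query, here is a minimal Python sketch. It parses a trimmed copy of the geni-prod example; the function names `aggregate_hrefs` and `authority_hrefs` are hypothetical, and a real aggregator would first fetch the document over HTTPS rather than embed it.

```python
import json

# Trimmed copy of the geni-prod opsconfig example above.
OPSCONFIG_JSON = """
{
  "id": "geni-prod",
  "aggregates": [
    {
      "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+authority+cm",
      "amtype": "protogeni",
      "href": "https://datastore.instageni.gpolab.bbn.com/aggregates/gpo-ig"
    }
  ],
  "authorities": [
    {
      "urn": "urn:publicid:IDN+ch.geni.net+authority+ch",
      "href": "https://datastore.ch.geni.net/authorities/ch.geni.net"
    }
  ]
}
"""


def aggregate_hrefs(opsconfig, amtype=None):
    """Return hrefs of the listed aggregates, optionally only those of one amtype."""
    return [
        entry["href"]
        for entry in opsconfig.get("aggregates", [])
        if amtype is None or entry.get("amtype") == amtype
    ]


def authority_hrefs(opsconfig):
    """Return hrefs of the listed authorities."""
    return [entry["href"] for entry in opsconfig.get("authorities", [])]


opsconfig = json.loads(OPSCONFIG_JSON)
print(aggregate_hrefs(opsconfig, amtype="protogeni"))
print(authority_hrefs(opsconfig))
```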
Data about an aggregate
Aggregates are indexed by GENI-agreed short name and described using the aggregate schema. Examples:
Call for GPO-IG:
https://datastore.instageni.gpolab.bbn.com/aggregate/gpo-ig
Response:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/aggregate#", "id": "gpo-ig", "selfRef": "https://datastore.instageni.gpolab.bbn.com/aggregate/gpo-ig", "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+authority+cm", "ts": 1391192685740849, "measRef": "https://datastore.instageni.gpolab.bbn.com/data", "resources": [ { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc1", "href": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1" }, { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc2", "href": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc2" } ], "slivers": [ { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+sliver+26947", "href": "https://datastore.instageni.gpolab.bbn.com/slivers/instageni.gpolab.bbn.com_sliver_26947" } ] }
Data about a node
Nodes have an ID which is a URL-sanitized version of their URN and are described using the node schema. Examples:
Call for GPO-IG PC1:
https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1
Response:
{ "$schema": "http://unis.incntre.iu.edu/schema/20120709/node#", "id": "instageni.gpolab.bbn.com_node_pc1", "ts": 1391192705275101, "selfRef": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc1", "properties": { "ops_monitoring": { "mem_total_kb": 50331648 } }, "ports": [ { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth0", "href": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1%3Aeth0" }, { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth1", "href": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1%3Aeth1" }, { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth2", "href": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1%3Aeth2" }, { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth3", "href": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1%3Aeth3" } ] }
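The page does not spell out the "URL-sanitized" rule, so the following Python sketch is only my reading of the examples: drop the `urn:publicid:IDN+` prefix, join the remaining parts with underscores, and replace a `:` in the authority part with an underscore. The function names `urn_to_id` and `id_to_href` are hypothetical. Percent-encoding the ID when building an href reproduces the `%3A` seen in the interface hrefs above.

```python
from urllib.parse import quote

URN_PREFIX = "urn:publicid:IDN+"


def urn_to_id(urn):
    """Derive a datastore ID from a GENI URN (rule inferred from the examples)."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError(f"not a GENI URN: {urn!r}")
    authority, obj_type, name = urn[len(URN_PREFIX):].split("+", 2)
    # A sub-authority separator (":") in the authority part becomes "_".
    return "_".join([authority.replace(":", "_"), obj_type, name])


def id_to_href(base_url, collection, obj_id):
    """Percent-encode the ID so it is safe as a single URL path segment."""
    return f"{base_url.rstrip('/')}/{collection}/{quote(obj_id, safe='')}"


node_id = urn_to_id("urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc1")
port_id = urn_to_id("urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth0")
print(node_id)
print(id_to_href("https://datastore.instageni.gpolab.bbn.com/", "interface", port_id))
```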
Data about an interface
Interfaces have an ID which is a URL-sanitized version of their URN and are described using the port schema. Notes:
- I adopted the control/experimental terminology for interface roles from ProtoGENI listresources output. We could also use control/data; at any rate, we should be consistent among all monitoring uses.
- All bandwidths are total fiction. I didn't even count the zeroes.
Examples:
Call for pc1 eth0 (control) at gpo-ig:
https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1:eth0
Response:
{ "$schema": "http://unis.incntre.iu.edu/schema/20120709/port#", "selfRef": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1:eth0", "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth0", "ts": 1391194147100678, "id": "instageni.gpolab.bbn.com_interface_pc1:eth0", "address": { "type": "ipv4", "address": "192.1.242.140" }, "properties": { "ops_monitoring": { "role": "control", "max_bps": 10000000, "max_pps": 1000000 } } }
Call for pc1 eth1 (dataplane) interface at gpo-ig:
https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1:eth1
Response:
{ "$schema": "http://unis.incntre.iu.edu/schema/20120709/port#", "selfRef": "https://datastore.instageni.gpolab.bbn.com/interface/instageni.gpolab.bbn.com_interface_pc1:eth1", "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+interface+pc1:eth1", "ts": 1391194147100678, "id": "instageni.gpolab.bbn.com_interface_pc1:eth1", "address": { "type": "mac", "address": "aa:aa:aa:aa:aa:ab" }, "properties": { "ops_monitoring": { "role": "experimental", "max_bps": 10000000, "max_pps": 1000000 } } }
Data about a slice authority
GENI slice authorities are indexed by domain name and described using the authority schema. Examples:
Call for authority of ch.geni.net:
https://datastore.ch.geni.net/authority/ch.geni.net
Response:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/authority#", "id": "ch.geni.net", "selfRef": "https://datastore.ch.geni.net/authority/ch.geni.net", "urn": "urn:publicid:IDN+ch.geni.net+authority+ch", "ts": 1391192685740849, "users": [ { "urn": "urn:publicid:IDN+ch.geni.net+user+tupty", "href": "https://datastore.ch.geni.net/user/tupty" } ], "slices": [ { "urn": "urn:publicid:IDN+ch.geni.net:gpo-infra+slice+tuptyexclusive", "href": "https://datastore.ch.geni.net/slice/ch.geni.net_gpo-infra_slice_tuptyexclusive" } ] }
Data about a GENI user
GENI users have an ID based on the username and are described using the GENI user schema. Examples:
- tupty at ch.geni.net:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/user#", "id": "tupty", "selfRef": "https://datastore.ch.geni.net/user/tupty", "urn": "urn:publicid:IDN+ch.geni.net+user+tupty", "ts": 1391192685740849, "authority": { "urn": "urn:publicid:IDN+ch.geni.net+authority+ch", "href": "https://datastore.ch.geni.net/authority/ch.geni.net" }, "fullname": "Tim Exampleuser", "email": "tim@example.com" }
Data about a GENI slice
GENI slices have an ID based on the URN and are described using the GENI slice schema. Examples:
- tuptyexclusive slice:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/slice#", "id": "ch.geni.net_gpo-infra_slice_tuptyexclusive", "selfRef": "https://datastore.ch.geni.net/slices/ch.geni.net_gpo-infra_slice_tuptyexclusive", "urn": "urn:publicid:IDN+ch.geni.net:gpo-infra+slice+tuptyexclusive", "uuid": "8c6b97fa-493b-400f-95ee-19accfaf4ae8", "ts": 1391192685740849, "authority": { "urn": "urn:publicid:IDN+ch.geni.net+authority+ch", "href": "https://datastore.ch.geni.net/authority/ch.geni.net" }, "created": 1391626683000000, "expires": 1391708989000000, "members": [ { "urn": "urn:publicid:IDN+ch.geni.net+user+tupty", "href": "https://datastore.ch.geni.net/users/tupty", "role": "lead" } ] }
Data about a GENI sliver
GENI slivers have an ID based on the URN and are described using the GENI sliver schema. Examples:
- tuptyexclusive instageni sliver:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/sliver#", "id": "instageni.gpolab.bbn.com_sliver_26947", "selfRef": "https://datastore.instageni.gpolab.bbn.com/slivers/instageni.gpolab.bbn.com_sliver_26947", "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+sliver+26947", "uuid": "30752b06-8ea8-11e3-8d30-000000000000", "ts": 1391192685740849, "aggregate": { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+authority+cm", "href": "https://datastore.instageni.gpolab.bbn.com/aggregates/gpo-ig" }, "slice_urn": "urn:publicid:IDN+ch.geni.net:gpo-infra+slice+tuptyexclusive", "slice_uuid": "8c6b97fa-493b-400f-95ee-19accfaf4ae8", "creator": "urn:publicid:IDN+ch.geni.net+user+tupty", "created": 1391626683000000, "expires": 1391708989000000, "resources": [ { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc1", "href": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "is_exclusive": false }, { "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc2", "href": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc2", "is_exclusive": false } ] }
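To show how the sliver, slice, and user objects reference one another, here is a minimal Python sketch that answers "who should be contacted about slivers on a given node?". It works on trimmed copies of the examples above. The function name `contacts_for_node` and the in-memory lookup dictionaries are assumptions for illustration, not part of the schemas or of any datastore API.

```python
PC1_URN = "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc1"
PC2_URN = "urn:publicid:IDN+instageni.gpolab.bbn.com+node+pc2"
SLICE_URN = "urn:publicid:IDN+ch.geni.net:gpo-infra+slice+tuptyexclusive"
USER_URN = "urn:publicid:IDN+ch.geni.net+user+tupty"

# Trimmed copies of the sliver, slice, and user examples.
SLIVER = {
    "urn": "urn:publicid:IDN+instageni.gpolab.bbn.com+sliver+26947",
    "slice_urn": SLICE_URN,
    "resources": [
        {"urn": PC1_URN, "is_exclusive": False},
        {"urn": PC2_URN, "is_exclusive": False},
    ],
}
SLICE = {"urn": SLICE_URN, "members": [{"urn": USER_URN, "role": "lead"}]}
USERS_BY_URN = {USER_URN: {"fullname": "Tim Exampleuser", "email": "tim@example.com"}}


def contacts_for_node(node_urn, slivers, slices_by_urn, users_by_urn):
    """Return (email, role, sliver_urn) for members of slices with a sliver on node_urn."""
    contacts = []
    for sliver in slivers:
        on_node = any(res["urn"] == node_urn for res in sliver.get("resources", []))
        slice_obj = slices_by_urn.get(sliver["slice_urn"])
        if not on_node or slice_obj is None:
            continue
        for member in slice_obj.get("members", []):
            user = users_by_urn.get(member["urn"])
            if user is not None:
                contacts.append((user["email"], member["role"], sliver["urn"]))
    return contacts


print(contacts_for_node(PC1_URN, [SLIVER], {SLICE_URN: SLICE}, USERS_BY_URN))
```

In a real deployment the sliver would come from the aggregate's datastore and the slice and user from the authority's datastore, which is why the sliver carries the slice URN instead of embedding slice details.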
Measurements used for the use cases
Measurements have an opaque ID which is generated by the local datastore which serves them, and must be persistent, so that the caller has the option of asking for the measurement by ID. They are described using the data schema outlined above. Examples:
- CPU utilization metric on pc1:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "1", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:cpu_utilization", "description": "CPU utilization percentage", "units": "percent", "tsdata": [ { "ts": 1391198716651283, "val": 0.45 }, { "ts": 1391198776651284, "val": 0.44 }, { "ts": 1391198836651284, "val": 0.44 }, { "ts": 1391198896651284, "val": 0.47 }, { "ts": 1391198956651284, "val": 0.46 }, { "ts": 1391199016651285, "val": 0.47 } ] }
- Percentage of swap available on pc1:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "2", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:swap_free", "description": "Percentage of swap available", "units": "percent", "tsdata": [ { "ts": 1391198716651283, "val": 0.95 }, { "ts": 1391198776651284, "val": 0.95 }, { "ts": 1391198836651284, "val": 0.95 }, { "ts": 1391198896651284, "val": 0.95 }, { "ts": 1391198956651284, "val": 0.95 }, { "ts": 1391199016651285, "val": 0.95 } ] }
- Memory in active use on pc1:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "3", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:mem_active_kb", "description": "Memory in active use", "units": "kilobytes", "tsdata": [ { "ts": 1391198716651283, "val": 20030048 }, { "ts": 1391198776651284, "val": 20031148 }, { "ts": 1391198836651284, "val": 20031148 }, { "ts": 1391198896651284, "val": 22222222 }, { "ts": 1391198956651284, "val": 22222222 }, { "ts": 1391199016651285, "val": 22222222 } ] }
- Bytes per second received by pc1:eth0:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "4", "subject": "https://datastore.instageni.gpolab.bbn.com/ports/instageni.gpolab.bbn.com_interface_pc1:eth0", "eventType": "ops_monitoring:rx_bytes", "description": "bytes per second received on this interface", "units": "float", "tsdata": [ { "ts": 1391198716651283, "val": 2453.64 }, { "ts": 1391198776651284, "val": 800.2 }, { "ts": 1391198836651284, "val": 2400.3 }, { "ts": 1391198896651284, "val": 1984.3 }, { "ts": 1391198956651284, "val": 0 }, { "ts": 1391199016651285, "val": 0 } ] }
- Boolean metric indicating whether pc1 is available for use according to the aggregate responsible for it:
{ "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "5", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:is_available", "description": "is the subject node available, according to the aggregate", "units": "boolean", "tsdata": [ { "ts": 1391198716651283, "val": 1 }, { "ts": 1391198776651284, "val": 1 }, { "ts": 1391198836651284, "val": 1 }, { "ts": 1391198896651284, "val": 1 }, { "ts": 1391198956651284, "val": 0 }, { "ts": 1391199016651285, "val": 0 } ] }
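The `ts` values in these examples look like microseconds since the Unix epoch (under that reading, 1391198716651283 falls on 2014-01-31 UTC), although this page does not say so explicitly. Assuming that interpretation, here is a small Python sketch that finds when pc1 stopped being available in the is_available series above. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# tsdata from the is_available example above.
IS_AVAILABLE = [
    {"ts": 1391198716651283, "val": 1},
    {"ts": 1391198776651284, "val": 1},
    {"ts": 1391198836651284, "val": 1},
    {"ts": 1391198896651284, "val": 1},
    {"ts": 1391198956651284, "val": 0},
    {"ts": 1391199016651285, "val": 0},
]


def ts_to_datetime(ts_us):
    """Convert a timestamp assumed to be microseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(microseconds=ts_us)


def first_transition(tsdata, old, new):
    """Return the ts of the first sample where val changes from old to new, else None."""
    ordered = sorted(tsdata, key=lambda point: point["ts"])
    for prev, cur in zip(ordered, ordered[1:]):
        if prev["val"] == old and cur["val"] == new:
            return cur["ts"]
    return None


went_down = first_transition(IS_AVAILABLE, old=1, new=0)
print(went_down, ts_to_datetime(went_down).isoformat())
```

Using integer microseconds with `timedelta` avoids the rounding that can occur when dividing by 1e6 and passing a float to `datetime.fromtimestamp`.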
Bulk Data Queries
Results of queries on multiple eventTypes and object IDs are presented in list format. It is also advisable to provide timestamp filters with data queries; the format for timestamp filters is documented separately. Here is an example result for a query on the CPU utilization and active memory event types for nodes pc1 and pc2 of instageni-bbn:
[ { "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "1", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:cpu_utilization", "description": "CPU utilization percentage", "units": "percent", "tsdata": [ { "ts": 1391198716651283, "val": 0.45 }, { "ts": 1391198776651284, "val": 0.44 }, { "ts": 1391198836651284, "val": 0.44 }, { "ts": 1391198896651284, "val": 0.47 }, { "ts": 1391198956651284, "val": 0.46 }, { "ts": 1391199016651285, "val": 0.47 } ] } , { "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "1", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc2", "eventType": "ops_monitoring:cpu_utilization", "description": "CPU utilization percentage", "units": "percent", "tsdata": [ { "ts": 1391198716651283, "val": 0.45 }, { "ts": 1391198776651284, "val": 0.48 }, { "ts": 1391198836651284, "val": 0.45 }, { "ts": 1391198896651284, "val": 0.49 }, { "ts": 1391198956651284, "val": 0.50 }, { "ts": 1391199016651285, "val": 0.51 } ] } , { "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "3", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc1", "eventType": "ops_monitoring:mem_active_kb", "description": "Memory in active use", "units": "kilobytes", "tsdata": [ { "ts": 1391198716651283, "val": 30030048 }, { "ts": 1391198776651284, "val": 30031148 }, { "ts": 1391198836651284, "val": 30031148 }, { "ts": 1391198896651284, "val": 32222222 }, { "ts": 1391198956651284, "val": 32222222 }, { "ts": 1391199016651285, "val": 32222222 } ] } , { "$schema": "http://www.gpolab.bbn.com/monitoring/schema/20140131/data#", "id": "3", "subject": "https://datastore.instageni.gpolab.bbn.com/nodes/instageni.gpolab.bbn.com_node_pc2", "eventType": "ops_monitoring:mem_active_kb", "description": "Memory in active use", "units": "kilobytes", "tsdata": [ { 
"ts": 1391198716651283, "val": 20030048 }, { "ts": 1391198776651284, "val": 20031148 }, { "ts": 1391198836651284, "val": 20031148 }, { "ts": 1391198896651284, "val": 22222222 }, { "ts": 1391198956651284, "val": 22222222 }, { "ts": 1391199016651285, "val": 22222222 } ] } ]