Changes between Version 2 and Version 3 of OperationalMonitoring/Overview


Timestamp:
02/18/14 09:49:22 (10 years ago)
Author:
rirwin@bbn.com
Comment:

--

  • OperationalMonitoring/Overview

    v2 v3  
    11= Overview of Operational Monitoring =
     2
     3== Introduction ==
    24
    35The monitoring architecture is based on the concept of distributing sources of information in a common fashion.  The information (relational data or time-series data) is placed into what are called "Local Datastores".  These datastores share a common REST polling API for retrieving information.  The components that automatically retrieve data from the Local Datastores are called "Aggregators".
     
    2729
    2830Although there are multiple aggregators, a single monitoring application uses only a single aggregator.
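
The relationship between Local Datastores and an Aggregator can be sketched as follows. This is a minimal in-memory illustration, not the project's implementation: the class names, the `poll` method (standing in for a call to the REST polling API), and the `cpu_util` metric name are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# All names here are hypothetical; the real REST polling API is not shown.
Sample = Tuple[float, float]  # (timestamp, value)


@dataclass
class LocalDatastore:
    """Stands in for one Local Datastore; poll() mimics a REST polling call."""
    name: str
    series: Dict[str, List[Sample]] = field(default_factory=dict)

    def record(self, metric: str, ts: float, value: float) -> None:
        self.series.setdefault(metric, []).append((ts, value))

    def poll(self, metric: str, since: float) -> List[Sample]:
        # Return only samples strictly newer than `since`.
        return [s for s in self.series.get(metric, []) if s[0] > since]


@dataclass
class Aggregator:
    """Polls a fixed set of datastores and keeps what it has retrieved."""
    datastores: List[LocalDatastore]
    last_seen: Dict[Tuple[str, str], float] = field(default_factory=dict)
    collected: Dict[Tuple[str, str], List[Sample]] = field(default_factory=dict)

    def poll_all(self, metric: str) -> int:
        """Poll every datastore once; return the number of new samples."""
        new = 0
        for ds in self.datastores:
            key = (ds.name, metric)
            samples = ds.poll(metric, self.last_seen.get(key, float("-inf")))
            if samples:
                self.collected.setdefault(key, []).extend(samples)
                self.last_seen[key] = max(ts for ts, _ in samples)
                new += len(samples)
        return new
```

In this sketch a monitoring application would hold a reference to exactly one `Aggregator` and read from its `collected` data, while the aggregator itself may poll any number of datastores.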
     31
     32== Use Cases ==
     33
     34=== Detail of proposed monitoring system components for use case 3 ===
     35
     36Use case description: Track node compute utilization, interface, and health statistics for shared rack nodes, and allow operators to get notifications when those statistics are out of bounds.
     37
     38Use case implementation story: Node statistics are time-series data, and are either collected on the node and pushed to the compute aggregate, or polled from each node by the compute aggregate (which of the two does not matter for our purposes). Statistics end up in a local database on each rack. Any group of operators that wants to send notifications on these statistics runs an aggregator, which polls all racks of interest to that group. The aggregator shares current values with an alerting service, which sends alerts.
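
The last step of this story, where an alerting service checks current values against bounds, might look like the sketch below. It is illustrative only: the `Bounds` type, the `out_of_bounds` function, the node and metric names, and the threshold values are assumptions, and the actual alerting service and its message format are not specified here.

```python
from typing import Dict, List, NamedTuple, Tuple


class Bounds(NamedTuple):
    """Hypothetical inclusive limits for one metric."""
    low: float
    high: float


def out_of_bounds(current: Dict[Tuple[str, str], float],
                  bounds: Dict[str, Bounds]) -> List[str]:
    """Return one alert message per (node, metric) value outside its bounds.

    `current` maps (node, metric) to the latest value shared by the aggregator.
    Metrics without configured bounds are ignored.
    """
    alerts = []
    for (node, metric), value in sorted(current.items()):
        limit = bounds.get(metric)
        if limit is None:
            continue
        if not (limit.low <= value <= limit.high):
            alerts.append(
                f"{node}: {metric}={value} outside [{limit.low}, {limit.high}]"
            )
    return alerts


# Example with made-up values: only node1 exceeds the CPU utilization limit.
current_values = {
    ("rack-a/node1", "cpu_util"): 0.97,
    ("rack-a/node2", "cpu_util"): 0.40,
}
alerts = out_of_bounds(current_values, {"cpu_util": Bounds(0.0, 0.9)})
```

Keeping the bounds check separate from polling mirrors the split described above: the aggregator only gathers values, and the alerting service decides what counts as out of bounds.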
     39
     40{{{
     41#!html
     42
     43<table border="0" style="margin: 0 auto">
     44  <tr>
     45       <td> <img src="http://groups.geni.net/geni/attachment/wiki/OperationalMonitoring/Overview/use_case_3?format=raw" width="500" height="317"> </td>
     46  </tr>
     47  <caption align="bottom"> <b> Simple Rack Health Statistics </b></caption>
     48</table>
     49}}}