Changes between Version 2 and Version 3 of GENIMonitoring/Alerts


Timestamp:
05/12/15 16:23:11 (6 years ago)
Author:
cody@uky.edu
Comment:

--

  • GENIMonitoring/Alerts

    v2 v3  
    33= GENI Monitoring Alerts =
    44
    5 The GENI monitoring alerts system is based on the detection of events based on metric data that is polled from remote systems.  As part of the polling process, raw data is both recorded in a database and pushed to a queue.  The queue serves as a fanout interface for a one-to-many raw metric subscription service.
    65
    7 [[Image(https://www.rabbitmq.com/img/tutorials/python-three-overall.png)]]
     6The GENI monitoring alerts system detects events in metric data that is polled from remote systems.  Raw data is published to a queueing system, which allows multiple complex event queries to operate on the same data stream in parallel.  The output of a complex query can generate Nagios alerts, log results to a database, or both.
     7
     8== Poll to raw metric stream ==
     9As part of the polling process, raw data is both recorded in a database and pushed to a queue.  The queue serves as a fanout interface for a one-to-many raw metric subscription service.
     10
     11[[Image(https://www.rabbitmq.com/img/tutorials/python-three-overall.png)]]*
    812
    913In the previous figure, ''P'' represents our polling agent, which publishes data to a queue exchange represented by ''X''.  Clients, designated as ''C1'' and ''C2'', subscribe to an exchange by binding their own queues to it.  In the example, data published by ''P'' is replicated by ''X'' to client queues ''amq.gen-RQ6..'' for client ''C1'' and ''amq.gen-As8...'' for client ''C2''.
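
The fanout behavior described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: it does not talk to a real broker, and the class name `FanoutExchange`, the queue names, and the sample metric are hypothetical, not part of the GENI monitoring code.

```python
from collections import deque


class FanoutExchange:
    """In-memory stand-in for a fanout exchange (X in the figure)."""

    def __init__(self):
        self._queues = {}

    def bind(self, queue_name):
        """Create a client queue bound to this exchange and return it."""
        queue = deque()
        self._queues[queue_name] = queue
        return queue

    def publish(self, message):
        """Deliver an independent copy of the message to every bound queue."""
        for queue in self._queues.values():
            queue.append(dict(message))


exchange = FanoutExchange()
c1_queue = exchange.bind("c1-queue")  # hypothetical name for client C1's queue
c2_queue = exchange.bind("c2-queue")  # hypothetical name for client C2's queue

# The polling agent P publishes one raw metric sample (made-up values).
exchange.publish({"metric": "cpu_util", "value": 0.93})
```

After the publish call, both client queues hold their own copy of the sample, so each client can consume the raw metric stream without affecting the other.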
    1014
     15== Stream query of metric stream ==
    1116
    12 Alert data is obtained from a publish/subscribe queuing system that allows for pattern-based (matching) subscriptions. State transition and utilization data will be emitted on this queue.
     17The publish/subscribe queuing system allows streams of raw metric data to be replicated between many processes in parallel.  This allows us to instantiate one or more complex event processing engines (''CEPE'') per replicated data stream and one or more queries inside each CEPE.  We make use of the Esper [http://www.espertech.com/] CEPE.
     18
     19==== Esper Complex Event Processing Engine ====
     20Esper allows us to analyze large volumes of incoming messages or events, regardless of whether incoming messages are historical or real-time in nature. Esper filters and analyzes events in various ways, and responds to conditions of interest.  An example of the Esper CEPE architecture is shown in the figure below.
     21
     22[[Image(http://www.espertech.com/images/products_esp_cep.jpeg)]]**
     23
     24Simply put, ''CEPE queries'' are pattern-based (matching) subscriptions describing a possible future event. If the described event occurs, a described output is emitted from the CEPE.
     25
     26==== Esper Queries ====
     27
     28In a typical database, we query existing data using a declarative language.  We can think of an Esper query as an upside-down SQL query: the query is registered first, and results are emitted if matching events occur in the future.  Complex events are described using the Esper query language, ''EPL'', which is similar to SQL.
    1329
    1430
     31Consider the following EPL query: select count(*) from MyEvent(somefield = 10).win:time(3 min) having count(*) >= 5
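
To make the intent of this query concrete, the following Python sketch approximates it: keep only events whose `somefield` equals 10, hold them in a sliding 3-minute window, and report the count whenever it reaches 5 or more. It is a rough model, not Esper itself. The class name `SlidingCountQuery`, the use of explicit numeric timestamps in seconds, and the choice to evaluate only when a matching event arrives are all assumptions made for illustration; Esper's exact output timing and window-edge behavior may differ.

```python
from collections import deque

WINDOW_SECONDS = 3 * 60  # corresponds to win:time(3 min)
THRESHOLD = 5            # corresponds to having count(*) >= 5


class SlidingCountQuery:
    """Approximate model of the EPL query; assumes non-decreasing timestamps."""

    def __init__(self, window_seconds=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def on_event(self, timestamp, somefield):
        """Feed one event; return the window count if the query fires, else None."""
        if somefield != 10:  # filter: MyEvent(somefield = 10)
            return None
        self._timestamps.append(timestamp)
        # Drop events that are at least window_seconds older than this one.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        count = len(self._timestamps)
        return count if count >= self.threshold else None


query = SlidingCountQuery()
# Five matching events within two minutes: the fifth one triggers output.
results = [query.on_event(t, 10) for t in (0, 30, 60, 90, 120)]
# A matching event much later finds the window empty again, so nothing fires.
late_result = query.on_event(400, 10)
```

In this model, an alert handler would act on any non-`None` return value, which mirrors the idea that a result is emitted only once the described future condition actually occurs.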
    1532
    1633
    17 *Queue images are from RabbitMQ tutorials [https://www.rabbitmq.com/tutorials/tutorial-three-python.html]
     34*Image from RabbitMQ tutorial [https://www.rabbitmq.com/tutorials/tutorial-three-python.html]
     35**Image from Esper [http://www.espertech.com/]