Changes between Version 1 and Version 2 of GENI-Infrastructure-Portal/MonitoringReqts
- Timestamp: 12/14/11 14:18:13
GENI-Infrastructure-Portal/MonitoringReqts
 v1   v2
 11   11    * Would like to ensure:
 12   12    * We don’t ignore any important pieces
 13       - * Architecture decisions reflect needs of monitoring so that GENI Clearinghouse, I&M, etc serve our needs
      13  + * Architecture decisions reflect needs of monitoring so that GENI
      14  + Clearinghouse, Instrumentation & Measurement, etc serve our needs
 14   15    * Where possible, we build tools which can be adapted to new software when it becomes available
 15   16    * Therefore we need to answer the following questions:
  …    …
 21   22
 22   23    * GENI Requires Monitoring
      24  + The following are some relevant requirements from the
      25  + [attachment:SysReqDoc/GENI-SE-SY-RQ-02.0.pdf GENI System Requirements Document] (July 7, 2009)
      26  +
 23   27    * 10.2-3 Visible operational status
 24   28    * The GENI system shall make sufficient data available that researchers and maintainers will be able to evaluate the availability and operational status of the system.
  …    …
 33   37    * Examples: ProtoGENI, !PlanetLab, Orca
 34   38    * Campuses
 35       - * Which host & run aggregates
      39  + * Which host & operate aggregates
 36   40    * Which only host aggregates
 37   41    * Backbone & Regional Networks
  …    …
 40   44    * Experimenters
 41   45
 42       - * Aggregates, Racks and the GENI CH
 43       - * Each GENI rack is a SINGLE aggregate
      46  + * Aggregates, Racks and the GENI Clearinghouse
      47  + * Each GENI rack is a ''single'' aggregate
 44   48    * Therefore requirements are the same as for aggregates
 45   49    * Aggregates can outsource (some of) their responsibilities to the GENI Clearinghouse
  …    …
 50   54    * Management
 51   55    * Act of fixing problems and responding to requests
 52       -
 53       -
 54       -
 55       -
 56       -
 57       -
 58       -
 59       -
 60       -
      56  + * What does monitoring & management involve?
      57  + * Observe unexpected events
      58  + * THEN fix what’s wrong
      59  + * Observe expected events
      60  + * THEN develop policy for fixing what’s wrong
      61  + * THEN fix what’s wrong (by responding to monitoring)
      62  + * Plan for the future
      63  + * Monitor long-term trends in resource usage
      64  + * THEN provision resources to meet forecasted needs
 61   65
 62   66    * What makes GENI different?
  …    …
 66   70    * Interactions between groups are governed by GENI federation agreements (e.g. aggregate provider agreement) and mutual understanding.
 67   71
 68       - * Motherhood & Apple Pie
 69       - * We are not covering:
 70       - * Monitoring and management which fits entirely within the purview of aggregates, campuses, etc
 71       - * For example, we will do (but not discuss here)
      72  + * The requirements below do not cover:
      73  + * Monitoring and management which fits entirely within the purview of aggregates, campuses, etc
      74  + * For example, we will do (but not discuss in these requirements) the following items
 72   75    * Keeping logs
 73   76    * Obeying local laws and policy
 74   77    * Answering the phone when someone has a concern
 75       - * … and tie your shoes and everything else.
 76       - * These things do NOT make GENI different
 77       - == Top-level aspects of GENI Monitoring and Management ==
 78       -
 79       - These are the top-level aspects of GENI monitoring and management...
 80       -
 81       - * Top-level Aspects of GENI Monitoring
      78  + * … and everything else.
      79  + * The above items are not included in the requirements below because they do NOT make GENI different
      80  + == GENI Monitoring and Management Requirements Summary ==
      81  +
      82  + These are the summary GENI monitoring and management requirements...
      83  +
      84  + A. GENI Monitoring Requirements
 82   85    1. Information must be shareable
 83       - 1. Information must be collected
 84       - 1. Information must be available when needed
 85       - 1. Cross-GENI operational statistics collected and synthesized to indicate GENI as a whole is working
 86       - 1. Preserve privacy of users (opt-in, experimenters, other users
      86  + 2. Information must be collected
      87  + 3. Information must be available when needed
      88  + 4. Cross-GENI operational statistics must be collected and synthesized to indicate GENI as a whole is working
      89  + 5. Preserve privacy of users (opt-in, experimenters, other users
 87   90    of resources)
 88   91
 89       - * Top-level Aspects of GENI Management
      92  + B. GENI Management Requirements
 90   93    1. For both debugging and security problems:
 91       - 1. Must be possible to escalate events
 92       - 1. Meta-operations and aggregate operators must work together to resolve problems in a timely manner
 93       - 1. Must be possible to do an emergency stop in case of a problem
 94       - 1. Orgs must manage GENI resources consistent with local policy and best practices
      94  + 1. Must be possible to escalate events
      95  + 2. Meta-operations and aggregate operators must work together to resolve problems in a timely manner
      96  + 2. Must be possible to do an emergency stop in case of a problem
      97  + 3. Organizations must manage GENI resources consistent with local policy and best practices
 95   98    * e.g security procedures, logging, backups, etc
 96       - 1. Develop policies for monitoring
 97       - 1. All parties should implement agreed upon policies
 98       - 1. Security of GENI as a whole and its pieces
 99       -
100       - == Breakdown of top level requirements ==
101       - Each of the top-level aspects of GENI monitoring and management,
      99  + 4. Develop policies for monitoring
     100  + 5. All parties should implement agreed upon policies
     101  + 6. Secure GENI as a whole and secure the pieces of GENI
     102  +
     103  + == Monitoring & Management Requirements Details ==
     104  + Each of the above GENI monitoring and management requirements,
102  105    is broken down in more detail below.
103  106
  …    …
106  109    [[Color(orange,partially implemented according to some plan or there exists an unimplemented plan)]], or [[Color(red,not implemented and no plan)]].
107  110
108       - * Requirement: Cross-GENI Monitoring
     111  + * A.4 Requirement: Cross-GENI Monitoring
109  112    * ''GENI monitoring is more than the sum of the monitoring at GENI’s parts. In order to know if GENI is working properly, additional monitoring is required beyond that done by each of its constituent pieces.''
110       - * Collect and synthesize additional operational statistics which indicate whether GENI is working
     113  + * Must collect and synthesize additional operational statistics which indicate whether GENI is working
111  114    * e.g. meso-scale ping tests, topology
112       - * Collect cross-GENI stats
113       - * Make cross-GENI stats available when needed
114       -
115       - * Requirement: Privacy
     115  + * Must collect cross-GENI stats
     116  + * Must make cross-GENI stats available when needed
     117  +
     118  + * A.5 Requirement: Privacy
116  119    * Preserve privacy of users (opt-in, experimenters, other users of resources)
117  120    * [[Color(red,TBD – This is an area needing major discussion)]]
118  121
119       - * Requirement: Troubleshooting & Event Escalation
     122  + * B.1 Requirement: Troubleshooting & Event Escalation
120  123    * For both debugging and security problems:
121       - * Meta-operations and aggregate operators must work together to resolve problems
     124  + * B.1.1 Must be possible to escalate events
     125  + * B.1.2 Meta-operations and aggregate operators must work together to resolve problems
122  126    * Aggregates must advertise resources accurately
123  127    * (threshold) statically --> [[Color(green,Fill out aggregate page)]]
  …    …
126  130    * Aggregates cooperate with meta-operations on the resolution of security events
127  131    * Aggregates cooperate with LLR on the resolution of security events
128       - * Must be possible to escalate events
129       -
130       -
131       - * Requirement: Emergency Stop
     132  +
     133  +
     134  +
     135  + * B.2 Requirement: Emergency Stop
132  136    * Must be possible to do an emergency stop in case of a problem
133  137    * Must maintain POC information at meta-operations
  …    …
140  144
141  145    * Requirement: Policy
142       - * Orgs must manage GENI resources consistent with local policy and best practices (e.g security procedures, logging, backups, etc)
     146  + * B.3 Organizations must manage GENI resources consistent with local policy and best practices (e.g security procedures, logging, backups, etc)
143  147    * In general, follow local policy and procedures
144  148    * Follow best practices which if not followed would affect other members of the GENI community
145       - * Develop policies for monitoring
146       - * All parties should implement agreed upon policies
     149  + * B.4 Develop policies for monitoring
     150  + * B.5 All parties should implement agreed upon policies
147  151    * Follow Aggregate Provider Agreement
148  152    * Follow LLR
149  153    * Follow other GENI policies as they come into effect
150  154
151       - * Requirement: Security
152       - * Security of GENI as a whole and its pieces
     155  + * B.6 Requirement: Security
     156  + * Secure GENI as a whole and secure the pieces of GENI
153  157    * Two things we want to prevent:
154  158    * Compromise of GENI resources
  …    …
162  166
163  167    * Requirement: Info must be shareable/collected/available
164       - * Information must be shareable
     168  + * A.1 Information must be shareable
165  169    * Consistent definitions of data
166  170    * Consistent data exchange format
  …    …
169  173    * The following benefit from shared common processes:
170  174    * Accessing data, finding data, visualizing data
171       - * Information must be collected
     175  + * A.2 Information must be collected
172  176    * Verify continued successful data collection
173  177    * Debug collection and reliability outages
174       - * Information must be available when needed
     178  + * A.3 Information must be available when needed
175  179    * Privacy of data must be maintained
176  180
  …    …
178  182    * Data Definitions
179  183    * Consistent definition of data
180       - * Relational data
     184  + * Relational data -- data which explains the relationship
     185  + between entities and resources
181  186    * Resources (incl. connectivity)
182  187    * List of aggregates
  …    …
184  189    * List of users
185  190    * Aggregate contact information
186       - * Timeseries data
     191  + * Timeseries data -- data collected repeatedly at a regular interval
187  192    * Examples: Host and network statistics
188       - * Events
     193  + * Event data -- data with information about a unique event
     194  + occuring at a single point in time
189  195    * Examples: SNMP Traps
190  196
  …    …
204  210
205  211    * General: Using Data
206       - * Sharing Data
207       - * --> [[Color(green,publish to central DB at GMOC)]]
208       - * --> [[Color(green,publish locally via webpage or local API)]]
209       - * --> [[Color(red,TBD: publish via a distributed mechanism)]]
210       - * Accessing, Finding and Visualizing Data
211       - * --> [[Color(green,GMOC Portals)]]
212       - * --> [[Color(green,GMOC SNAPP Interface (with search))]]
213       - * --> [[Color(green,GMOC data available to interested consumers via API)]]
214       - * --> [[Color(red,TBD: More to do here)]]
     212  + * Sharing Data is currently done in three ways
     213  + * Sharing Data --> [[Color(green,publish to central DB at GMOC)]]
     214  + * Sharing Data --> [[Color(green,publish locally via webpage or local API)]]
     215  + * Sharing Data --> [[Color(red,TBD: publish via a distributed mechanism)]]
     216  + * Accessing, Finding and Visualizing Data is currently done in four ways
     217  + * Accessing, Finding and Visualizing Data --> [[Color(green,GMOC Portals)]]
     218  + * Accessing, Finding and Visualizing Data --> [[Color(green,GMOC SNAPP Interface (with search))]]
     219  + * Accessing, Finding and Visualizing Data --> [[Color(green,GMOC data available to interested consumers via API)]]
     220  + * Accessing, Finding and Visualizing Data--> [[Color(red,TBD: More to do here)]]
215  221
216  222    * Other people who need data