Changes between Version 15 and Version 16 of GENIOperationsTrial/GENIStitchingCheck
Timestamp: 07/01/15 15:58:48
GENIOperationsTrial/GENIStitchingCheck
@@ -17,25 +17,31 @@ (v15 → v16)
  == 1.2 Steps for SCS - Server Check ==

-  1. Log in to [http://tamassos.gpolab.bbn.com/nagios3/ GPO Nagios System]
-  2. Select `Service Groups->Summary` in the left-hand navigation bar.
-  3. Look for `GENI Stitching Computation Services`, and select the `"Host Status Summary"` for the SCS check to see details.
+  1. Log in to [http://alerts.gpolab.bbn.com/nagios3/ GPO Nagios System]
+  2. Select `Host Groups->Summary` in the left-hand navigation bar. This will bring you to the following page: [[BR]] [[Image(Nagios-Host Groups Summary.png,80%)]]
+  3. Look for `GENI Stitching Computation Services`, and click on the `GENI Stitching Computation Services` link in the `Host Group` column to see details. This will bring you to the `Service Overview For Host Group` page. [[BR]] [[Image(Nagios-Service Overview For Host Group SCS.png,80%)]]
   4. The resulting page shows the SCS systems being monitored. There are currently 2 instances being monitored: the Internet2 SCS and the Test SCS at MAX.
-  5. Review the Internet2 SCS `scs-geni` status, which can be `UP` or `DOWN`.
-  6. Review the Test SCS `scs-geni-test` status, which can be `UP` or `DOWN`.
+  5. Review the Internet2 SCS `scs-geni` service status, which can be `UP`, `DOWN` or `PENDING`.
+  6. Click on the `scs-geni` link in the `Host` column. This will bring you to the `Service Status Details For Host scs-geni` page. [[BR]] [[Image(Service Status Details For Host scs-geni 1.png,80%)]]
+  7. Look for a service named `gpo:is_available` and confirm that the status is `UP`. [[BR]] [[Image(Service Status Details For Host scs-geni 2.png,80%)]]
+  8. Go back to the `Service Overview For Host Group` page reached in `Step 3`. Review the Test SCS `scs-geni-test` service status, which can be `UP`, `DOWN` or `PENDING`.
+  9. Click on the `scs-geni-test` link in the `Host` column. This will bring you to the `Service Status Details For Host scs-geni-test` page.
+  10. Look for a service named `gpo:is_available` and confirm that the status is `UP`.
+
+ ''Note:'' the host statuses themselves are expected to always be `PENDING`.

  == 1.3 SCS - Server Check - Pass Criteria ==

- If step 5 shows `UP`, then the Production Internet2 SCS check passes. A value of `UP` means that the SCS service is running and answering a `listaggregates` request. The monitoring system also verifies that the list of aggregates returned includes the expected sites.
+ If step 7 shows `UP`, then the Production Internet2 SCS check passes. A value of `UP` means that the SCS service is running and answering a `listaggregates` request. The monitoring system also verifies that the list of aggregates returned includes the expected sites.

- If step 6 shows `UP`, then the test SCS check passes. A value of `UP` means that the SCS service is running and answering a `listaggregates` request. The monitoring system also verifies that the list of aggregates returned includes the expected sites.
+ If step 10 shows `UP`, then the test SCS check passes. A value of `UP` means that the SCS service is running and answering a `listaggregates` request. The monitoring system also verifies that the list of aggregates returned includes the expected sites.

  == 1.4 SCS - Server Check - Fail Criteria and Escalation ==

- If step 5 shows `DOWN`, then the Production SCS check fails. A value of `DOWN` means that the SCS system is either not responding or responding with the wrong list of site aggregates.
+ If step 7 shows `DOWN`, then the Production SCS check fails. A value of `DOWN` means that the SCS system is either not responding or responding with the wrong list of site aggregates.

  '''__Escalation:__''' SCS status issues should be escalated to GMOC.


- If step 6 shows `DOWN`, then the check fails. A value of `DOWN` means that the test SCS system is either not responding or responding with the wrong list of site aggregates.
+ If step 10 shows `DOWN`, then the check fails. A value of `DOWN` means that the test SCS system is either not responding or responding with the wrong list of site aggregates.

  '''__Escalation:__''' Test SCS issues should be escalated to the MAX Development team.
…
@@ -52,5 +58,5 @@

   1. Log in to [http://alerts.gpolab.bbn.com/nagios3/ GPO Nagios System]
-  2. Select `Service Groups->Summary` in the left-hand navigation bar.
+  2. Select `Host Groups->Summary` in the left-hand navigation bar.
   3. Look for `GENI SCS Path Availability -> Service Status Summary` for the SCS check. This provides an overall status.
   4. Status can be `UP`, `DOWN` or `PENDING`. If some results are `DOWN` or `PENDING`, select each to see the details.
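
The pass, fail, and escalation rules of sections 1.3 and 1.4 (as revised in v16) can be summarized in a few lines of Python. This is only an illustrative sketch and not part of the monitoring system: the function name `evaluate_scs_check`, the `ESCALATION` table, and the `"inconclusive"` outcome are hypothetical. In particular, the procedure does not say how a `PENDING` service status should be handled, so treating it as inconclusive is an assumption. The status is whatever the operator reads from the `gpo:is_available` row in Nagios.

```python
from typing import Optional, Tuple

# Escalation targets as stated in section 1.4 of the procedure.
ESCALATION = {
    "scs-geni": "GMOC",                       # production Internet2 SCS
    "scs-geni-test": "MAX Development team",  # test SCS at MAX
}


def evaluate_scs_check(host: str, status: str) -> Tuple[str, Optional[str]]:
    """Map the gpo:is_available status of an SCS host to (outcome, escalation)."""
    if host not in ESCALATION:
        raise ValueError(f"unknown SCS host: {host}")
    status = status.strip().upper()
    if status == "UP":
        # Service answered listaggregates with the expected sites.
        return ("pass", None)
    if status == "DOWN":
        # Not responding, or responding with the wrong aggregate list.
        return ("fail", ESCALATION[host])
    # Assumption: PENDING (or any other value) is not covered by the
    # pass/fail criteria, so it is reported as inconclusive.
    return ("inconclusive", None)


if __name__ == "__main__":
    # Statuses are entered by hand after reading the Nagios page.
    for host, status in [("scs-geni", "UP"), ("scs-geni-test", "DOWN")]:
        print(host, evaluate_scs_check(host, status))
```

Keeping the escalation contacts in one table means that a change of contact only has to be made in a single place.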