Version 24 (modified by lnevers@bbn.com, 10 years ago) (diff)

GENI Racks Administration

This page captures a site administrator's perspective on GENI rack installation, maintenance, and support. Detailed questions and answers that other site admins have found helpful are also available in the InstaGENI FAQ and ExoGENI FAQ pages. If you cannot find the information you are looking for, please contact us at help@geni.net. The GPO also provides a real-time public IRC chat room, channel #geni on chat.freenode.net, where engineers are often available to help debug any issues you may encounter. See HowTo/ConnectToGENIChatRoom for details.

Before Your Rack Arrives

Each GENI site should provide the following high-level support for a GENI rack:

  • Provide space, power, security (as with other site resources)
  • Provide at least 1Gbps OpenFlow/SDN data path between the site rack and the upstream research I2/NLR network (10GE or 40GE is also possible, depending on the type of rack)
  • Provide an SDN path from the GENI rack to downstream campus subscribers who are interested in connecting to GENI (SDN paths are typically Layer2 Ethernet VLANs)
  • Operate with up-to-date GENI-specified software (e.g. AM API, OpenStack, Xen)
  • Provide no-cost access to rack resources for GENI authorized users at other campuses (Access is controlled by experimenters' and operators' individual GENI credentials)
  • Provide points of contact for the GENI response team, as specified in the Aggregate Provider Agreement. The points of contact support debugging, software updates, and Emergency Stop requests for the rack.

The Compute and OpenFlow (FOAM) aggregates in the rack require a small amount of ongoing attention from an administrator, even when there are no emergencies or outages. The amount of time you spend on these administrative tasks mostly depends on your site policy. To reduce the administrative workload, you may request to have the FOAM aggregate configured for auto approval of some or all OpenFlow experiment requests. If you do not request this, FOAM requires manual administrative approval for all requests. For more detail, see the FOAM administration introduction page.

When monitoring reports a problem with your rack (e.g. an apparent power failure), we will email your site contact mailing list to ask for help with resolving the issue. Occasionally the rack team may also ask for help from a site contact (e.g. confirming bad hardware). The rack teams currently handle all software upgrades to the rack without requiring help from a site administrator. Site contacts must notify the GMOC when there is a scheduled outage or a problem observed by the site. For instructions on reporting a problem, see the GENI GMOC Report a Problem page. There are additional GMOC support pages where you can view the Operations Calendars, Operations Bi-Weekly Reports, and existing Trouble Tickets.

InstaGENI Rack Deployment

The InstaGENI team sends an email with information about what's needed for the rack installation and warranty support to the main IT contact for a potential new rack site. Site contacts fill out an InstaGENI site questionnaire as soon as possible after ordering their rack. The questionnaire requests networking details and administrator account information that InstaGENI engineers need to pre-configure your rack. The team creates your initial administrator account, which provides access to all devices in the rack via SSH public keys, and allows you to create additional administrator accounts. The GPO coordinates additional site configuration and network integration activities for the deployment of your rack. You will be asked to play a role or provide information as part of the four activities below:

  1. The GPO will contact you to determine how your rack will connect to the GENI core networks. The GPO will help engineer your site's layer 2 data paths and the shared, exclusive, and stitching VLAN options to configure for your network connections. For details, see the Meso-scale Connection Requirements page. The GPO also adds a test point host to monitor your site's OpenFlow access. GMOC monitoring is set up to gather operational data about your site. There are no actions required of a site administrator to set up monitoring, but if you are interested in more details, see the GENI Operations Monitoring page. During this integration phase, your site will need to accept the Aggregate Provider Agreement.
  1. The GPO will execute experimenter, administrative and monitoring tests on each InstaGENI site. As part of these tests, you will be asked to add an initial administrator account for the GPO, and you will be asked to remove the account upon test completion. Instructions for adding and removing InstaGENI Administrative accounts can be found here. The InstaGENI New Site Confirmation Test Plan can be found here, and status for all InstaGENI Site Confirmation tests can be found here.
  1. As part of the pre-production activities, each site is asked to run tests to verify readiness for production. This testing opens your site to experimenters and verifies your ability to support them. Site administrative contacts also join the response-team@geni.net mailing list at this stage. (This is a GENI-wide mailing list whose members respond to issues with GENI production resources). Finally, the site works with the GMOC to provide emergency contact information, and transitions to being actively supported by the GMOC.
  1. Your site moves to production status, following the Production Release procedure. This includes four steps:

a) Site added to the GENI clearinghouse and to the Utah ProtoGENI clearinghouse,
b) Site added to the aggregates listed in the GENI Portal and in the Omni package,
c) Site marked as a production resource in monitoring, and
d) Site officially announced and tracked as production.

Note: For a list of sites currently in production, see the GENI Production Resources page.

InstaGENI Rack Maintenance

All rack maintenance activities are announced by the GMOC on the response-team@geni.net mailing list. The GMOC also provides an operations calendar for both scheduled and unscheduled maintenance activities here.

Currently, the InstaGENI team centrally handles updating software on all racks. Updates take place in a maintenance window each Friday at 3pm (Pacific).
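
Site scripts that need to avoid the weekly window can compute its next start time. The following is a minimal Python sketch, not part of the InstaGENI tooling. It assumes that "Pacific" means the America/Los_Angeles time zone and that the window begins at exactly 15:00 local time; the function name next_maintenance_window is hypothetical.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

# Assumption: "3pm (Pacific)" is interpreted as America/Los_Angeles local time.
PACIFIC = ZoneInfo("America/Los_Angeles")
WINDOW_START = time(15, 0)  # 3pm
FRIDAY = 4                  # datetime.weekday(): Monday is 0, Friday is 4


def next_maintenance_window(now: datetime) -> datetime:
    """Return the start of the next Friday 3pm Pacific window at or after `now`.

    `now` must be timezone-aware so it can be converted to Pacific time.
    """
    if now.tzinfo is None:
        raise ValueError("now must be a timezone-aware datetime")
    local = now.astimezone(PACIFIC)
    # Days until the coming Friday (0 if today is Friday).
    days_ahead = (FRIDAY - local.weekday()) % 7
    candidate = datetime.combine(
        local.date() + timedelta(days=days_ahead), WINDOW_START, tzinfo=PACIFIC
    )
    # If today is Friday and the window has already started, use next week.
    if candidate < local:
        candidate = datetime.combine(
            candidate.date() + timedelta(days=7), WINDOW_START, tzinfo=PACIFIC
        )
    return candidate


if __name__ == "__main__":
    start = next_maintenance_window(datetime.now(tz=ZoneInfo("UTC")))
    print("Next InstaGENI maintenance window starts:", start.isoformat())
```

Working in the Pacific zone rather than with a fixed UTC offset keeps the result correct across daylight saving changes.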

InstaGENI Rack Monitoring

To access operational data gathered for your site see the GMOC Live Database. Note that you will need an OpenID, InCommon, or GlobalNOC account to access your site data.

ExoGENI Rack Deployment

ExoGENI rack deployments proceed through the five activities below:

  1. An ExoGENI engineer will contact your site to collect information for configuring and installing your rack. The racks are built at IBM, and pre-configured at the IBM integration center. The ExoGENI rack team supports the initial rack installation remotely, once you've connected your new rack to power and external network cables. The ExoGENI operations team verifies connectivity to the rack components and completes detailed configuration for all rack components (e.g. storage and switches). They then configure ORCA and OpenFlow software, along with cloud and GENI federation software. You may be asked to provide some support for these activities.
  1. The GPO and ExoGENI teams will coordinate to integrate your site into the GENI network. For details, see the Meso-scale Connection Requirements page. The GPO also adds a test point host to monitor your site's OpenFlow access. During this integration phase, your site will need to accept the Aggregate Provider Agreement.
  1. The GPO will execute experimenter, administrative and monitoring tests on each ExoGENI site. As part of this test, the GPO requests an administrator account from the ExoGENI team by sending email to exogeni-ops@renci.org to request LDAP credentials. Note that only the ExoGENI team can add accounts to the central LDAP master and that administrative privileges are granted via sudo, which is dependent upon LDAP group membership. For more information, see the Rack Operators page.
  1. As part of the pre-production activities, each site is asked to run tests to verify readiness for production. This testing opens your site to experimenters and verifies your ability to support them. Site administrative contacts also join the response-team@geni.net mailing list at this stage. (This is a GENI-wide mailing list whose members respond to issues with GENI production resources). The site also works with the GMOC to provide emergency contact information, and transitions to being actively supported by the GMOC. Additionally, site contacts should register for the geni-orca-users@googlegroups.com mailing list.
  1. Your site moves to production status, following the Production Release procedure. This includes four steps:

a) Site added to the GENI clearinghouse and to the Utah ProtoGENI clearinghouse,
b) Site added to the aggregates listed in the GENI Portal and in the Omni package,
c) Site marked as a production resource in monitoring, and
d) Site officially announced and tracked as production.

Note: For a list of sites currently in production see the GENI Production Resources page.

ExoGENI Rack Maintenance

All rack maintenance activities are announced by the ExoGENI team on the geni-orca-users@googlegroups.com mail list. GMOC also provides an operations calendar for both scheduled and unscheduled maintenance activities here.

ExoGENI Rack Monitoring

To access operational data gathered for your site, see the GMOC Live Database. Note that you will need an OpenID, InCommon, or GlobalNOC account to access your site data.

ExoGENI racks also include a Nagios installation, which uses the Check_MK plugin to retrieve data, and WATO, Check_MK's Web Administration Tool. Each rack supports the WATO web interface for access to rack statistics. For example, Nagios information for the BBN rack can be accessed at https://bbn-hn.exogeni.net/rack_bbn/check_mk/. Links for your site vary with the site name: replace bbn in the URL with your site name to reach the monitoring data. For example, if your site is FIU, the URL is https://fiu-hn.exogeni.net/rack_fiu/check_mk/.
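
The URL pattern above can be captured in a small helper. This is an illustrative Python sketch only: the function name check_mk_url is hypothetical, and the lowercasing and alphanumeric check are assumptions inferred from the BBN and FIU examples rather than documented ExoGENI naming rules.

```python
def check_mk_url(site: str) -> str:
    """Build the Check_MK URL for an ExoGENI rack from its site name.

    Follows the pattern shown for the BBN and FIU racks:
    https://<site>-hn.exogeni.net/rack_<site>/check_mk/
    """
    name = site.strip().lower()
    # Assumption: site names are short ASCII alphanumeric labels such as "bbn".
    if not (name.isascii() and name.isalnum()):
        raise ValueError(f"unexpected site name: {site!r}")
    return f"https://{name}-hn.exogeni.net/rack_{name}/check_mk/"


if __name__ == "__main__":
    print(check_mk_url("FIU"))  # -> https://fiu-hn.exogeni.net/rack_fiu/check_mk/
```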

