
Version 6 (modified by lnevers@bbn.com, 8 years ago)

--

GENI Switch Consolidation Procedure

This page defines the steps required to update stitching for the PoP device consolidation taking place in Internet2 AL2S. The consolidation replaces existing AL2S Brocade devices with Juniper devices, and merges the two distinct devices that currently provide L2 and L3 services into a single Juniper device in locations where AL2S services exist. These steps outline the actions required at the GENI rack, at the AL2S AM, and at the SCS servers to incorporate the URN changes (due to port changes) that result from the consolidation.

The steps show examples based on details from a previous switch consolidation and its effect on the GENI stitching sites connected to that switch.

0. Generate Tickets and check for conflicts with upcoming ticketed GENI events

Make sure tickets are opened at GMOC for the events listing all affected GENI resources. Also make sure that GMOC generates corresponding requests to Internet2 Engineering (GRNOC). Tickets should notify operators and experimenters. Adam Williams will coordinate efforts for GMOC, but initial requests should go to the usual GMOC email for ticket requests.

Note that Internet2 schedules both an IP and an AL2S outage (usually on different days) for each PoP consolidation. The IP event has no related GENI URN work needed, and will simply result in the GENI resources being unreachable (because the entire device is disconnected). The GMOC should create tickets for both events, since they both have GENI impact, and the rack admins should see the tickets if they read their GENI operators email.

Internet2 won't change their schedule, but you should notify any conflicting events about the maintenance and work with them to avoid any impact as much as possible.

If the consolidation event runs longer than the scheduled tickets, be sure to email updates to the GMOC as soon as you know that will happen, and every 2 hours thereafter. If the event will continue to the next day, say so in your last ticket update, and tell them when you'll check in again the next day. (You don't have to send updates in between.)

If there are any significant problems during the event, be sure to escalate to Heidi Dempsey (hdempsey@bbn.com) while you work on them (in addition to noting them in the ticket).

1. Find Current Stitching Configuration

The GENI aggregate advertisement includes a stitching section which defines how VLANs are to be connected and which VLANs are associated with each stitching site. To determine the impact of a consolidation on stitching, start by collecting the AL2S advertisement and reviewing its stitching definitions:

   omni -a al2s listresources -o

Review the content of the stitching section in the output file rspec-al2s-internet2-edu.xml and check whether any sites are affected by the switch being consolidated.

For example there were several stitching endpoints for sdn-sw.newy32aoa.net.internet2.edu. Here is the list from an AL2S Advertisement:

 <stitch:node id="urn:publicid:IDN+al2s.internet2.edu+node+sdn-sw.newy32aoa.net.internet2.edu">
 <stitch:port id="urn:publicid:IDN+al2s.internet2.edu+stitchport+sdn-sw.newy32aoa.net.internet2.edu:eth1/1">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth1/1:iminds">
 <stitch:port id="urn:publicid:IDN+al2s.internet2.edu+stitchport+sdn-sw.newy32aoa.net.internet2.edu:eth5/2">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-og">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-eg">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-ig">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth5/2:host-gpolab">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth5/2:umass-eg">
 <stitch:port id="urn:publicid:IDN+al2s.internet2.edu+stitchport+sdn-sw.newy32aoa.net.internet2.edu:eth7/2">
 <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth7/2:nysernet-ig">

From the above, we will request updates only for the "<stitch:link id" entries; the "<stitch:port id" transitions are implicit. The New York switch list above covers 6 aggregates (2 InstaGENI, 2 ExoGENI, 1 OpenGENI, and 1 network aggregate, iMinds) and 1 fixed endpoint (host-gpolab).

In stitching, a fixed endpoint is a resource that is not a GENI aggregate but still supports stitching. Fixed endpoints are statically configured in the SCS servers to capture stitching information, and are generally set up for specific demonstrations or peering points.
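The review of the stitching section can be partly scripted. The following is a minimal sketch, not part of the GENI tools: it walks an advertisement and collects the id of every stitching link element that names a given switch. The function name `stitch_link_ids`, the placeholder namespace URI, and the abbreviated sample document are assumptions made for illustration; a real advertisement would be read from the file saved by omni.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample; the namespace URI is a placeholder, not the real one.
SAMPLE = """<rspec xmlns:stitch="urn:example:stitch">
  <stitch:node id="urn:publicid:IDN+al2s.internet2.edu+node+sdn-sw.newy32aoa.net.internet2.edu">
    <stitch:port id="urn:publicid:IDN+al2s.internet2.edu+stitchport+sdn-sw.newy32aoa.net.internet2.edu:eth7/2">
      <stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+sdn-sw.newy32aoa.net.internet2.edu:eth7/2:nysernet-ig"/>
    </stitch:port>
  </stitch:node>
</rspec>"""


def stitch_link_ids(rspec_xml, switch_hostname):
    """Return the ids of stitching link elements that mention switch_hostname."""
    root = ET.fromstring(rspec_xml)
    ids = []
    for elem in root.iter():
        # Compare on the local tag name so the namespace URI need not be known.
        local_name = elem.tag.rsplit("}", 1)[-1]
        link_id = elem.get("id", "")
        if (local_name == "link"
                and "+interface+" in link_id
                and switch_hostname in link_id):
            ids.append(link_id)
    return ids


for link_id in stitch_link_ids(SAMPLE, "sdn-sw.newy32aoa.net.internet2.edu"):
    print(link_id)
```

To use it on a saved advertisement, pass the contents of the output file (for example rspec-al2s-internet2-edu.xml) instead of `SAMPLE`.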

2. Define Stitching Configuration Changes

Review Internet2 announced changes for switch names and ports. Based on the information, identify the changes to be made to stitching definitions.

For example, using details from the consolidation email from Internet2 for the New York Switch:

 Old Hostname: sdn-sw.newy32aoa.net.internet2.edu
 New Hostname: rtsw.newy32aoa.net.internet2.edu
        'Old Interface'                       'New Interface'
 100GigabitEthernet1/1   100GE                   et-3/1/0.0
 100GigabitEthernet1/2   100GE                   et-3/3/0.0
 100GigabitEthernet3/1   100GE                   et-7/1/0.0
 100GigabitEthernet5/2   100GE                   et-7/3/0.0 
 100GigabitEthernet7/1   100GE                   et-4/1/0.0
 100GigabitEthernet7/2   100GE                   et-4/3/0.0
 10GigabitEthernet15/1   10GE                    xe-3/0/0.0
 10GigabitEthernet15/4   10GE                    xe-3/0/1.0
 10GigabitEthernet15/5   10GE                    xe-3/0/2.0
 10GigabitEthernet15/7   10GE                    xe-3/0/3.0

From the check of the AL2S stitching advertisement, we know that seven stitching sites are impacted by this URN transition. Define a list of the expected changes. The table below shows each of the transitions:

Old URN                                                New URN
sdn-sw.newy32aoa.net.internet2.edu:eth1/1:iminds       rtsw.newy32aoa.net.internet2.edu:et-3/1/0.0:iminds
sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-og       rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-og
sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-eg       rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-eg
sdn-sw.newy32aoa.net.internet2.edu:eth5/2:gpo-ig       rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-ig
sdn-sw.newy32aoa.net.internet2.edu:eth5/2:host-gpolab  rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:host-gpolab
sdn-sw.newy32aoa.net.internet2.edu:eth5/2:umass-eg     rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:umass-eg
sdn-sw.newy32aoa.net.internet2.edu:eth7/2:nysernet-ig  rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig
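The transition list can also be derived mechanically from the Internet2 announcement. The sketch below is illustrative only and assumes that a port written as "eth1/1" in a stitching URN is the port announced as "100GigabitEthernet1/1"; the names `PORT_MAP` and `translate` are hypothetical, and only the three ports used by stitching links on this switch are included.

```python
OLD_HOST = "sdn-sw.newy32aoa.net.internet2.edu"
NEW_HOST = "rtsw.newy32aoa.net.internet2.edu"

# Old port name as it appears in the URN -> new Juniper interface.
# Assumption: "ethX/Y" corresponds to "100GigabitEthernetX/Y" in the announcement.
PORT_MAP = {
    "eth1/1": "et-3/1/0.0",
    "eth5/2": "et-7/3/0.0",
    "eth7/2": "et-4/3/0.0",
}


def translate(old_suffix):
    """Map an old 'host:port:site' URN suffix to its new form."""
    host, port, site = old_suffix.split(":")
    if host != OLD_HOST:
        raise ValueError("unexpected host: " + host)
    if port not in PORT_MAP:
        raise ValueError("no announced mapping for port: " + port)
    return ":".join([NEW_HOST, PORT_MAP[port], site])


for old in ("sdn-sw.newy32aoa.net.internet2.edu:eth1/1:iminds",
            "sdn-sw.newy32aoa.net.internet2.edu:eth7/2:nysernet-ig"):
    print(old, "->", translate(old))
```

Raising an error on an unknown host or port is deliberate: a link on a port that Internet2 did not announce should be checked by hand rather than guessed.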

3. Request Stitching Changes from GENI Aggregates Operations Teams

URN transitions require coordination with various teams. Following are the teams/contributors that handle the transition based on the type of aggregate:

Note: All aggregate advertisements must be updated before the SCS servers. The SCS discovers the new stitching path information from the aggregates' stitching advertisements. The SCS is statically configured for fixed endpoints.

3a. Define Change Request Details

Based on the existing stitching information and the announced changes, generate a list of new link IDs to be used at each site. Following is an example from the New York transition, where the GPO IG and NYSERNet URN changes were requested from the InstaGENI team:

Link ID:          urn:publicid:IDN+instageni.gpolab.bbn.com+interface+procurve2:5.24.al2s
Remote Link ID:   urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-ig
VLAN Range:       3596-3600,3706-3732,3746-3749

Link ID:          urn:publicid:IDN+instageni.nysernet.org+interface+procurve2:1.19.al2s
Remote Link ID:   urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig
VLAN Range:       1700-1719

The GPO EG URN change was requested from the ExoGENI team:

Link ID:          urn:publicid:IDN+exogeni.net:bbnNet+interface+BbnNet:IBM:G8052:GigabitEthernet:1:2:ethernet
Remote Link ID:   urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-eg
VLAN Range:       3741,3736-3739

The GPO OG URN change was requested from the OpenGENI team:

Link ID:          urn:publicid:IDN+bbn-cam-ctrl-1.gpolab.bbn.com+interface+force10:3:al2s
Remote Link ID:   urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-og
VLAN Range:       2611-2630

The Wall2 iMinds URN change was requested from the iMinds team:

Link ID:          urn:publicid:IDN+wall2.ilabt.iminds.be+interface+c300b:0.12
Remote Link ID:   urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-3/1/0.0:iminds
VLAN Range:       1125-1164

AL2S Aggregate URN Change Request for GMOC:

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-3/1/0.0:iminds
Remote Link ID:   urn:publicid:IDN+wall2.ilabt.iminds.be+interface+c300b:0.12
VLAN Range:       1125-1164

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-og
Remote Link ID:   urn:publicid:IDN+bbn-cam-ctrl-1.gpolab.bbn.com+interface+force10:3:al2s
VLAN Range:       2611-2630

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-eg
Remote Link ID:   urn:publicid:IDN+exogeni.net:bbnNet+interface+BbnNet:IBM:G8052:GigabitEthernet:1:2:ethernet
VLAN Range:       3741,3736-3739

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:gpo-ig
Remote Link ID:   urn:publicid:IDN+instageni.gpolab.bbn.com+interface+procurve2:5.24.al2s
VLAN Range:       3596-3600,3706-3732,3746-3749

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig
Remote Link ID:   urn:publicid:IDN+instageni.nysernet.org+interface+procurve2:1.19.al2s
VLAN Range:       1700-1719

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:host-gpolab
Remote Link ID:   urn:publicid:IDN+gpolab.bbn.com+interface+switch:port:al2s
VLAN Range:       2646

Link ID:          urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.newy32aoa.net.internet2.edu:et-7/3/0.0:umass-eg
Remote Link ID:   urn:publicid:IDN+exogeni.net:umassNet+interface+umassNet:IBM:G8264:TenGigabitEthernet:1:1:ethernet
VLAN Range:       3581-3595
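Each link appears twice in the requests above: once from the site aggregate's point of view and once from the AL2S point of view, with Link ID and Remote Link ID swapped. A small sketch such as the following could generate both blocks from one record, so the two sides cannot drift apart. The function names `change_request` and `both_sides` are hypothetical and the layout simply mimics the blocks above.

```python
def change_request(link_id, remote_link_id, vlan_range):
    """Format one change-request block in the layout used on this page."""
    return ("Link ID:          {}\n"
            "Remote Link ID:   {}\n"
            "VLAN Range:       {}\n").format(link_id, remote_link_id, vlan_range)


def both_sides(aggregate_urn, al2s_urn, vlan_range):
    """Return (block for the site aggregate team, block for the AL2S aggregate)."""
    return (change_request(aggregate_urn, al2s_urn, vlan_range),
            change_request(al2s_urn, aggregate_urn, vlan_range))


site_block, al2s_block = both_sides(
    "urn:publicid:IDN+instageni.nysernet.org+interface+procurve2:1.19.al2s",
    "urn:publicid:IDN+al2s.internet2.edu+interface+"
    "rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig",
    "1700-1719",
)
print(site_block)
print(al2s_block)
```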

3b. Make Requests for Aggregate Updates

Send email to each of the following teams to request the above changes:

  • IG
  • EG
  • OG
  • iMinds
  • Internet2

As a courtesy, copy the rack admin contact(s) or email list from the Operators page on these requests. They don't have to take any action, but they may want to know that their racks will be potentially unable to stitch for a period of time during the scheduled outage.

Also copy the GENI Monitoring team (kathryn.wong1@uky.edu, cody@uky.edu and caylin@uky.edu). With the exception of the ATLA consolidation, this work should not require any immediate action for monitoring, but the folks at UKY may want to note the "retired" URNs in their database, and to pay extra attention to their monitoring site during these transitions.

Once the requested changes are completed, verify that they appear in the stitching advertisement of each GENI aggregate.

$ for i in gpo-ig gpo-og gpo-eg nysernet-ig al2s wall2 ; do stitcher listresources -a $i -o; done

Review all output files to verify that the correct URN is in place for each advertisement.
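The review of the output files can be supported by a simple text check. The sketch below is an assumption-laden illustration rather than an existing tool: it takes the text of one advertisement, the old switch hostname, and the new URNs expected in that advertisement, and reports which new URNs are missing and whether the old hostname still appears. The name `check_advertisement` is hypothetical.

```python
def check_advertisement(ad_text, old_host, expected_urns):
    """Return (list of expected URNs not found, True if old_host still appears)."""
    missing = [urn for urn in expected_urns if urn not in ad_text]
    return missing, old_host in ad_text


# Stand-in for the contents of one saved advertisement file.
ad_text = ('<stitch:link id="urn:publicid:IDN+al2s.internet2.edu+interface+'
           'rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig">')

missing, old_present = check_advertisement(
    ad_text,
    "sdn-sw.newy32aoa.net.internet2.edu",
    ["urn:publicid:IDN+al2s.internet2.edu+interface+"
     "rtsw.newy32aoa.net.internet2.edu:et-4/3/0.0:nysernet-ig"],
)
print("missing new URNs:", missing)
print("old hostname still present:", old_present)
```

An empty list of missing URNs together with no remaining old hostname suggests the aggregate has been updated; anything else is a prompt to look at the file by hand.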

InstaGENI Update Details

InstaGENI updates are made in one of the following ways:

  1. Ask geni-ops@googlegroups.com, which results in Hussam (Hussamuddin Nasir) running the commands below on the rack boss node.
  2. Or, a site contact may be asked to log in to the boss node and run these commands.
  3. Or, you can create an admin account on the boss node (via the web UI for the site, e.g. http://instageni.gpolab.bbn.com/ for gpo-ig) and, once it is approved, run the commands yourself.

Note: Options 2 and 3 are not likely to happen, as option 1 has always taken place as expected.

The InstaGENI changes are made to the external network definition for AL2S, which is used for the stitching configuration. Below is an old example of the commands used to modify the URN for the uwashington-ig external network. These commands are executed on the InstaGENI boss node:

 mysql tbdb -e 'update external_networks set external_interface="urn:publicid:IDN+al2s.internet2.edu+interface+rtsw.seat.net.internet2.edu:et-4/3/0.0:uwashington-ig" where network_id="al2s"'

 mysql tbdb -e 'update external_networks set external_wire="urn:publicid:IDN+al2s.internet2.edu+link+rtsw.seat.net.internet2.edu:et-4/3/0.0:uwashington-ig" where network_id="al2s"'

Note: Be aware of potential line wrapping pitfalls.
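One way to sidestep wrapping problems is to generate each command as a single string before pasting it. The following sketch only builds and prints the two commands; it does not run them. The table and column names are copied from the example above, the assumption that the wire URN is the interface URN with "+interface+" replaced by "+link+" is taken from that same example, and the function name `external_network_updates` is hypothetical.

```python
import shlex


def external_network_updates(interface_urn, network_id="al2s"):
    """Build the two mysql command lines for one external network (not executed)."""
    if any(ch.isspace() for ch in interface_urn):
        raise ValueError("URN contains whitespace; it was probably line-wrapped")
    if "+interface+" not in interface_urn:
        raise ValueError("expected an interface URN")
    # Assumption from the example above: the wire URN differs only in its type field.
    wire_urn = interface_urn.replace("+interface+", "+link+", 1)
    statements = [
        'update external_networks set external_interface="{}" '
        'where network_id="{}"'.format(interface_urn, network_id),
        'update external_networks set external_wire="{}" '
        'where network_id="{}"'.format(wire_urn, network_id),
    ]
    # shlex.quote wraps each statement in single quotes for the shell.
    return ["mysql tbdb -e " + shlex.quote(s) for s in statements]


for cmd in external_network_updates(
        "urn:publicid:IDN+al2s.internet2.edu+interface+"
        "rtsw.seat.net.internet2.edu:et-4/3/0.0:uwashington-ig"):
    print(cmd)
```

The whitespace check is the relevant guard here: a URN that picked up a space or newline while being copied from an email is rejected instead of being written into the database.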

4. Request SCS Servers Update

In order for GENI Network Stitching to pick up these path configuration changes, an SCS update must be run. There are two SCS systems, Production and Test.

The Production and Test SCS include stitching information for different sets of aggregates. To find out which SCS knows about which aggregates, issue the following GENI tools commands:

For the Production SCS:

 python ~/gcf/src/gcf/omnilib/stitch/scs.py --listaggregates --scs_url http://geni-scs.net.internet2.edu:8081/geni/xmlrpc >scs-prod

Look for the aggregates identified in the earlier steps. For example, for the New York switch consolidation effort, the 'listaggregates' function shows that the GPO IG, GPO EG, and NYSERNet IG sites are known to the Production SCS.

For the Test SCS:

python gcf/gcf-current/src/gcf/omnilib/stitch/scs.py --listaggregates --scs_url http://nutshell.maxgigapop.net:8081/geni/xmlrpc > scs-test

Look for the aggregates identified in the earlier steps. For example, for the New York switch consolidation effort, the 'listaggregates' function shows that the GPO IG, GPO EG, GPO OG, NYSERNet IG, iMinds, and UMass sites are known to the Test SCS.

Send a request to:

  • the GMOC to update the Production SCS
  • Xi to update the Test SCS.

5. Validate Updated Stitching

When the updates are completed for all Aggregates and for the SCS servers, testing takes place to verify the URN changes. Validation includes:

  1. Verify the advertisements for AL2S and for each GENI aggregate that was updated. If the new URN is missing from the stitching section, contact the appropriate aggregate team.
  2. Create stitched slivers with the Production SCS that use each of the rack aggregates that were updated and connect to a remote stitching site. Log in to one node for each sliver and leave some ping traffic running. _DO NOT_ delete these slivers; they are used later in monitoring verification. If the Production SCS reports an unknown path, contact Luke about updating the Production SCS.
  3. Create stitched slivers with the Test SCS, which can be done by using the omni/stitcher option "--scsURL https://nutshell.maxgigapop.net:8443/geni/xmlrpc", that use each of the rack aggregates that were updated and connect them to a remote stitching site. Log in to one node for each sliver and leave some ping traffic running. _DO NOT_ delete these slivers; they are used later in monitoring verification. If the Test SCS reports an unknown path, contact Xi about updating the Test SCS.
  4. Update the GENI aggregate page for each GENI aggregate (http://groups.geni.net/geni/wiki/GeniAggregate/) to capture the new stitching link details.
  5. Review the Operators page and replace any instances of the old URN.
  6. Update http://groups.geni.net/geni/wiki/GeniNetworkStitchingSites#GENIStitchingAL2SPathMappings to replace any modified interface information. See this example from the Salt Lake update:

     Router                  Interface  Site         VLAN Range
     salt.net.internet2.edu  eth7/1     utah-stitch  2100-3499

     With the new switch details:

     Router                  Interface  Site         VLAN Range
     salt.net.internet2.edu  et-4/3/0   utah-stitch  2100-3499

  7. GENI Monitoring URN validation. Log in to https://genimondev.uky.edu and use the search feature to find all data relating to the new AL2S switch, for example "rtsw.salt.net.internet2.edu". Make sure the following are returned:
    • a switch is listed with the new name "rtsw.salt.net.internet2.edu",
    • interface statistics are available for the new switch,
    • VLANs are being reported for the new switch.

  8. Report back on test findings and any outstanding or unresolved issues.

6. Update and close all tickets

Assuming all tests are successful, update and close all tickets by emailing the GMOC. If there are outstanding issues that are significant, leave the ticket open until they are resolved. If there are smaller outstanding issues, close the maintenance tickets, and open new tickets with the appropriate owners to track and resolve, ideally before the next maintenance.