
OPS-006-B GENI WiMAX Dataplane Debugging

This procedure defines the debugging steps to identify and remedy WiMAX Multi-point VLAN issues at GENI wireless sites. WiMAX issues may be reported by an experimenter, by the site contact, or by the GENI Monitoring System. Regardless of the source of the report, a ticket must be written to handle the investigation and resolution of the problem. The ticket must copy the issue reporter, the WiMAX development team at wimax-developer@winlab.rutgers.edu, and the GENI experimenters at geni-users@googlegroups.com.

1. Issue Reported

GMOC gathers technical details for failures including:

  • Requester Organization
  • Requester Name
  • Requester email
  • Requester GENI site-name
  • Slice Name, any site sliver details available
  • The AL2S VLAN id of the Multi-point VLAN at the requesting site.
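As a minimal sketch of the information listed above, the record below shows one way the gathered details and the required ticket copies could be represented. The class, field, and function names are hypothetical and are not part of the GMOC ticketing system; only the two mailing-list addresses come from this procedure.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Addresses this procedure says every ticket must copy.
WIMAX_DEVELOPERS = "wimax-developer@winlab.rutgers.edu"
GENI_USERS = "geni-users@googlegroups.com"


@dataclass
class WimaxIssueReport:
    """Details GMOC gathers for a WiMAX dataplane failure (hypothetical names)."""
    requester_organization: str = ""
    requester_name: str = ""
    requester_email: str = ""
    requester_site_name: str = ""
    slice_name: str = ""
    sliver_details: str = ""             # optional: only "if available"
    al2s_vlan_id: Optional[int] = None   # VLAN ID of the multi-point VLAN


def missing_fields(report: WimaxIssueReport) -> List[str]:
    """List required fields that are still empty, in declaration order."""
    optional = {"sliver_details"}
    missing = []
    for f in fields(report):
        if f.name in optional:
            continue
        value = getattr(report, f.name)
        if value is None or value == "":
            missing.append(f.name)
    return missing


def ticket_cc_list(report: WimaxIssueReport) -> List[str]:
    """Reporter plus the two lists, without duplicates or empty entries."""
    cc = []
    for address in (report.requester_email, WIMAX_DEVELOPERS, GENI_USERS):
        if address and address not in cc:
            cc.append(address)
    return cc
```

A check such as `missing_fields(report)` could be used to decide what additional information to request from the reporter when the ticket is created.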

1.1 GENI Event Prioritization

GMOC should classify a WiMAX dataplane failure as High priority. The issue may be deemed Critical if the person reporting it identifies it as Critical, for example if it impacts a demo, a training session, or a conference.

1.2 Create Ticket

The GMOC ticketing system is used to capture the information above. GMOC may follow up to request additional information as the problem is investigated. Creating the ticket results in the problem reporter receiving a ticket email for the reported issue and for subsequent interactions between GMOC and the reporter.

2. Investigate and Identify Response

2.1 Investigate the Problem

WiMAX Multi-point VLAN failures exhibit the following symptoms:

  • Experimenter is unable to exchange traffic on WiMAX multi-point VLAN at WiMAX Aggregate.
  • WiMAX site administrator notices a failure, or link down, for this VLAN on multi-point VLAN edge nodes.

2.2 Identify Potential Response

Start by trying to spot the traffic in the middle of the path, then work your way towards the problem. Gather as much information as you can on your own, using the tools and devices you have access to, such as the GENI Monitoring System.
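Starting in the middle and working towards the problem amounts to a bisection over the ordered hops of the path. The sketch below illustrates the idea under two assumptions: there is a single break, and some check is available that says whether the slice's traffic is visible at a given hop. The `traffic_seen` callable and the hop names are hypothetical stand-ins for whatever tool or device is actually consulted.

```python
from typing import Callable, Optional, Sequence, Tuple


def bisect_break(
    path: Sequence[str],
    traffic_seen: Callable[[str], bool],
) -> Optional[Tuple[str, str]]:
    """Return the adjacent (last_good, first_bad) hops around the break.

    Assumes traffic enters at path[0] and that there is one break:
    every hop before it sees the traffic and every hop after it does not.
    Returns None when traffic is visible at the far end of the path.
    """
    if not path or not traffic_seen(path[0]):
        raise ValueError("traffic is not visible at the source hop")
    if traffic_seen(path[-1]):
        return None
    lo, hi = 0, len(path) - 1  # invariant: path[lo] sees traffic, path[hi] does not
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if traffic_seen(path[mid]):
            lo = mid   # break is further along the path
        else:
            hi = mid   # break is at or before this hop
    return path[lo], path[hi]


# Hypothetical path and observations, for illustration only.
example_path = ["wimax-bs", "site-switch", "regional", "al2s", "remote-site"]
visible_at = {"wimax-bs", "site-switch"}
suspect = bisect_break(example_path, lambda hop: hop in visible_at)
```

With the example data, `suspect` is the pair `("site-switch", "regional")`, which is the segment to raise with the responsible operators.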

Isolate the breakage(s) to segments that you don't have visibility into. From there, you will need to work with the operators of the WiMAX aggregates and intermediary networks. You will want to have them:

  1. Try to verify that all switches in the path of the affected slice are up.
  2. Trace MAC addresses through the path by checking the MAC learning tables in the switches.

Note that in some VLANs, MAC address learning has been disabled throughout the path, and the operators will need to temporarily enable MAC address learning on those VLANs.
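The MAC trace described above can be pictured with the following sketch. It assumes the operators have already collected, for the affected VLAN, the set of MAC addresses each switch has learned; how those tables are obtained depends on the switch vendor and is not shown. A switch with no table (for example because MAC learning is disabled on the VLAN) is reported as unknown rather than as a failure. All names and data here are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC and drop common separators so formats compare equal."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")


def trace_mac(
    path: Sequence[str],
    tables: Dict[str, Optional[Set[str]]],
    mac: str,
) -> List[Tuple[str, str]]:
    """Report, hop by hop, whether the MAC was learned on the VLAN.

    tables maps a switch name to the MACs it learned on the affected VLAN,
    or to None when no table is available (e.g. learning is disabled).
    """
    target = normalize_mac(mac)
    result = []
    for switch in path:
        learned = tables.get(switch)
        if learned is None:
            status = "unknown"   # cannot tell until learning is enabled
        elif target in {normalize_mac(m) for m in learned}:
            status = "seen"
        else:
            status = "missing"   # frames from this MAC did not reach the switch
        result.append((switch, status))
    return result


# Hypothetical data, for illustration only.
example = trace_mac(
    ["sw-a", "sw-b", "sw-c"],
    {"sw-a": {"00:11:22:AA:BB:CC"}, "sw-b": None, "sw-c": {"0011.22aa.bb00"}},
    "00-11-22-aa-bb-cc",
)
```

The first hop reported as missing after a run of seen hops marks the segment where the traffic is being lost; unknown hops indicate where MAC learning would need to be enabled temporarily before a conclusion can be drawn.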

3. GMOC Response

The GMOC implements the actions identified in the procedure response section and updates the ticket to capture actions taken.

3.1 Implement Response

The GMOC executes the steps outlined. If GMOC finds the procedure to be lacking, steps should be taken to get the procedure updated.

3.2 Procedure Updates

When instructions in a procedure are found to miss symptoms, required actions, or potential impact, then action must be taken by the GMOC to provide feedback to enhance the procedure for future use.

4. Resolution

GMOC verifies that the problem is no longer happening by coordinating with the problem reporter or by checking the tool/log that originally signaled the problem.

4.1 Document Resolution and Close Ticket

GMOC captures how the problem was resolved in the ticket and closes the ticket. If the solution does not fully resolve the problem, a new ticket may be created to track the remaining issue.

Whether the problem is fully resolved (ticket closed) or partially resolved (new ticket open), either outcome should result in a notification back to the problem reporter.

Last modified on 06/18/15 08:13:37