[[PageOutline(1-2)]]

= OPS-002-A GENI Security Events =

This procedure describes how to handle Security Events, which can be reported by a site contact or an experimenter. This type of event usually __does not__ generate notifications to GENI users.

= 1. Issue Reported =

GMOC gathers technical details for the incident, including:
 - Requester Organization
 - Requester Name
 - Requester email
 - Requester GENI site-name
 - Resource information, which depends on the type of incident being reported.
 - Problem symptoms and impact to site or GENI.

== 1.1 GENI Event Type Prioritization ==

All security events are `Critical` and require immediate attention. GMOC dispatches the event to the rack team, which coordinates with the site contact and others involved.

== 1.2 Create Ticket ==

The GMOC ticketing system is used to capture issue information. GMOC may follow up to request additional information as the problem is investigated. Ticket creation results in an email notification to the reporter only; others may be added if deemed appropriate. Subsequent updates and interactions between GMOC and the reporter also generate notifications to the issue reporter.

= 2. Investigate and Identify Response =

== 2.1 Investigate the Problem ==

The following are several security event cases that may be encountered.

=== Case 1: Campus IT reports GENI rack involved in ongoing attack ===

__Event:__ A legitimate campus IT staff member, who may or may not be the site contact for the GENI site, reports that the GENI rack is involved in an ongoing attack.

__GMOC Tasks:__
 1. Gather details from the reporter, write a ticket with the reporter copied, and contact the proper rack team:
   * For InstaGENI racks contact GENI-OPS at geni-ops@googlegroups.com
   * For ExoGENI racks contact ExoGENI-OPS at exogeni-ops@renci.org
 2. Iterate with the rack team to ensure a resolution is reached and verify with the reporter that the issue is resolved.
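
The dispatch rule in step 1 above can be captured as a small lookup, which may be handy when scripting ticket notifications. This is only a minimal sketch: the names `RACK_TEAM_CONTACTS` and `rack_team_contact` are hypothetical and not part of any GMOC tool; only the rack types and contact addresses come from this procedure.

```python
# Hypothetical helper (not part of any GMOC tool): maps a rack type to the
# rack team contact listed in this procedure.
RACK_TEAM_CONTACTS = {
    "instageni": ("GENI-OPS", "geni-ops@googlegroups.com"),
    "exogeni": ("ExoGENI-OPS", "exogeni-ops@renci.org"),
}


def rack_team_contact(rack_type):
    """Return (team name, email) for a rack type, matched case-insensitively."""
    key = rack_type.strip().lower()
    if key not in RACK_TEAM_CONTACTS:
        # Unknown rack types are surfaced instead of being silently misrouted.
        raise ValueError(f"unknown rack type: {rack_type!r}")
    return RACK_TEAM_CONTACTS[key]


if __name__ == "__main__":
    team, email = rack_team_contact("InstaGENI")
    print(f"Dispatch to {team} <{email}>")
```

Raising an error for an unrecognized rack type is a design choice in this sketch; it keeps a critical security ticket from going to the wrong team.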

=== Case 2: GENI Control Resources used as part of an attack ===

__Event:__ A report is made that a GENI rack resource is actively being used in an attack. Furthermore, particular rack control hosts (perhaps ops, boss, foam, or flowvisor hosts) on the rack are identified as vulnerable to this attack. For the purposes of illustration, consider an NTP DoS attack.

__GMOC Tasks:__
 1. Gather details from the reporter, write a ticket with the reporter copied, and contact the proper rack team:
   * For InstaGENI racks contact GENI-OPS at geni-ops@googlegroups.com
   * For ExoGENI racks contact ExoGENI-OPS at exogeni-ops@renci.org
 2. Iterate with the rack team to ensure a resolution is reached and verify with the reporter that the issue is resolved.
 3. Notify other rack teams of the vulnerability and ensure that all rack systems are hardened to prevent further incidents.

=== Case 3: GENI Experiment Resources used as part of an attack ===

__Event:__ A report is made that a GENI rack experiment resource participated in an attack. Furthermore, based on the information received (perhaps a MAC, VLAN, or public IP address), it appears that a sliver (i.e. a VM) within a slice was responsible.

__GMOC Tasks:__
 1. Gather details from the reporter, write a ticket with the reporter copied, and contact the proper rack team:
   * For InstaGENI racks contact GENI-OPS at geni-ops@googlegroups.com
   * For ExoGENI racks contact ExoGENI-OPS at exogeni-ops@renci.org
 2. Iterate with the rack team to ensure a resolution is reached and verify with the reporter that the issue is resolved.

=== Case 4: Vulnerability reported for OS used by GENI Resources ===

__Event:__ A report is made about a vulnerability found in a particular operating system that is used in a GENI rack.

__GMOC Tasks:__
 1. Gather details from the reporter, write a ticket with the reporter copied, and contact the proper rack team:
   * For InstaGENI racks contact GENI-OPS at geni-ops@googlegroups.com
   * For ExoGENI racks contact ExoGENI-OPS at exogeni-ops@renci.org
 2. Follow up with the teams to ensure that a patch is deployed to address the vulnerability on each GENI resource using the OS in question.

=== Case 5: Unknown Security Incident ===

__Event:__ A report is made that does not fit any of the known security event cases.

__GMOC Tasks:__
 1. Gather details from the reporter, write a ticket with the reporter copied, and contact the proper rack team:
   * For InstaGENI racks contact GENI-OPS at geni-ops@googlegroups.com
   * For ExoGENI racks contact ExoGENI-OPS at exogeni-ops@renci.org
 2. Iterate with the rack team to ensure a resolution is reached and verify with the reporter that the issue is resolved.
 3. Work with other rack teams to determine whether their resources are vulnerable to the same security flaw and to ensure that all racks are hardened to prevent further instances of it.

== 2.2 Identify Potential Response ==

GMOC reviews the known security event cases, writes a ticket that copies the incident reporter, and dispatches the ticket to the rack team. GMOC follows up to verify that the problem is resolved for the site that originally showed the security flaw. Some coordination with other rack teams may be necessary to ensure that the same flaw does not exist in other GENI resources.

= 3. GMOC Response =

The GMOC may dispatch a problem to other organizations. The following table lists the organizations that provide support, by area of responsibility:

|| '''Team''' || '''Area of !Responsibility/Tools''' ||
|| GPO Dev Team || GENI Tools (gcf, omni, stitcher), GENI Portal, GENI Clearinghouse ||
|| RENCI Dev Team || ExoGENI Rack, ExoGENI Stitching ||
|| GENI Operations || InstaGENI Racks ||
|| UKY Operations Team || GENI Monitoring System ||
|| Utah Dev Team || InstaGENI, Jack Tool, !CloudLab, Emulab, Apt ||

== 3.1 Implement Response ==

In this section, the procedure must provide a simple-to-follow, step-by-step set of instructions to address the problem; the expected outcome of each step is also captured. The GMOC executes the steps outlined.

The response implementation may take a few iterations, as some attempts may not yield the expected results. GMOC may have to go back and try further actions in cases where new symptoms occur or where the procedure is found to be lacking. In both cases, an update to the procedures may be required, and action should be taken to get the procedures updated.

== 3.2 Procedure Updates ==

When instructions in a procedure are found to miss symptoms, required actions, or potential impact, the GMOC must provide feedback to enhance the procedure for future use.

= 4. Resolution =

GMOC verifies that the problem is no longer happening by coordinating with the problem reporter or by checking the tool/log that originally signaled the problem. For a scheduled event, the GMOC coordinates with the person who originally scheduled the event to make sure that it was completed successfully. Scheduled event tickets may also be postponed and remain open until the next scheduled time.

== 4.1 Document Resolution and Close Ticket ==

GMOC captures how the problem is resolved in the ticket and closes the ticket.
If the solution does not fully resolve the problem, a new ticket may be created to track the remaining issue. Whether the problem is fully resolved (ticket closed) or partially resolved (new ticket open), the outcome should result in a notification back to the problem reporter. For a scheduled event, the ticket may be closed, or rescheduled when the event cannot be completed in the scheduled time.

= Reference: Potential Actions for Security Incidents =

This section provides a set of potential actions to address a security incident. It is only a reference and is intended to capture past experiences with security events.

=== Case 1: Campus IT reports GENI rack involved in ongoing attack ===

 1. Work with the site admin to select the path of least resistance from the following options:
   a. Block suspicious traffic from the network (perhaps by ACLs on upstream devices).
   b. Disconnect the rack from the network (perhaps by configuring ports on the rack switches to be administratively down or disconnecting the physical media).
   c. Power down the rack.
 2. Assuming "b." or "c." was selected above, ensure that the GMOC notifies GENI users that the rack is offline.
 3. Use the subsequent cases as a reference to investigate this attack.

=== Case 2: GENI Control Resources used as part of an attack ===

 1. Work with the reporter of the attack (perhaps the site admin) to determine what test plan was used to identify hosts as vulnerable.
 2. If step 1 resulted in a test plan, use it to verify the security threat. If no test plan was obtained, develop a test plan to detect the vulnerability. Below is an example for the NTP DoS attack:
   - Identify which rack hosts are listening on UDP port 123 (i.e. the NTP port) by running nmap. The nmap "STATE" for these will be anything but "closed".
   - Run the sysstat command against all these hosts to identify which hosts are replying to rogue sources.
 3. Work with the rack team in question to harden the resulting hosts on all racks.

=== Case 3: GENI Experiment Resources used as part of an attack ===

 1. Assuming you have an administrator account at the rack, do the following:
   - Log in through the Emulab web portal (for example, assuming NPS is the site in question, the web portal is "instageni.nps.edu").
   - Click on the "green" dot next to the search bar to switch to administrative privileges.
   - Click on "Administration" and, from the drop-down menu, select "!ProtoGeni Slices".
   - In the resulting search bar, enter the MAC, IP, or VLAN ID in question.
   - Assuming your search is successful, click on the link below the resulting "HRN".
   - From the new page, obtain the contact information of the slice owner.
   - Contact the slice owner with the particulars of the attack (CCing the respective rack team).
   - Work with the slice owner and rack team on a resolution.
 2. If you do not readily have an administrator account, contact the respective rack team with the particulars. The rack team will subsequently contact the slice owner in an effort to resolve the issue.
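
For the NTP example in the Case 2 reference above, a quick scripted check may complement the nmap step when a first pass is needed over a list of control hosts. The sketch below is not the nmap or sysstat procedure itself: it is a substitute technique that sends an ordinary NTP client query to UDP port 123 and reports which hosts answer. It does not show whether a host is vulnerable, only that it responds to NTP. The function names and the host list are hypothetical, and it should only be run against hosts you are authorized to test.

```python
import socket

NTP_PORT = 123  # UDP port named in the Case 2 reference steps


def build_ntp_client_query():
    """Build a 48-byte NTP client request (LI=0, VN=3, Mode=3)."""
    packet = bytearray(48)
    packet[0] = (0 << 6) | (3 << 3) | 3  # first byte is 0x1B
    return bytes(packet)


def answers_ntp(host, timeout=2.0):
    """Return True if `host` replies to a standard NTP client query."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_ntp_client_query(), (host, NTP_PORT))
            data, _ = sock.recvfrom(512)
        except OSError:
            # Timeouts, unreachable hosts, and name resolution failures
            # are all treated as "no answer".
            return False
    # A well-formed NTP reply is at least 48 bytes long.
    return len(data) >= 48


def hosts_answering_ntp(hosts, timeout=2.0):
    """Return the subset of `hosts` that replied to the NTP query."""
    return [host for host in hosts if answers_ntp(host, timeout)]
```

A caller would pass the rack's control host names (a hypothetical list such as the ops, boss, foam, and flowvisor hosts) to `hosts_answering_ntp` and hand the result to the rack team as a starting point for the fuller test plan.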