Internet Scale Overlay Hosting Progress Report (1/1/2009-3/31/2009)

Project Activities this Quarter

Upgrading of Software for System Initialization

The PlanetLab project made significant changes to the software that manages the booting and initialization of PlanetLab nodes. Since the SPP relies on these mechanisms as well, we have had to modify our own software to accommodate PlanetLab's changes. As context, when a PlanetLab node comes up, it contacts PlanetLab Central (PLC) to obtain a boot image and a variety of configuration data. Since the GPEs in an SPP run the PlanetLab OS, they also attempt to contact PLC when they boot. However, the SPP is designed to appear to the external world as a single PlanetLab node, so it would be inappropriate to let the GPEs contact PLC directly. To avoid making unnecessary changes to the GPE software (which would make it difficult for us to track future changes to the PlanetLab code), we have implemented a software module running on the SPP's Control Processor (CP) that intercepts the connections between the GPEs and PLC, hiding the internal structure of the SPP from PLC while still getting the GPEs the information they need to complete their initialization. We have implemented and tested the changes to this module needed to stay consistent with PlanetLab.
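
To illustrate the interception idea only (this is a minimal sketch, not the actual SPP module; the host names, ports, and any message rewriting are placeholders), a relay of this kind on the CP might be structured as follows:

```python
# Sketch of a connection-interception relay (illustrative only).
# It accepts connections from GPEs on the SPP-internal network and relays
# them to PLC, so that PLC sees a single node while the GPEs still receive
# the boot and configuration data they need. PLC_HOST, the ports, and any
# rewriting of the exchange are placeholders, not the real implementation.
import socket
import threading

PLC_HOST = "plc.example.org"          # placeholder for PlanetLab Central
PLC_PORT = 443                        # placeholder port
LISTEN_ADDR = ("10.0.0.1", 8443)      # CP address visible to the GPEs (placeholder)

def pump(src, dst):
    """Copy bytes from src to dst until either side closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            # A real interceptor could rewrite requests and responses here
            # to hide the SPP's internal structure from PLC.
            dst.sendall(data)
    except OSError:
        pass
    finally:
        src.close()
        dst.close()

def handle(gpe_conn):
    """Open a connection to PLC and relay traffic in both directions."""
    plc_conn = socket.create_connection((PLC_HOST, PLC_PORT))
    threading.Thread(target=pump, args=(gpe_conn, plc_conn), daemon=True).start()
    threading.Thread(target=pump, args=(plc_conn, gpe_conn), daemon=True).start()

def main():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(LISTEN_ADDR)
    srv.listen(16)
    while True:
        conn, _addr = srv.accept()    # a GPE starting its boot sequence
        handle(conn)

if __name__ == "__main__":
    main()
```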

Integration of Slice Login Manager (SLM)

The SLM is a software module running on the SPP's CP that enables remote users to log in to their slices without having to be aware of the specific GPE on which their slice's vServer is running. Incoming SSH connections are directed by the SPP Line Card to the SLM, which handles authentication, determines which GPE is hosting the user's slice, and forwards the connection to that slice. Once the login procedure is complete, packets flowing between the user and the vServer are forwarded transparently at the OS level on the CP, without further application-level involvement.
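
As a rough sketch of the hand-off step only (assuming a Linux CP and using an iptables DNAT rule purely for illustration; the real SLM's slice lookup and forwarding mechanism may differ):

```python
# Illustrative sketch of the SLM hand-off idea (not the actual SLM code):
# after authenticating a user and finding the GPE that hosts the slice's
# vServer, install an OS-level forwarding rule so that subsequent packets
# bypass the application-level login manager. Assumes a Linux CP with
# iptables; the slice table, addresses, and ports are placeholders.
import subprocess

# Placeholder mapping from slice name to (GPE address, vServer SSH port).
SLICE_TO_GPE = {
    "demo_slice": ("192.168.1.11", 22),
}

def hand_off(user_addr: str, listen_port: int, slice_name: str) -> None:
    gpe_addr, gpe_port = SLICE_TO_GPE[slice_name]
    # Rewrite the destination of this user's packets so the kernel forwards
    # them straight to the hosting GPE (DNAT), with no per-packet work in
    # the login manager process itself.
    subprocess.run(
        ["iptables", "-t", "nat", "-A", "PREROUTING",
         "-p", "tcp", "-s", user_addr, "--dport", str(listen_port),
         "-j", "DNAT", "--to-destination", f"{gpe_addr}:{gpe_port}"],
        check=True,
    )

# Example with placeholder values:
# hand_off("128.252.19.5", 2201, "demo_slice")
```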

Software for Remote Display of Real-Time Performance Data

The NPE in the SPP provides an extensive collection of performance monitoring counters. These are maintained by the datapath software and are accessible to the xScale management processor. We have developed software that allows a remote user to monitor these counters and display their values in real-time charts, letting researchers observe what is happening "under the covers" as traffic flows through their fastpath on the NPE. The software consists of three components. First, there is a user interface implemented as a Java application that the user runs on a remote machine. This interface communicates with a monitoring daemon that runs within the user's vServer. The daemon accesses the performance data using an API provided by the Resource Manager Proxy (RMP), which runs in the root context on each GPE. The RMP, in turn, obtains the required data from a software module running on the xScale processor in the NPE.
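
A minimal sketch of the daemon's polling loop is shown below; the read_counters() placeholder stands in for the actual RMP API, whose interface and message format are not shown here.

```python
# Sketch of the in-slice monitoring daemon (illustrative only): poll the
# fastpath counters periodically and stream them as JSON lines to the
# remote charting application. read_counters() is a placeholder for the
# real RMP call; counter names and the wire format are assumptions.
import json
import socket
import time

POLL_INTERVAL = 1.0  # seconds between samples (placeholder)

def read_counters() -> dict:
    """Placeholder for the RMP call that returns the NPE counters."""
    return {"pkts_in": 0, "pkts_out": 0, "drops": 0, "queue_len": 0}

def serve(listen_port: int = 5001) -> None:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", listen_port))
    srv.listen(1)
    while True:
        conn, _addr = srv.accept()       # remote charting application connects
        try:
            while True:
                sample = {"time": time.time(), "counters": read_counters()}
                conn.sendall((json.dumps(sample) + "\n").encode())
                time.sleep(POLL_INTERVAL)
        except (BrokenPipeError, ConnectionResetError):
            conn.close()                 # chart disconnected; wait for the next one

if __name__ == "__main__":
    serve()
```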

Preparation for First Public Demonstration at GEC 4

We are planning our first public demonstration of the SPPs at GEC 4. This demonstration will involve two fully configured SPP nodes and will exercise all the control mechanisms needed to run an application. It will include the following elements:

(1) downloading slice definitions from PLC and instantiating slices on SPP nodes,
(2) logging in to a slice's vServer and establishing connections from the vServer to a remote machine to download files,
(3) configuring external ports that remote machines can use to connect to programs running within the slice's vServer,
(4) configuring a fastpath on the NPE, including configuration of logical interfaces, routes, filters, queues, buffers and network bandwidth (see the sketch following this list),
(5) running application traffic from remote machines through an overlay network consisting of three logical nodes on two different SPPs, and
(6) configuring and operating the performance monitoring software, demonstrating real-time remote display.
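
Purely as an illustration of element (4), a fastpath configuration might involve parameters along the following lines; the actual SPP configuration interface and parameter names are not shown in this report, so everything below is hypothetical.

```python
# Hypothetical illustration of the kind of parameters involved in
# configuring an NPE fastpath (element 4 of the demo); the real SPP
# interface and parameter names may differ.
fastpath_config = {
    "bandwidth_mbps": 100,                      # share of NPE/link bandwidth
    "interfaces": [                             # logical interfaces on the fastpath
        {"id": 0, "bandwidth_mbps": 50},
        {"id": 1, "bandwidth_mbps": 50},
    ],
    "routes": [                                 # overlay routes among logical nodes
        {"prefix": "10.1.0.0/16", "next_hop_if": 1},
    ],
    "filters": [                                # filters mapping traffic to queues
        {"match": {"proto": "udp", "dport": 7000}, "queue": 3},
    ],
    "queues": [{"id": 3, "buffers": 2000, "rate_mbps": 20}],
}
```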

Development of SPP Reservation System (work in progress)

We have designed and begun implementing a reservation system that will allow users to reserve SPP resources in advance. Currently, the SPP software supports an "immediate allocation" model of resource management: users can request the resources they need "right now", and the system ensures that finite resources are not over-booked, but there is no way to reserve resources ahead of time. The new system will allow users to make advance reservations on each SPP node through an API available to any program within a slice's vServer. The API will allow reservation of bandwidth on Line Card interfaces as well as NPE fastpaths and their associated resources (bandwidth, filters, queues, packet buffers, memory). The reservation system will enforce policies to prevent any one slice from monopolizing resources (e.g., reservations can be made at most two weeks in advance, and no slice can hold reservations totaling more than 10 hours on a given SPP). A command line utility will be provided that reads a reservation request from a file and invokes the API to reserve the requested resources. Implementing the reservation system requires changes to the Resource Manager Proxy (RMP), which runs on the GPEs and will implement the API, and to the System Resource Manager (SRM), which runs on the CP. It will also require changes to the Substrate Control Daemons in the Line Card and NPE. The new interfaces among these software components have been designed and much of the supporting code has been written. Once these interfaces are complete and tested, we will proceed with the actual reservation logic (ensuring that overlapping future reservations never exceed the available resources). In addition, we need to add mechanisms to save reservations to disk and read them back into the SRM during system initialization, so that reservations are preserved across CP crashes.
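
The following is a sketch of how an advance reservation might be represented and admission-checked; apart from the two-week and ten-hour policies stated above, all data structures, field names, and units are assumptions, not the actual RMP/SRM interface.

```python
# Illustrative sketch of advance-reservation checking (not the actual
# SRM/RMP code). A reservation asks for an amount of some resource (e.g.
# NPE bandwidth) over a time interval; admission requires that, at every
# point in time, overlapping reservations stay within capacity, and that
# the stated policies hold (at most two weeks ahead, at most 10 hours of
# total reservations per slice on a given SPP). Field names are assumed.
from dataclasses import dataclass

MAX_ADVANCE_SECS = 14 * 24 * 3600      # reservations at most two weeks ahead
MAX_SLICE_TOTAL_SECS = 10 * 3600       # at most 10 hours total per slice

@dataclass
class Reservation:
    slice_name: str
    start: float        # seconds since the epoch
    end: float
    amount: int         # units of the resource (e.g. Mb/s of bandwidth)

def admissible(req: Reservation, existing: list[Reservation],
               capacity: int, now: float) -> bool:
    """Return True if req can be granted without violating capacity or policy."""
    if req.end <= req.start or req.start - now > MAX_ADVANCE_SECS:
        return False
    # Policy: total reserved time for this slice on this SPP stays under 10 hours.
    slice_total = sum(r.end - r.start for r in existing
                      if r.slice_name == req.slice_name)
    if slice_total + (req.end - req.start) > MAX_SLICE_TOTAL_SECS:
        return False
    # Capacity: usage only changes at reservation start times, so it suffices
    # to check at req.start and at each start that falls inside req's interval.
    points = {req.start} | {r.start for r in existing
                            if req.start < r.start < req.end}
    for t in points:
        in_use = sum(r.amount for r in existing if r.start <= t < r.end)
        if in_use + req.amount > capacity:
            return False
    return True
```

The command line utility described above would read a request of this general form from a file and invoke the reservation API on the user's behalf.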

Facilitate Communication and Interaction Among Cluster B Participants

We have been working to foster interaction among Cluster B participants by organizing monthly phone conferences involving the PIs. These calls provide PIs an opportunity to report on status and share experiences. They also provide a vehicle through which GPO staff can stay in touch with PIs and communicate issues of concern to the GPO. While the level of interaction is currently somewhat limited, we expect more substantive interactions to develop over time. We also participated in a Cluster B integration meeting in Denver, which led to an agreement on a simpler model for interaction between GENI Clearinghouse components and Aggregate Managers.

Deployment Planning

We have been working to solidify plans for deployment of the first two GENI nodes within Internet2. Progress on this front has been distressingly slow, in large part because Internet2 appears ill-prepared to actually deliver on the resource commitments made to the GENI project, and I2 staff has been slow to respond to email discussions attempting to move the agenda forward. While the situation is not yet critical (we will not be ready to ship SPP nodes to I2 sites before summer), it is troubling that these issues are taking so long to resolve; after six months, we are no closer to resolution than we were when the project started in October of 2008.

System Integration/Bug Fixes

As with any substantial system project, this one has experienced a number of issues as we have gone through the system integration process. Most of these were ultimately traceable to software bugs that did not become apparent until various system components were put together and exercised in realistic usage scenarios. For example, we have experienced several issues relating to the Network Address Translation (NAT) module implemented in the Line Card. NAT requires coordinated activity among the Line Card data path, the NAT daemon running on the xScale, and the System Resource Manager (SRM), which runs on the CP. This creates opportunities for subtle errors that only become apparent when the system is being used. We have also experienced issues with the Shelf Manager, which provides low-level control of the ATCA chassis that hosts most of the system components (for example, the Shelf Manager can be used to power off or reset individual boards). These issues involve conflicting control actions by the Shelf Manager and the board that implements the chassis switch. We have worked with the vendor of the chassis switch (Radisys) to identify the problem and turn off the software component on the chassis switch that causes it, but we do not yet have a satisfactory resolution. Issues of this sort consume time and effort, but we are working through them and expect to have most, if not all, of them resolved over the next few months.

Milestones achieved

None yet.

Deliverables made

None yet.

Project participants

Jon Turner – PI
Patrick Crowley – PI
John DeHart – technical staff
Fred Kuhns – technical staff
Dave Zar – technical staff
Ken Wong – technical staff
Mike Wilson – graduate student
Mart Haitjema – graduate student
Ritun Patney – graduate student

Publications (individual and organizational)

None yet.

Outreach activities

None yet.

Collaborations

Have begun discussions with the OpenFlow group at Stanford on a demonstration of OpenFlow in a slice, enabling wide-area, multi-domain OpenFlow networks.

Have begun discussions with Larry Peterson of Princeton on how best to adapt the GENI wrapper being developed at Princeton for use with the SPPs.