GENICloud Project Status Report

Period: November 18, 2010 - March 18, 2011

I. Major accomplishments

Built, in collaboration with the iGENI project (Joe Mambretti and Tom DeFanti) and the G-Lab project (Paul Mueller), the TransCloud, a transcontinental cloud operating at four sites on two continents, connected by 10 Gb/s links over the CaveWAVE, National Lambda Rail, and the Global Lambda Integrated Facility (GLIF). Network connectivity is provided by the facilities at StarLight, NetherLight, and DFM. The TransCloud became operational on February 15, 2011, and is open for use by GENI, FIRE, and G-Lab researchers.

Demonstrated TransCloud at GEC-10, including a transcoding service and a prototype query system across distributed clusters.

A. Milestones achieved

Deployed and operated an SFA-compliant multi-site cloud infrastructure

B. Deliverables made

Developed an SFA-compliant Eucalyptus-based cloud manager, and added support in Eucalyptus for SFA 2.0

II. Description of work performed during last quarter

A. Activities and findings

  1. Designed and implemented a virtual Hadoop cluster over the wide area; specifically, each cluster component ran on a VM, with VMs residing on physical hosts at HP Palo Alto (OpenCirrus), Northwestern University, and Kaiserslautern (University collaborators).
  2. Built a proof-of-concept distributed query extension to the Pig declarative query language. Pig was developed to generate MapReduce jobs on Hadoop for querying data; this extension allows a query to declare multiple sites at which the data is processed, along with a site that collates the results.
  3. Ran a brief performance test using different VM images and found that pure CPU performance (not just Hadoop-specific performance) for each VM is highly dependent on the particular image, even when the images fit the typical partitioning of medium, large, and extra-large defined by Eucalyptus.
  4. Developed a sub-virtual machine isolation layer for Cloud Programming, based on Google Native Client (NaCl) and the Seattle project's Restricted Python (RePy).
  5. Brought up and installed a cluster monitoring infrastructure on each TransCloud cluster, based on the Ganglia cluster monitoring system.
  6. Designed, developed, and demonstrated a monitoring and visualization system for the progress of Hadoop jobs.
  7. Secured the domain name trans-cloud.net so that TransCloud slivers will be contained in this name domain.
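
The multi-site query pattern described in item 2 (process the data at several declared sites, then collate the partial results at one site) can be illustrated with a minimal Python sketch. This is not the Pig extension itself, whose syntax is not shown in this report; the site names, sample records, and the functions run_at_site, collate, and distributed_query are all hypothetical and exist only to show the scatter-then-collate idea.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-site data; site names and records are illustrative only.
SITE_DATA = {
    "site_a": ["video", "audio", "video"],
    "site_b": ["audio", "image"],
    "site_c": ["video"],
}


def run_at_site(site):
    """Stand-in for a query run locally at one site: count records by type."""
    return Counter(SITE_DATA[site])


def collate(partials):
    """Stand-in for the collation site: merge the per-site partial counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_query(sites):
    """Send the query to every declared site, then collate the results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partials = list(pool.map(run_at_site, sites))
    return collate(partials)


result = distributed_query(["site_a", "site_b", "site_c"])
print(dict(result))
```

In this sketch only the small per-site summaries travel to the collation site, which is the usual motivation for processing data where it resides when sites are connected over wide-area links.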

B. Project participants

HP

UCSD

Princeton

University of Victoria

C. Publications (individual and organizational)

  1. Chris Matthews, Justin Cappos, Yvonne Coady, John Hartman, Jonathan Jacky and Rick McGeer, "NanoXen : Better Systems Through Rigorous Containment and Active Modeling", OSDI 2010 (Poster).
  2. Rick McGeer, Alvin AuYoung, Andy Bavier, Jessica Blaine, Yvonne Coady, Joe Mambretti, Chris Matthews, Chris Pearson, Alex Snoeren, Marco Yuen, "TRANSCLOUD: Design Considerations for a High-Performance Cloud Architecture Across Multiple Administrative Domains", Proceedings CLOSER, 2011.
  3. Chris Matthews, Justin Cappos, Yvonne Coady, John Hartman, Jonathan Jacky and Rick McGeer, "NanoXen : Better Systems Through Rigorous Containment and Active Modeling", Proceedings SAVCBS, 2010.

D. Outreach activities

E. Collaborations

  1. iGENI, Prof. Joe Mambretti, Prof. Tom DeFanti
  2. Prof. Michael Zink, UMass (provided data repository for Hadoop application)
  3. Seattle project, Dr. Justin Cappos (collaborates on Cloud programming environments)
  4. PlanetLab, Prof. Larry Peterson

Other (non-GENI) collaborators

  1. Mathematics and Information Technology Applications in Complex Systems (MITACS), Government of Canada (co-sponsor). Duncan Phillips, collaborator contact
  2. G-Lab, Prof. Paul Mueller
  3. University of Amsterdam, Prof. Cees de Laat (provides connectivity and will join TransCloud as a site)
  4. VICCI, Prof. Larry Peterson

F. Other Contributions