= NetKarma Report 07/01/2010-11/05/2010 (GEC9) =

PI: Beth Plale, School of Informatics and Computing, Indiana University Bloomington [[BR]]
Co-PI: Chris Small, Global NOC, Indiana University

Summary
 * Documents: Experience Report, User Doc, Prototype
 * Karma 3.1.1 released

== 1. Major Accomplishments ==

We now host a persistent netKarma service and accept GUSH log files from the community. Please see http://pti.iu.edu/d2i/provenance_netkarma

=== A. Milestones Achieved ===

S2.f Experience Report for NetKarma Project: This report summarizes our experiences with provenance capture in GENI. It is available from the netKarma GENI Wiki page as [http://groups.geni.net/geni/attachment/wiki/netKarma/NETKARMA-S2.f-experience-report.pdf]

S2.g User Doc: The user guide describes the NetKarma Provenance Collection tool and gives developers who use the persistent Axis2 Web Service guidance on using the XML API to interact with the NetKarma repository. The user guide is available here: [http://groups.geni.net/geni/attachment/wiki/netKarma/NETKARMA-S2.g-UserDoc.pdf]

S2.h Prototype: During GEC9, the netKarma team gave a live demonstration of a !MapReduce graph coloring algorithm running in a PlanetLab slice on a set of nodes. Provenance from GUSH is captured, stored in the netKarma repository, and retrieved through the XML query API. A description can be found at: [http://groups.geni.net/geni/attachment/wiki/netKarma/NETKARMA-S2.h-Prototype.pdf]

=== B. Deliverables Made ===

Experience Report, User Doc, Prototype

== 2. Description of Work Performed Since Last Report ==

=== A. Activities and findings ===

 1. Karma 3.1.1 Released: The new release of the core Karma system uses the RabbitMQ enterprise messaging system for event ingest. Provenance data is efficiently stored in a relational database, and the system supports the Open Provenance Model (OPM) v1.1 standard for interfacing with the tool. Karma is available at http://pti.iu.edu/d2i/provenance_karma
 2. Engaged with the Instrumentation and Measurement Working Group. Talked at length with Larry Lannom about provenance, its representation, and how it can fit into the repository scheme he and his team envision.
 3. Worked with Luisa Nevers of BBN to run netKarma on a GUSH script she generated. After much back and forth, we determined that her mode of running GUSH (through the command line rather than using the XML file) generated a different log file, which caused problems for the netKarma Adaptor that parses the log file. The issue was resolved in September.
 4. Demo for GEC9: Successful live demonstration of a !MapReduce graph coloring algorithm running in a PlanetLab slice on a set of nodes. Provenance from GUSH is captured, stored in the netKarma repository, and retrieved through the XML query API. We examined other applications but discarded them. We considered using GUSH to install Codeen and Coblitz, as these are good parallel applications that run on PlanetLab, but determined that they come preinstalled on PlanetLab and so do not exercise GUSH as we need.
 5. Rewrote the code for the netKarma Adaptor so that it no longer creates code on the fly. This makes the code easier to maintain and build.
 6. Installed a version of Karma on the GMOC machine. This is the persistent netKarma service.
 7. Successfully built GUSH from the source code.

=== B. Project Participants ===

During this time, key participants in the NetKarma project included:

Beth Plale, PI [[BR]]
Chris Small, Co-PI [[BR]]
Mehmet Aktas, Postdoctoral Fellow [[BR]]
Devarshi Goshal, PhD student [[BR]]
Peng Chen, PhD student [[BR]]
You-Wei Cheah, PhD student [[BR]]
David Ripley, Technical Staff [[BR]]
Robert Ping, Project and Information Management

=== C. Publications & Documents ===

 * GEC 9 poster: GEC9_Poster_final.pdf
 * NetKarma Status Update during GEC8: http://groups.geni.net/geni/attachment/wiki/netKarma/netKarma-update-20100720.pdf
 * NetKarma Poster used at the GEC8 demo session: http://groups.geni.net/geni/attachment/wiki/netKarma/GEC8_IU_NetKarma%20Poster8x11.pdf
 * Spiral 2 Annual Review Slides: http://groups.geni.net/geni/attachment/wiki/netKarma/Spiral2ProjectReview_Netkarma-27Aug2010-2.pptx
 * NetKarma Provenance Repository Research Poster presented at GEC9: http://groups.geni.net/geni/attachment/wiki/netKarma/GEC9_Poster_final.pdf

=== D. Outreach Activities ===

=== E. Collaborations ===

Engaged with the Instrumentation and Measurement Working Group. Talked at length with Larry Lannom about provenance, its representation, and how it can fit into the repository scheme he and his team envision.

From the GRNOC, the main provenance information we will get is normalized data from each of the control clusters. The GMOC has already done much of the work of reconciling divergent data sets and placing them in the GMOC database. The GMOC is also collecting the status and topologies of the substrate and, increasingly, more views into the slice level of individual experiments.

Planned Activities

=== F. Other Contributions ===

During the annual review and subsequent communication with GENI program manager Vic Thomas, new milestones were created for the coming year:

Milestone a. GEC9 demonstration and outreach. Due 11/5/2010
 * NetKarma demonstration
   * Provenance collected from runs of GUSH and retrieved from a persistent netKarma server deployed at the GNOC at Indiana University
   * Obtain feedback from experimenters on the kinds of provenance information that will be useful
   * Identify a new source of provenance information: Global Research Network Operations Center

Milestone b. Plan for making provenance information available to experimenters. Due 1/7/2011
 * Document or wiki page with a plan for how provenance information will be provided to experimenters
 * Plan for how the provenance source from Milestone a will be used

Milestone c. GEC10 demonstration and outreach. Due 3/5/2011
 * Demonstration of a GENI experiment and display of provenance information for data collected by the experiment. The demonstration should include at least one new source of provenance information: GNOC
 * Get feedback from experimenters on the kinds of provenance information that will be useful
 * Identify at least one other source of provenance information

Milestone d. Updated plan for making provenance information available to experimenters. Due 4/15/2011
 * Document or wiki page updates on how provenance information is provided to experimenters
 * Description of how the additional source of provenance information identified in Milestone c will be used

Milestone e. GEC11 demonstration and outreach. Due July 2011
 * Demonstration of a GENI experiment and display of provenance information for data collected by the experiment. The demonstration should include at least one new source of provenance information
 * Get feedback from experimenters on the kinds of provenance information that will be useful
 * Identify at least one other source of provenance information

Milestone f. Deliver software and documentation. Due 8/26/2011
 * Documentation for experimenters on how to collect and use provenance information
 * NetKarma software and documentation
 * Description of how the additional source of provenance information identified in Milestone e will be used