[[PageOutline]]

== Project Number ==

1706

== Project Title ==

netKarma: GENI Provenance Registry[[BR]]
a.k.a. NETKARMA

=== Technical Contacts ===

'''PI:''' Beth Plale [mailto:plale@cs.indiana.edu][[BR]]
Chris Small [mailto:chsmall@cs.indiana.edu][[BR]]

=== Participating Organizations ===

[http://www.informatics.indiana.edu/ School of Informatics and Computing][[BR]]
Indiana University Bloomington[[BR]]

=== GPO Liaison System Engineer ===

[mailto:vthomas@geni.net Vic Thomas]

== Scope ==

The project collects and represents provenance for experiments conducted on the GENI platform. The provenance of an experiment is the relevant information collected at the experiment plane, control plane, and measurement plane. The provenance results will be available in a GENI Provenance Registry (netKarma) but can also be used to augment other collection mechanisms, for instance at the instrument level.

[http://pti.iu.edu/d2i/provenance_netKarma NetKarma] is based on [http://pti.iu.edu/d2i/provenance Karma], a provenance collection and representation service that has been used to collect provenance in diverse applications, including a satellite imagery pipeline (NASA funded), Linked Environments for Atmospheric Discovery (LEAD, NSF funded), and the Life Science Grid (Lilly Corp. funded).

=== Current Capabilities ===

The NetKarma Provenance Repository is an Axis2 Web service that resides on a server in the GENI Meta-Operations Center (GMOC), located at Indiana University. It has a WSDL access API so provenance can be retrieved programmatically. The persistent Axis2 Web service is available at http://netkarma.testlab.grnoc.iu.edu:8080/axis2/services/KarmaService. It provides both publish and query APIs for interacting with the provenance repository to capture and browse the provenance of GENI experiments. The current capabilities include the ingestion of provenance information from the Gush experiment control tool.
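
As a minimal sketch of programmatic access, the snippet below lists the operations declared in a WSDL document. It assumes the service description can be fetched by appending `?wsdl` to the service URL, which is the usual Axis2 convention but is not confirmed on this page for this deployment. The inline sample WSDL and its operation names (`addEvent`, `queryProvenance`) are hypothetical placeholders, not the actual Karma API.

```python
import xml.etree.ElementTree as ET

# Service endpoint taken from this page; the "?wsdl" suffix is the usual
# Axis2 convention and is an assumption for this deployment.
SERVICE_URL = "http://netkarma.testlab.grnoc.iu.edu:8080/axis2/services/KarmaService"
WSDL_URL = SERVICE_URL + "?wsdl"

# Namespace of WSDL 1.1 documents.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical stand-in for a fetched WSDL document; the operation names
# are illustrative only and are not the real Karma operations.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="KarmaService">
  <portType name="KarmaServicePortType">
    <operation name="addEvent"/>
    <operation name="queryProvenance"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return the sorted operation names declared in the WSDL portType elements."""
    root = ET.fromstring(wsdl_text)
    names = set()
    for port_type in root.iter("{%s}portType" % WSDL_NS):
        for op in port_type.findall("{%s}operation" % WSDL_NS):
            names.add(op.get("name"))
    return sorted(names)


if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

To try this against a live service, the WSDL text could be fetched with `urllib.request.urlopen(WSDL_URL).read()` and passed to `list_operations`, assuming the endpoint is reachable.
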
The NetKarma project produced the GENI Adaptor software, which parses Gush log files to obtain provenance information. Further information about the GENI Adaptor software is available at http://pti.iu.edu/d2i/provenance_netkarma. With these capabilities, the persistent Web service enables GENI experimenters to capture data about experiments, including time ordering and relationships within the experiment, changes made between runs, and relationships between the experiment and the control framework.

The current capabilities also include the ability to query the persistent Karma service for the provenance of GENI experiment executions. In turn, this will enable GENI experimenters to show such provenance information in visualizations such as Google Earth View.

We Need Your Help! :: Give us your GUSH logs. We'll mine them for provenance data and drop the provenance into the netKarma provenance repository. You'll get a message back with information that you can use to query for the graph of your GUSH run. Submit through (http://pti.iu.edu/d2i/provenance/submit-gush-log). We're working on customizations to the Cytoscape tool that you can download to visualize your provenance.

Planned capabilities: The project team is currently working to process logs from the Raven provisioning service to obtain provenance information. This new capability will enable GENI experimenters to capture provenance information such as experiment node locations, the time of deployment of software packages, and the versions of software deployed. The planned capabilities also include actively gathering information from the GMOC database. This will help GENI experimenters relate the provenance on the fly to network measurements. All this information will be accessible to the community through the persistent Axis2 Web service layer. Because provenance collection captures details about the experiment, it raises privacy issues, which we address by giving the experimenter control over when collection occurs.
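
To illustrate how time-ordered events from a run could become a graph for a tool like Cytoscape, here is a rough sketch. The event tuples, the `precedes` relation name, and the helper functions are hypothetical; the actual Gush log format and the GENI Adaptor's output are not described on this page. The output uses Cytoscape's Simple Interaction Format (SIF), in which each line gives a source node, an interaction type, and a target node.

```python
from datetime import datetime

# Hypothetical events: (ISO timestamp, node, action). Real Gush logs will
# look different; this shape is assumed purely for illustration.
EVENTS = [
    ("2010-07-20T10:00:05", "node2", "start_job"),
    ("2010-07-20T10:00:01", "node1", "connect"),
    ("2010-07-20T10:00:09", "node1", "finish_job"),
]


def time_ordered(events):
    """Return the events sorted by their timestamp."""
    return sorted(events, key=lambda e: datetime.fromisoformat(e[0]))


def to_sif(events, relation="precedes"):
    """Link each event to the next one in time, one tab-separated SIF line per edge."""
    ordered = time_ordered(events)
    labels = ["%s:%s" % (node, action) for _, node, action in ordered]
    lines = ["%s\t%s\t%s" % (a, relation, b) for a, b in zip(labels, labels[1:])]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_sif(EVENTS))
```

Writing the returned text to a `.sif` file should give something Cytoscape can import as a network. A real provenance graph would carry richer relations than simple time ordering.
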
Another planned capability is the visualization of the provenance graph generated from Gush experiment log files using the Cytoscape visualization tool (http://www.cytoscape.org).

=== Milestones ===

[[MilestoneDate(NETKARMA: S2.a overview)]][[BR]]
[[MilestoneDate(NETKARMA: S2.b webpage)]] [http://www.dataandsearch.org/provenance/?q=node/33 Click here for project web page][[BR]]
[[MilestoneDate(NETKARMA: S2.c identify tools)]][[BR]]
[[MilestoneDate(NETKARMA: S2.d demo)]] [attachment:"Netkarma_poster_gec7.pdf" Demo Poster] [[BR]]
[[MilestoneDate(NETKARMA: S2.e identify partners)]] See Q3 quarterly report for discussion.[[BR]]
[[MilestoneDate(NETKARMA: S2.f experience report)]][attachment:"NETKARMA-S2.f-experience-report.pdf" Experience Report][[BR]]
[[MilestoneDate(NETKARMA: S2.g user doc)]][attachment:"NETKARMA-S2.g-UserDoc.pdf" User Guide][[BR]]
[[MilestoneDate(NETKARMA: S2.h prototype)]][attachment:"NETKARMA-S2.h-Prototype.pdf" Prototype][[BR]]
[[MilestoneDate(NETKARMA: S3.a Demonstration and Outreach at GEC9)]] [[BR]]
[[MilestoneDate(NETKARMA: S3.b Plan for making provenance information available to experimenters)]] [attachment:NetKarma-S3.b.plans.pdf Plan] [[BR]]
[[MilestoneDate(NETKARMA: S3.c Demonstration and Outreach at GEC10)]] [[BR]]
[[MilestoneDate(NETKARMA: S3.d Updated plan for making provenance information available to experimenters)]] [[BR]]
[[MilestoneDate(NETKARMA: S3.e GEC11 demonstration and outreach)]] [[BR]]
[[MilestoneDate(NETKARMA: S3.f Deliver software and documentation)]] [[BR]]

== Project Technical Documents ==

[http://www.dataandsearch.org/provenance/?q=node/33 NetKarma project page] maintained by the [http://www.pti.iu.edu/d2i Data to Insight Center][[BR]]
[attachment:"NetKarma_GEC7 deliverable Plale-Small.pdf" netKarma: a tool for obtaining a provenance-based record of experimentation] (overview document)[[BR]]

=== Software ===

Code to capture provenance from GUSH: [http://pti.iu.edu/d2i/provenance_netkarma/gush2netKarma-1.0.tar.gz gush2netKarma-1.0.tar.gz]. This code does not store events in the Karma system; it simply writes the provenance to a file that is then used to create a visual graph.

=== Presentations and posters ===

[attachment: "GEC9_Poster_final.pdf" GEC9 poster][[br]]
[attachment: "GEC8_IU_NetKarma Poster8x11.pdf" GEC8 poster][[br]]
[attachment: "netKarma-update-20100720.pdf" GEC8 update][[br]]
[attachment:"NetKarma_PL_cluster_GEC7_slides.pdf" netkarma Presentation given at GEC7 Planetlab cluster meeting] [[BR]]
[attachment:"Netkarma_poster_gec7.pdf" GEC7 poster][[BR]]
[attachment:"NetKarma_Poster.pdf" GEC6 poster][[BR]]

=== Quarterly Status Reports ===

[wiki:netKarma-4Q09-status 4Q09 Status Report][[BR]]
[wiki:netKarma-1Q10-status 1Q10 Status Report][[br]]
[wiki:netKarma-2Q10-status 2Q10 Status Report][[br]]
[wiki:netKarma-GEC9 Status Report]

=== Spiral 2 Connectivity ===

This project will use existing Indiana University addresses and connections. No additional connectivity is needed, as the initial data sources are globally available on the Internet or over R&E networks.

=== Related Projects ===

Initial integration[[BR]]
GushProto [[BR]]
ProvisioningService (Raven) [[BR]]

Possible future integration[[BR]]
[wiki:PGTools PGTools] [[BR]]
DigitalObjectRegistry [[BR]]
[wiki:GENIMetaOps GMOC] [[BR]]
OnTimeMeasure [[BR]]