Version 3 (modified by nriga@bbn.com, 4 years ago)

We are actively updating this tutorial. If you need help with it, please contact help@geni.net.

Distributed Computing on GENI: Hadoop in a Slice

Experiment Overview

In this tutorial you will create a slice of three virtual machines that together form a Hadoop cluster. The tutorial leads you through creating the slice, observing its properties, and running a Hadoop example that sorts a large dataset.

After completing the tutorial you should be able to:

  1. Use GENI to create a virtual distributed computational cluster.
  2. Create simple post-boot scripts that use basic template replacement.
  3. Use the basic functionality of a Hadoop filesystem.
  4. Observe the resource utilization of compute nodes in a virtual distributed computational cluster.
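
The second objective, template replacement in a post-boot script, can be sketched in a few lines. This is only an illustration: the template text, the placeholder names (self_name, master_name, master_ip), the render_post_boot helper, and the address are all hypothetical, and Python's string.Template stands in for whatever template syntax the tutorial's tooling actually uses.

```python
from string import Template

# Hypothetical post-boot script template. The $-placeholders and the
# node names/addresses used below are illustrative assumptions, not
# values taken from the tutorial.
POST_BOOT_TEMPLATE = Template("""#!/bin/bash
echo "$master_ip $master_name" >> /etc/hosts
echo "$self_name" > /etc/hostname
""")


def render_post_boot(self_name, master_name, master_ip):
    """Fill in the template for one node and return the script text."""
    return POST_BOOT_TEMPLATE.substitute(
        self_name=self_name,
        master_name=master_name,
        master_ip=master_ip,
    )


# Render the script for one (hypothetical) worker node.
script = render_post_boot("worker-0", "master", "172.16.1.1")
print(script)
```

The point of the technique is that one script body can be shared by every node in the slice, with per-node values filled in when the node boots.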

Prerequisites

To participate in this tutorial, you need an account at an InCommon institution or one issued by the GENI Project Office (GPO). If you have not done so before, please sign in to the GENI Portal.

Make sure you know which institution will provide you with access to GENI, and the username and password you need to authenticate. If you are unsure, please let us know.

It may be helpful for you to have access to your email.

In addition, this tutorial assumes you have experience using the GENI portal and Flack to create and log into virtual machines.

Tools

GENI Experimenter Portal

Where to get help

geni-users@googlegroups.com

Resources

  • Three virtual machines within a single ExoGENI rack.

Tutorial Instructions

  1. Design/Setup
  2. Execute
  3. Finish