Version 36 (modified 7 years ago)


Hadoop in a Slice

We are actively updating this tutorial. If you need help with this tutorial, please contact:

Part I: Obtain Resources: create a slice and reserve resources


1. Establish the Environment

1.1 Pre-work: Ensure SSH keys are setup

Verify that you have at least one public key associated with your account. To do that, after you log in to the portal, check your Profile under the SSH keys tab. If you do not have any SSH keys associated yet, please follow the instructions on that tab of the Portal.
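Before visiting the portal, you can run a quick local sanity check for public keys on your own machine. This is a hedged sketch, not part of the tutorial itself: it assumes your keys live in the conventional `~/.ssh` location, which may not be true for your setup.

```shell
# List any local public keys you could associate with your portal account.
# ~/.ssh is only the conventional location; adjust if your keys live elsewhere.
if ls ~/.ssh/*.pub 2>/dev/null; then
  echo "Found public key(s) above."
else
  echo "No public keys found; you can generate one with: ssh-keygen -t rsa"
fi
```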

1.2 Configure Omni

If you have not installed and configured omni yet, please follow these instructions.

2. Obtain Resources

2.1 Create a Slice

Create a slice using omni and the slice name of your choice. From now on that slice name will be referred to as SLICENAME.

$ omni createslice SLICENAME 

2.2 Create your RSpec

We are going to use the graphical interface of the Portal (Jacks) to create the RSpec file for this tutorial, but we will use Omni to reserve the resources.

2.2.1 Load a simple topology in Jacks

  1. In the Portal, open the Slice page for the slice you just created. Notice that you created the slice with omni and it is available via the Portal.
  2. Press the Add Resources button to launch Jacks for this slice.
  3. From the Choose RSpec menu (see figure), select the URL button.
  4. Enter the URL for the RSpec:
    then click Select.
  5. After you click Select, two nodes will appear.

The topology you loaded has two node-types: master and worker. Let's see what each node-type is comprised of:

  1. Inspect the properties of the hadoop master. Note that not only the usual attributes (e.g. node name, node type) are set, but there are also custom OS Image, install and execute attributes. These attributes are what customize a generic GENI node into a specialized one, like a Hadoop master.
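For reference, the OS image, install, and execute attributes that Jacks sets are carried in the request RSpec roughly as below. This is a hedged sketch following the GENI v3 request schema; the client_id, sliver type, disk image name, and URL are placeholders, not this tutorial's actual values:

```xml
<!-- Sketch of one node in a GENI v3 request RSpec; names and URLs are placeholders. -->
<node client_id="master" exclusive="false">
  <sliver_type name="emulab-xen">
    <!-- custom OS image attribute -->
    <disk_image name="urn:publicid:IDN+example+image+hadoop-master"/>
  </sliver_type>
  <services>
    <!-- install: archive fetched and unpacked at install_path at boot -->
    <install url="http://example.org/hadoop-setup.tar.gz" install_path="/"/>
    <!-- execute: command run at boot; its argument is the worker count -->
    <execute shell="sh" command="/home/hadoop/ 3"/>
  </services>
</node>
```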

2.2.2 Create the topology (RSpec) for the experiment

Now we will create our Hadoop cluster. For the cluster we need:

  • 1 hadoop master
  • 2 or more workers
  • all the nodes need to be on the same Layer 2 domain and IP subnet
So let's go ahead and draw our topology:
  1. Select the worker node.
  2. Press the "Duplicate Nodes only" button.
  3. Repeat the above two steps so that you have at least three workers but no more than six (please be respectful to others and limit your cluster size, especially if you are doing this in the context of a tutorial).
  4. Delete the original worker node (select it and on the attributes on the left click the Delete button).
    DO NOT press the delete/backspace key since this is equivalent to pressing the Back button on the browser.
    Delete worker node
  5. Ensure that all your workers are named in the pattern `worker-i` with i = 0, 1, 2, ... (e.g. `worker-0, worker-1, worker-2`)
    WARNING: This step is important. If the workers are not named according to the above convention, your cluster WILL NOT be configured correctly.
  6. On the master node, make sure that the input number in the Execute script matches the number of workers; e.g. if you have 3 workers the entry should be
    /home/hadoop/ 3
  7. Connect all your nodes in one LAN
    1. Drag a line between two of your nodes
    2. From each unconnected node, draw a line to the small square in the middle of the link
  8. Assign IPs to your nodes:
    Click on the small square in the middle of your lan and set the IPs and netmask for each interface according to this list:
    • the netmask for all interfaces should be
    • master IP:
    • worker-i IP: 172.16.1.<10+i>, e.g. worker-0:, worker-1:, worker-5:, etc
      WARNING: Make sure the IPs are assigned according to the above pattern, your cluster will not be configured correctly otherwise
  9. Inspect your topology to ensure that all the configurations are correct and Download your rspec.
    NOTE: The rspec will be saved in your default download folder (usually ~/Downloads) under a name similar to slicename_request_rspec.xml
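The naming and addressing pattern from steps 5 and 8 can be sketched as a quick check. This is a hypothetical helper, not part of the tutorial; it just prints the name/IP pairs implied by the pattern worker-i → 172.16.1.<10+i>:

```shell
# Print the expected worker name / IP pairs for an N-worker cluster,
# following the pattern worker-i -> 172.16.1.<10+i> from the steps above.
N=3   # number of workers (three to six for this tutorial)
i=0
while [ "$i" -lt "$N" ]; do
  echo "worker-$i 172.16.1.$((10 + i))"
  i=$((i + 1))
done
```

For N=3 this prints worker-0 through worker-2 with addresses 172.16.1.10 through 172.16.1.12, which you can compare against the interfaces you configured in Jacks.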

2.3. Reserve your resources

You can use any tool to reserve this topology; today we are going to use Omni. To do that you will need to:

  1. Submit the request by running this omni command in a command line window:
    omni createsliver slicename rspec_filename -a AM_NICKNAME
  2. Wait until the slice is up by periodically running and checking the output of the command:
    readyToLogin slicename --useSliceAggregates
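Step 2 amounts to a poll-until-ready loop. Below is a minimal sketch; `check_ready` is a stand-in for `readyToLogin SLICENAME --useSliceAggregates` (the stub here just pretends the slice becomes ready on the third check, and the loop assumes readyToLogin exits non-zero until login information is available — verify that behavior for your omni version):

```shell
# Sketch of the wait loop. check_ready stands in for:
#   readyToLogin SLICENAME --useSliceAggregates
# (assumption: readyToLogin exits non-zero until the nodes are ready).
tries=0
check_ready() {
  # Stub for this sketch: pretend the slice becomes ready on the third check.
  tries=$((tries + 1))
  [ "$tries" -ge 3 ]
}
until check_ready; do
  echo "Slice not ready yet (attempt $tries); retrying..."
  sleep 1   # use a longer interval, e.g. 'sleep 30', against a real aggregate
done
echo "Slice ready after $tries checks."
```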


Next: Execute the Hadoop Experiment
