
Hadoop in a Slice

Now that you have reserved your resources, you are ready to login to your nodes and run some Hadoop examples.

1. Configure the Hadoop cluster

1.1. Login to Hadoop Master

  1. Login (ssh) to the master using the output of the readyToLogin script, or gather the necessary login information from whichever tool you used to make the reservation. The ssh client you use will depend on the configuration of your laptop/desktop.
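    For example (a minimal sketch; the key path, username, and hostname are placeholders, so substitute the values reported by readyToLogin or whichever reservation tool you used):
      ssh -i ~/.ssh/<your-private-key> <username>@<master-hostname>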

1.2. (Optional) Look around your node.

  1. Observe the properties of the network interfaces.
    # ifconfig 
    ens3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 10.103.0.17  netmask 255.255.255.0  broadcast 10.103.0.255
            inet6 fe80::f816:3eff:fe31:db4  prefixlen 64  scopeid 0x20<link>
            ether fa:16:3e:31:0d:b4  txqueuelen 1000  (Ethernet)
            RX packets 99237  bytes 12928515 (12.3 MiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 55213  bytes 44478177 (42.4 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.16.1.1  netmask 255.255.255.0  broadcast 172.16.1.255
            inet6 fe80::fc16:3eff:fe00:5b82  prefixlen 64  scopeid 0x20<link>
            ether fe:16:3e:00:5b:82  txqueuelen 1000  (Ethernet)
            RX packets 765818  bytes 1450907680 (1.3 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 511355  bytes 42217300 (40.2 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            inet6 ::1  prefixlen 128  scopeid 0x10<host>
            loop  txqueuelen 0  (Local Loopback)
            RX packets 13502  bytes 1189668 (1.1 MiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 13502  bytes 1189668 (1.1 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
  2. Observe the contents of the NEuca user data file. This file includes a boot script that downloads and executes the configuration script you specified for the VM.
    # neuca-user-data 
    [global]
    actor_id=b0e0e413-77ef-4775-b476-040ca8377d1d
    slice_id=e61547ac-a45d-4228-8483-dfe51945288d
    reservation_id=7163823e-cf26-45ec-960b-a833522d2693
    unit_id=8679cd5d-7cf2-4fea-9f51-6832e6ce0358
    ;router= Not Specified
    ;iscsi_initiator_iqn= Not Specified
    slice_name=urn:publicid:IDN+ch.geni.net:GRW15Illinois+slice+testhadoop
    unit_url=http://geni-orca.renci.org/owl/f3427516-3356-4525-97a6-752df5a7ec68#master
    host_name=master
    management_ip=128.120.83.37
    physical_host=ucd-w4
    nova_id=b7a54b3d-1410-4540-b57f-e5cf4827b3c3
    [users]
    ....
    [interfaces]
    fe163e005b82=up:ipv4:172.16.1.1/24
    [storage]
    [routes]
    [scripts]
    bootscript=#!/bin/bash
    	# Automatically generated boot script
    	# wget or curl must be installed on the image
    	mkdir -p /home/hadoop/
    	cd /home/hadoop/
    	if [ -x `which wget 2>/dev/null` ]; then
    	  wget -q -O `basename http://geni-images.renci.org/images/tutorials/GENI-hadoop/hadoop_config_dynamic.sh.tgz` http://geni-images.renci.org/images/tutorials/GENI-hadoop/hadoop_config_dynamic.sh.tgz
    	else if [ -x `which curl 2>/dev/null` ]; then
    	  curl http://geni-images.renci.org/images/tutorials/GENI-hadoop/hadoop_config_dynamic.sh.tgz > `basename http://geni-images.renci.org/images/tutorials/GENI-hadoop/hadoop_config_dynamic.sh.tgz`
    	fi
    	fi
    	# untar
    	tar -zxf `basename http://geni-images.renci.org/images/tutorials/GENI-hadoop/hadoop_config_dynamic.sh.tgz`
    	eval "/bin/sh -c \"/home/hadoop/hadoop_config_dynamic.sh 2\""
    
  3. Observe the contents of the script that was installed and executed on the VM.
    # sudo cat /home/hadoop/hadoop_config_dynamic.sh 
    
    #!/bin/bash
    
    HADOOP_LOG='/home/hadoop/hadoop_boot.log'
    
    echo "Hello from neuca script" > $HADOOP_LOG
    
    ....
    
  4. Test for connectivity between the VMs.
    # ping worker-0
    PING worker-0 (172.16.1.10) 56(84) bytes of data.
    64 bytes from worker-0 (172.16.1.10): icmp_seq=1 ttl=64 time=1.09 ms
    64 bytes from worker-0 (172.16.1.10): icmp_seq=2 ttl=64 time=0.772 ms
    64 bytes from worker-0 (172.16.1.10): icmp_seq=3 ttl=64 time=0.888 ms
    
    # ping worker-1
    PING worker-1 (172.16.1.11) 56(84) bytes of data.
    ...
    etc 
    

1.3. Configure the Hadoop filesystem

  1. Switch to the ‘hadoop’ user.
    sudo su - hadoop
    
  2. Format the Hadoop filesystem:
    hdfs namenode -format
    
    If all goes well, you should see output that ends with:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at master/172.16.1.1
    ************************************************************/
    
  3. Start the Hadoop services (you can verify that the daemons started with jps; see the note after this list):
    start-dfs.sh

    start-yarn.sh

  4. Check the status of the Hadoop filesystem.
    # hdfs dfsadmin -report
    Configured Capacity: 54824083456 (51.06 GB)
    Present Capacity: 48522035200 (45.19 GB)
    DFS Remaining: 48521986048 (45.19 GB)
    DFS Used: 49152 (48 KB)
    DFS Used%: 0.00%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    
    -------------------------------------------------
    Live datanodes (2):
    
    Name: 172.16.1.10:50010 (worker-0)
    Hostname: worker-0
    Decommission Status : Normal
    Configured Capacity: 27412041728 (25.53 GB)
    DFS Used: 24576 (24 KB)
    Non DFS Used: 3151020032 (2.93 GB)
    DFS Remaining: 24260997120 (22.59 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 88.50%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Thu Sep 17 12:04:32 UTC 2015
    
    
    Name: 172.16.1.11:50010 (worker-1)
    Hostname: worker-1
    Decommission Status : Normal
    Configured Capacity: 27412041728 (25.53 GB)
    DFS Used: 24576 (24 KB)
    Non DFS Used: 3151028224 (2.93 GB)
    DFS Remaining: 24260988928 (22.59 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 88.50%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Thu Sep 17 12:04:32 UTC 2015
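
After starting the services (step 3), you can also confirm that the Hadoop daemons are running with jps, a quick sanity check. The exact process list depends on the Hadoop version, but you should typically see NameNode and ResourceManager on the master, and DataNode and NodeManager on the workers (passwordless ssh for the hadoop user, which start-dfs.sh already relies on, lets you check a worker from the master):

    # jps
    # ssh worker-0 jps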
    

2. Run the experiment

2.1. Test the Hadoop cluster with a small file

  1. Create a small test file
    # echo Hello GENI World > /tmp/hello.txt
    
  2. Push the file into the Hadoop filesystem
    # hdfs dfs -put  /tmp/hello.txt /hello.txt
    
  3. Check for the file's existence
    # hdfs dfs -ls /
    Found 1 items
    -rw-r--r--   2 hadoop supergroup         17 2015-09-17 12:09 /hello.txt
    
  4. Check the contents of the file
    # hdfs dfs -cat /hello.txt
    Hello GENI World
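
Optionally, remove the test file when you are finished with it (a standard HDFS delete; /hello.txt is the file created above):

    # hdfs dfs -rm /hello.txt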
    

2.2. Run the Hadoop Sort Test Case

Test the true power of the Hadoop filesystem by creating and sorting a large random dataset. It may be useful/interesting to log in to the master and/or worker VMs and use tools like top, iotop, and iftop to observe the resource utilization on each of the VMs during the sort test (one way to do this is sketched below). Note: on these VMs, iotop and iftop must be run as root.
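
For example, from a second ssh session on the master or a worker, you can watch CPU/memory with top, per-process disk I/O with iotop, and dataplane traffic with iftop (a minimal sketch; the dataplane interface name eth0 is taken from the ifconfig output above and may differ on your VMs):

    # top
    # sudo iotop
    # sudo iftop -i eth0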

  1. Create a 1 GB random data set
    #  hadoop jar \
    /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
    teragen 10000000 /input
    14/01/05 18:47:58 INFO mapred.JobClient: Running job: job_201401051828_0003
    14/01/05 18:47:59 INFO mapred.JobClient:  map 0% reduce 0%
    14/01/05 18:48:14 INFO mapred.JobClient:  map 35% reduce 0%
    14/01/05 18:48:17 INFO mapred.JobClient:  map 57% reduce 0%
    14/01/05 18:48:20 INFO mapred.JobClient:  map 80% reduce 0%
    14/01/05 18:48:26 INFO mapred.JobClient:  map 100% reduce 0%
    14/01/05 18:48:28 INFO mapred.JobClient: Job complete: job_201401051828_0003
    14/01/05 18:48:28 INFO mapred.JobClient: Counters: 6
    14/01/05 18:48:28 INFO mapred.JobClient:   Job Counters 
    14/01/05 18:48:28 INFO mapred.JobClient:     Launched map tasks=2
    14/01/05 18:48:28 INFO mapred.JobClient:   FileSystemCounters
    14/01/05 18:48:28 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
    14/01/05 18:48:28 INFO mapred.JobClient:   Map-Reduce Framework
    14/01/05 18:48:28 INFO mapred.JobClient:     Map input records=10000000
    14/01/05 18:48:28 INFO mapred.JobClient:     Spilled Records=0
    14/01/05 18:48:28 INFO mapred.JobClient:     Map input bytes=10000000
    14/01/05 18:48:28 INFO mapred.JobClient:     Map output records=10000000
    
    After the data is created, use the ls functionality (hdfs dfs -ls /input, as in step 3 below) to confirm that the data exists. Note that the data is composed of several files in a directory. Since teragen writes 100-byte rows, the 10,000,000 rows generated here amount to roughly 1 GB (matching the HDFS_BYTES_WRITTEN counter above).
  2. Sort the dataset:
    # hadoop jar \
    /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
    terasort  /input /output
    
    14/01/05 18:50:49 INFO terasort.TeraSort: starting
    14/01/05 18:50:49 INFO mapred.FileInputFormat: Total input paths to process : 2
    14/01/05 18:50:50 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/01/05 18:50:50 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
    14/01/05 18:50:50 INFO compress.CodecPool: Got brand-new compressor
    Making 1 from 100000 records
    Step size is 100000.0
    14/01/05 18:50:50 INFO mapred.JobClient: Running job: job_201401051828_0004
    14/01/05 18:50:51 INFO mapred.JobClient:  map 0% reduce 0%
    14/01/05 18:51:05 INFO mapred.JobClient:  map 6% reduce 0%
    14/01/05 18:51:08 INFO mapred.JobClient:  map 20% reduce 0%
    14/01/05 18:51:11 INFO mapred.JobClient:  map 33% reduce 0%
    14/01/05 18:51:14 INFO mapred.JobClient:  map 37% reduce 0%
    14/01/05 18:51:29 INFO mapred.JobClient:  map 55% reduce 0%
    14/01/05 18:51:32 INFO mapred.JobClient:  map 65% reduce 6%
    14/01/05 18:51:35 INFO mapred.JobClient:  map 71% reduce 6%
    14/01/05 18:51:38 INFO mapred.JobClient:  map 72% reduce 8%
    14/01/05 18:51:44 INFO mapred.JobClient:  map 74% reduce 8%
    14/01/05 18:51:47 INFO mapred.JobClient:  map 74% reduce 10%
    14/01/05 18:51:50 INFO mapred.JobClient:  map 87% reduce 12%
    14/01/05 18:51:53 INFO mapred.JobClient:  map 92% reduce 12%
    14/01/05 18:51:56 INFO mapred.JobClient:  map 93% reduce 12%
    14/01/05 18:52:02 INFO mapred.JobClient:  map 100% reduce 14%
    14/01/05 18:52:05 INFO mapred.JobClient:  map 100% reduce 22%
    14/01/05 18:52:08 INFO mapred.JobClient:  map 100% reduce 29%
    14/01/05 18:52:14 INFO mapred.JobClient:  map 100% reduce 33%
    14/01/05 18:52:23 INFO mapred.JobClient:  map 100% reduce 67%
    14/01/05 18:52:26 INFO mapred.JobClient:  map 100% reduce 70%
    14/01/05 18:52:29 INFO mapred.JobClient:  map 100% reduce 75%
    14/01/05 18:52:32 INFO mapred.JobClient:  map 100% reduce 80%
    14/01/05 18:52:35 INFO mapred.JobClient:  map 100% reduce 85%
    14/01/05 18:52:38 INFO mapred.JobClient:  map 100% reduce 90%
    14/01/05 18:52:46 INFO mapred.JobClient:  map 100% reduce 100%
    14/01/05 18:52:48 INFO mapred.JobClient: Job complete: job_201401051828_0004
    14/01/05 18:52:48 INFO mapred.JobClient: Counters: 18
    14/01/05 18:52:48 INFO mapred.JobClient:   Job Counters 
    14/01/05 18:52:48 INFO mapred.JobClient:     Launched reduce tasks=1
    14/01/05 18:52:48 INFO mapred.JobClient:     Launched map tasks=16
    14/01/05 18:52:48 INFO mapred.JobClient:     Data-local map tasks=16
    14/01/05 18:52:48 INFO mapred.JobClient:   FileSystemCounters
    14/01/05 18:52:48 INFO mapred.JobClient:     FILE_BYTES_READ=2382257412
    14/01/05 18:52:48 INFO mapred.JobClient:     HDFS_BYTES_READ=1000057358
    14/01/05 18:52:48 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3402255956
    14/01/05 18:52:48 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
    14/01/05 18:52:48 INFO mapred.JobClient:   Map-Reduce Framework
    14/01/05 18:52:48 INFO mapred.JobClient:     Reduce input groups=10000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Combine output records=0
    14/01/05 18:52:48 INFO mapred.JobClient:     Map input records=10000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Reduce shuffle bytes=951549012
    14/01/05 18:52:48 INFO mapred.JobClient:     Reduce output records=10000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Spilled Records=33355441
    14/01/05 18:52:48 INFO mapred.JobClient:     Map output bytes=1000000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Map input bytes=1000000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Combine input records=0
    14/01/05 18:52:48 INFO mapred.JobClient:     Map output records=10000000
    14/01/05 18:52:48 INFO mapred.JobClient:     Reduce input records=10000000
    14/01/05 18:52:48 INFO terasort.TeraSort: done
    
  3. Look at the output: You can use Hadoop's cat and/or get functionality to look at the random and sorted files, to confirm their sizes and that the sort actually worked. Try some or all of these commands. Does the output make sense to you?
    hdfs dfs -ls /input/
    hdfs dfs -ls /output/
    hdfs dfs -cat /input/part-m-00000 | less
    hdfs dfs -cat /output/part-r-00000 | less
    
  4. Use a hex dump to inspect the sorted file. Because the files are binary, it is hard to read the sorted output as ASCII. Copy the file out of HDFS and view it with xxd:
    hdfs dfs -get /output/part-r-00000 /tmp/part-r-00000
    
    xxd -c 100 /tmp/part-r-00000 | less
    

The output should look something like the following:

0000000: 0000 02a0 e217 d738 d776 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3833 4639 3045 8899 aabb 3939 3939 3333 3333 4545 4545 3838 3838 3333 3333 3131 3131 4242 4242 3939 3939 3737 3737 3737 3737 4141 4141 3333 3333 ccdd eeff  .......8.v..0000000000000000000000000083F90E....99993333EEEE888833331111BBBB999977777777AAAA3333....
0000064: 0000 0336 98a4 0790 b743 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3038 3937 4232 8899 aabb 3333 3333 3131 3131 3737 3737 3434 3434 3636 3636 4141 4141 4242 4242 3939 3939 4545 4545 3131 3131 4545 4545 4646 4646 ccdd eeff  ...6.....C..000000000000000000000000000897B2....33331111777744446666AAAABBBB9999EEEE1111EEEEFFFF....
00000c8: 0000 04c6 251d e52d 9124 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3842 3330 4438 8899 aabb 4444 4444 3737 3737 4343 4343 3838 3838 3131 3131 3131 3131 4343 4343 4646 4646 3131 3131 3030 3030 3434 3434 3939 3939 ccdd eeff  ....%..-.$..000000000000000000000000008B30D8....DDDD7777CCCC888811111111CCCCFFFF1111000044449999....
000012c: 0000 0630 4115 f4b5 416b 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3745 3534 3439 8899 aabb 3333 3333 4141 4141 4444 4444 4646 4646 4545 4545 3737 3737 3333 3333 3434 3434 3333 3333 4646 4646 4242 4242 4545 4545 ccdd eeff  ...0A...Ak..000000000000000000000000007E5449....3333AAAADDDDFFFFEEEE7777333344443333FFFFBBBBEEEE....
0000190: 0000 06d8 f96c 785c 5054 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3238 3743 4544 8899 aabb 4242 4242 3131 3131 4545 4545 3838 3838 3838 3838 4141 4141 4343 4343 3636 3636 3737 3737 4444 4444 3535 3535 4141 4141 ccdd eeff  .....lx\PT..00000000000000000000000000287CED....BBBB1111EEEE88888888AAAACCCC66667777DDDD5555AAAA....
00001f4: 0000 0768 67de 847c baec 0011 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3030 3443 3032 4231 8899 aabb 4242 4242 3636 3636 3030 3030 4646 4646 3737 3737 4545 4545 3939 3939 4444 4444 3838 3838 3535 3535 3131 3131 3636 3636 ccdd eeff  ...hg..|....000000000000000000000000004C02B1....BBBB66660000FFFF7777EEEE9999DDDD8888555511116666....
0000258

3. Advanced Example

Re-do the tutorial with a different number of workers, amount of bandwidth, and/or worker instance types. Warning: be courteous to other users and do not use too many of the resources.

  1. Time the performance of runs with different resources (one way to time a run is sketched below).
  2. Observe the largest file you can create with different resources.
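
One way to time a sort run, assuming the same example jar path used in section 2.2 (the /input-2 and /output-2 HDFS paths are placeholders; any unused paths will do):

    # hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar teragen 10000000 /input-2
    # time hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar terasort /input-2 /output-2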

Introduction | Next: Teardown Experiment