Changes between Version 3 and Version 4 of GENIExperimenter/Tutorials/GettingStarted_PartII_Hadoop/Procedure/Execute


Timestamp: 03/07/14 14:48:10
Author: sedwards@bbn.com
…already started, and the Apache webserver has started on the server and measurements are now being collected.

'''Removed in version 4:'''

{{{
#!html
<table>
<tr><td>
<ol type='a'>
<li>
Enter the hostname of the server node in your browser.  It should bring up a webpage of statistics from your experiment similar to that shown in the figure.
</li>
</ol>
</td>
<td>
<img
  src="http://groups.geni.net/geni/attachment/wiki/GENIEducation/SampleAssignments/UnderstandAMAPI/Graphics/webserver_v1.png?format=raw"
  alt="Enter the hostname in your browser to see statistics" width="300"
  title="Enter the hostname in your browser to see statistics" />
<br/>
<b>Figure 5-1</b> Enter the hostname of the <i>server</i> node in your browser to see statistics
</td>
</tr>
<tr>
<td>
<ol type='a' start='2'>
  <li>Find the <code>ssh</code> commands for your client in your <code>readyToLogin</code> output.  Copy and paste those <code>ssh</code> commands directly into your terminal
to log in.  You should see a shell prompt
from the remote end:
<pre>
     [example@client ~]$
</pre>

<table id="Table_01" border="0" cellpadding="5" cellspacing="0">
  <tr>
    <td>
      <img src="http://trac.gpolab.bbn.com/gcf/raw-attachment/wiki/Graphics/4NotesIcon_512x512.png" width="50" height="50" alt="Note">
    </td>
    <td>
While you're welcome to inspect either
the client or server, for the purpose of this experiment the <tt>client</tt> host is the one
running the <tt>iperf</tt> tests and collecting all the logs.
    </td>
  </tr>
</table>
</li>
<li>
Look in the <code>/local</code> directory to see the install scripts and other files from the archive you specified in your RSpec:
<pre>
cd /local
ls
</pre>
</li>
<li>
Look for the <code>iperf</code> and <code>wget</code> processes started by your install scripts:
<pre>
ps ax
</pre>

<table id="Table_03" border="0" cellpadding="5" cellspacing="0">
  <tr>
    <td>
      <img src="http://groups.geni.net/geni/attachment/wiki/GENIExperimenter/Tutorials/Graphics/Symbols-Tips-icon.png?format=raw" width="50" height="50" alt="Tip">
    </td>
    <td>
If you do not see the proper files and processes, please double-check the <code>RSpec</code> you used in the previous step.
    </td>
  </tr>
</table>

</li>
</ol>
</td>
</tr>
</table>
}}}


The client machine is saving all the test results in the `/tmp/iperf-logs`
directory.  Files with timestamps in the names will gradually appear
there (there are 100 tests overall, and it may take 20 minutes for all
of them to complete if you want to wait for them).
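
To watch the logs accumulate, list the directory sorted by modification time; a minimal sketch:

{{{
# Sketch: show the newest iperf logs first; re-run to watch tests finish.
ls -lt /tmp/iperf-logs | head
}}}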

Each log file corresponds to one test with some number of simultaneous
TCP connections over the VLAN link you requested between the two hosts.
Later tests gradually include more concurrent connections, so the
throughput of each individual connection will decrease, but the
aggregate throughput (the `[SUM]` line at the end of each file)
should remain approximately consistent.
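
To compare the aggregate numbers across tests, you can pull the `[SUM]` lines out of every log at once; a minimal sketch, assuming the multi-connection iperf output format described above:

{{{
# Sketch: collect the aggregate-throughput line from each log file.
grep '\[SUM\]' /tmp/iperf-logs/*
}}}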

'''Added in version 4:'''

=== A. Observe the properties of the network interfaces ===

{{{
# /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr fa:16:3e:72:ad:a6
          inet addr:10.103.0.20  Bcast:10.103.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe72:ada6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1982 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1246 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:301066 (294.0 KiB)  TX bytes:140433 (137.1 KiB)
          Interrupt:11 Base address:0x2000

eth1      Link encap:Ethernet  HWaddr fe:16:3e:00:6d:af
          inet addr:172.16.1.1  Bcast:172.16.1.255  Mask:255.255.255.0
          inet6 addr: fe80::fc16:3eff:fe00:6daf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:21704 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4562 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3100262 (2.9 MiB)  TX bytes:824572 (805.2 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:19394 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19394 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4010954 (3.8 MiB)  TX bytes:4010954 (3.8 MiB)
}}}

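If you only want the experiment (dataplane) address, you can filter the same output; a sketch, assuming the dataplane interface is eth1 as in the output above:

{{{
# Sketch: show just the IPv4 address of the dataplane interface.
/sbin/ifconfig eth1 | grep 'inet addr'
}}}
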
=== B. Observe the contents of the NEuca user data file ===

This file includes a boot script that downloads and executes the script you configured for the VM.
{{{
# neuca-user-data
[global]
actor_id=67C4EFB4-7CBF-48C9-8195-934FF81434DC
slice_id=39672f6e-610a-4d86-8810-30e02d20cc99
reservation_id=55676541-5221-483d-bb60-429de025f275
unit_id=902709a4-32f2-41fc-b85c-b4791c779580
;router= Not Specified
;iscsi_initiator_iqn= Not Specified
slice_name=urn:publicid:IDN+ch.geni.net:ADAMANT+slice+pruth-winter-camp
unit_url=http://geni-orca.renci.org/owl/8210b4d7-4afc-4838-801f-c20a8f1f75ae#hadoop-master
host_name=hadoop-master
[interfaces]
fe163e006daf=up:ipv4:172.16.1.1/24
[storage]
[routes]
[scripts]
bootscript=#!/bin/bash
        # Automatically generated boot script
        # wget or curl must be installed on the image
        mkdir -p /tmp
        cd /tmp
        if [ -x `which wget 2>/dev/null` ]; then
          wget -q -O `basename http://geni-images.renci.org/images/GENIWinterCamp/master.sh` http://geni-images.renci.org/images/GENIWinterCamp/master.sh
        else if [ -x `which curl 2>/dev/null` ]; then
          curl http://geni-images.renci.org/images/GENIWinterCamp/master.sh > `basename http://geni-images.renci.org/images/GENIWinterCamp/master.sh`
        fi
        fi
        eval "/bin/sh -c \"chmod +x /tmp/master.sh; /tmp/master.sh\""
}}}
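
The download-then-execute pattern in the boot script can be written more compactly; a minimal sketch under the same assumptions (wget preferred, curl as fallback, same URL as above):

{{{
# Sketch: fetch and run the configured script, preferring wget over curl.
URL=http://geni-images.renci.org/images/GENIWinterCamp/master.sh
cd /tmp
if which wget >/dev/null 2>&1; then
    wget -q -O master.sh "$URL"
else
    curl -s -o master.sh "$URL"
fi
chmod +x /tmp/master.sh && /tmp/master.sh
}}}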


=== C. Observe the contents of the script that was installed and executed on the VM ===
{{{
# cat /tmp/master.sh
#!/bin/bash

echo "Hello from neuca script" > /home/hadoop/log
MY_HOSTNAME=hadoop-master
hostname $MY_HOSTNAME
echo 172.16.1.1  hadoop-master  >> /etc/hosts
echo 172.16.1.10 hadoop-worker-0 >> /etc/hosts
echo 172.16.1.11 hadoop-worker-1 >> /etc/hosts
echo 172.16.1.12 hadoop-worker-2 >> /etc/hosts
echo 172.16.1.13 hadoop-worker-3 >> /etc/hosts
echo 172.16.1.14 hadoop-worker-4 >> /etc/hosts
echo 172.16.1.15 hadoop-worker-5 >> /etc/hosts
echo 172.16.1.16 hadoop-worker-6 >> /etc/hosts
echo 172.16.1.17 hadoop-worker-7 >> /etc/hosts
echo 172.16.1.18 hadoop-worker-8 >> /etc/hosts
echo 172.16.1.19 hadoop-worker-9 >> /etc/hosts
echo 172.16.1.20 hadoop-worker-10 >> /etc/hosts
echo 172.16.1.21 hadoop-worker-11 >> /etc/hosts
echo 172.16.1.22 hadoop-worker-12 >> /etc/hosts
echo 172.16.1.23 hadoop-worker-13 >> /etc/hosts
echo 172.16.1.24 hadoop-worker-14 >> /etc/hosts
echo 172.16.1.25 hadoop-worker-15 >> /etc/hosts
while true; do
    PING=`ping -c 1 172.16.1.1 > /dev/null 2>&1`
    if [ "$?" = "0" ]; then
        break
    fi
    sleep 5
done
echo '/home/hadoop/hadoop-euca-init.sh 172.16.1.1 -master' >> /home/hadoop/log
/home/hadoop/hadoop-euca-init.sh 172.16.1.1 -master
echo "Done starting daemons" >> /home/hadoop/log
}}}
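
The sixteen repeated echo lines in the script could equally be generated with a loop; a sketch that produces the same /etc/hosts entries (addresses as in master.sh above):

{{{
# Sketch: same hosts-file setup as master.sh, written as a loop.
echo "172.16.1.1  hadoop-master" >> /etc/hosts
for i in $(seq 0 15); do
    echo "172.16.1.$((10 + i)) hadoop-worker-$i" >> /etc/hosts
done
}}}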


=== D. Test for connectivity between the VMs. ===

{{{
# ping hadoop-worker-0
PING hadoop-worker-0 (172.16.1.10) 56(84) bytes of data.
64 bytes from hadoop-worker-0 (172.16.1.10): icmp_req=1 ttl=64 time=0.747 ms
64 bytes from hadoop-worker-0 (172.16.1.10): icmp_req=2 ttl=64 time=0.459 ms
64 bytes from hadoop-worker-0 (172.16.1.10): icmp_req=3 ttl=64 time=0.411 ms
^C
--- hadoop-worker-0 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.411/0.539/0.747/0.148 ms
# ping hadoop-worker-1
PING hadoop-worker-1 (172.16.1.11) 56(84) bytes of data.
64 bytes from hadoop-worker-1 (172.16.1.11): icmp_req=1 ttl=64 time=0.852 ms
64 bytes from hadoop-worker-1 (172.16.1.11): icmp_req=2 ttl=64 time=0.468 ms
64 bytes from hadoop-worker-1 (172.16.1.11): icmp_req=3 ttl=64 time=0.502 ms
^C
--- hadoop-worker-1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.468/0.607/0.852/0.174 ms
}}}
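
Rather than pinging workers one at a time, a short loop can check them all; a sketch assuming the worker names from master.sh (only the workers you actually provisioned will respond):

{{{
# Sketch: one ping per worker; report any that are unreachable.
for i in $(seq 0 15); do
    ping -c 1 -W 2 hadoop-worker-$i > /dev/null 2>&1 || echo "hadoop-worker-$i unreachable"
done
}}}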

== 3. Check the status of the Hadoop filesystem. ==

=== A. Query for the status of the filesystem and its associated workers. ===

{{{
# hadoop dfsadmin -report
Configured Capacity: 54958481408 (51.18 GB)
Present Capacity: 48681934878 (45.34 GB)
DFS Remaining: 48681885696 (45.34 GB)
DFS Used: 49182 (48.03 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 2 (2 total, 0 dead)

Name: 172.16.1.11:50010
Rack: /default/rack0
Decommission Status : Normal
Configured Capacity: 27479240704 (25.59 GB)
DFS Used: 24591 (24.01 KB)
Non DFS Used: 3137957873 (2.92 GB)
DFS Remaining: 24341258240(22.67 GB)
DFS Used%: 0%
DFS Remaining%: 88.58%
Last contact: Sat Jan 04 21:49:32 UTC 2014


Name: 172.16.1.10:50010
Rack: /default/rack0
Decommission Status : Normal
Configured Capacity: 27479240704 (25.59 GB)
DFS Used: 24591 (24.01 KB)
Non DFS Used: 3138588657 (2.92 GB)
DFS Remaining: 24340627456(22.67 GB)
DFS Used%: 0%
DFS Remaining%: 88.58%
Last contact: Sat Jan 04 21:49:33 UTC 2014
}}}
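
If a worker you expect is missing, filtering the report down to its summary lines makes the comparison quick; a minimal sketch:

{{{
# Sketch: show only datanode liveness and identity lines from the report.
hadoop dfsadmin -report | grep -E 'Datanodes available|Name:'
}}}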



== 4. Test the filesystem with a small file ==

=== A. Create a small test file ===
{{{
# echo Hello GENI World > hello.txt
}}}

=== B. Push the file into the Hadoop filesystem ===
{{{
# hadoop fs -put hello.txt hello.txt
}}}

=== C. Check for the file's existence ===
{{{
# hadoop fs -ls
Found 1 items
-rw-r--r--   3 root supergroup         12 2014-01-04 21:59 /user/root/hello.txt
}}}

=== D. Check the contents of the file ===
{{{
# hadoop fs -cat hello.txt
Hello GENI World
}}}
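
As an extra check, you can round-trip the file out of HDFS and compare it with the original; a sketch (the local copy's name is arbitrary):

{{{
# Sketch: pull the file back out of HDFS and confirm the copies match.
hadoop fs -get hello.txt hello-from-hdfs.txt
diff hello.txt hello-from-hdfs.txt && echo "copies match"
}}}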

== 5. Run the Hadoop Sort Testcase ==

Test the true power of the Hadoop filesystem by creating and sorting a large random dataset.  It may be useful and interesting to log in to the master and/or worker VMs and use tools like top, iotop, and iftop to observe the resource utilization on each of the VMs during the sort test.  Note: on these VMs, iotop and iftop must be run as root.

=== A. Create a 1 GB random data set. ===

After the data is created, use the `ls` functionality to confirm the data exists; see the sketch after the output below.  Note that the data is composed of several files in a directory.

{{{
# hadoop jar /usr/local/hadoop-0.20.2/hadoop-0.20.2-examples.jar teragen 10000000 random.data.1G
Generating 10000000 using 2 maps with step of 5000000
14/01/05 18:47:58 INFO mapred.JobClient: Running job: job_201401051828_0003
14/01/05 18:47:59 INFO mapred.JobClient:  map 0% reduce 0%
14/01/05 18:48:14 INFO mapred.JobClient:  map 35% reduce 0%
14/01/05 18:48:17 INFO mapred.JobClient:  map 57% reduce 0%
14/01/05 18:48:20 INFO mapred.JobClient:  map 80% reduce 0%
14/01/05 18:48:26 INFO mapred.JobClient:  map 100% reduce 0%
14/01/05 18:48:28 INFO mapred.JobClient: Job complete: job_201401051828_0003
14/01/05 18:48:28 INFO mapred.JobClient: Counters: 6
14/01/05 18:48:28 INFO mapred.JobClient:   Job Counters
14/01/05 18:48:28 INFO mapred.JobClient:     Launched map tasks=2
14/01/05 18:48:28 INFO mapred.JobClient:   FileSystemCounters
14/01/05 18:48:28 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
14/01/05 18:48:28 INFO mapred.JobClient:   Map-Reduce Framework
14/01/05 18:48:28 INFO mapred.JobClient:     Map input records=10000000
14/01/05 18:48:28 INFO mapred.JobClient:     Spilled Records=0
14/01/05 18:48:28 INFO mapred.JobClient:     Map input bytes=10000000
14/01/05 18:48:28 INFO mapred.JobClient:     Map output records=10000000
}}}
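
For example, to confirm the dataset exists and see its constituent part files (teragen writes a directory, not a single file); a sketch:

{{{
# Sketch: list the generated directory and report its total size in bytes.
hadoop fs -ls random.data.1G
hadoop fs -dus random.data.1G
}}}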

=== B. Sort the dataset. ===

Note: you can use Hadoop's `cat` and/or `get` functionality to look at the random and sorted files to confirm their size and that the sort actually worked; a sketch follows the output below.

{{{
# hadoop jar /usr/local/hadoop-0.20.2/hadoop-0.20.2-examples.jar terasort random.data.1G sorted.data.1G
14/01/05 18:50:49 INFO terasort.TeraSort: starting
14/01/05 18:50:49 INFO mapred.FileInputFormat: Total input paths to process : 2
14/01/05 18:50:50 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/01/05 18:50:50 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
14/01/05 18:50:50 INFO compress.CodecPool: Got brand-new compressor
Making 1 from 100000 records
Step size is 100000.0
14/01/05 18:50:50 INFO mapred.JobClient: Running job: job_201401051828_0004
14/01/05 18:50:51 INFO mapred.JobClient:  map 0% reduce 0%
14/01/05 18:51:05 INFO mapred.JobClient:  map 6% reduce 0%
14/01/05 18:51:08 INFO mapred.JobClient:  map 20% reduce 0%
14/01/05 18:51:11 INFO mapred.JobClient:  map 33% reduce 0%
14/01/05 18:51:14 INFO mapred.JobClient:  map 37% reduce 0%
14/01/05 18:51:29 INFO mapred.JobClient:  map 55% reduce 0%
14/01/05 18:51:32 INFO mapred.JobClient:  map 65% reduce 6%
14/01/05 18:51:35 INFO mapred.JobClient:  map 71% reduce 6%
14/01/05 18:51:38 INFO mapred.JobClient:  map 72% reduce 8%
14/01/05 18:51:44 INFO mapred.JobClient:  map 74% reduce 8%
14/01/05 18:51:47 INFO mapred.JobClient:  map 74% reduce 10%
14/01/05 18:51:50 INFO mapred.JobClient:  map 87% reduce 12%
14/01/05 18:51:53 INFO mapred.JobClient:  map 92% reduce 12%
14/01/05 18:51:56 INFO mapred.JobClient:  map 93% reduce 12%
14/01/05 18:52:02 INFO mapred.JobClient:  map 100% reduce 14%
14/01/05 18:52:05 INFO mapred.JobClient:  map 100% reduce 22%
14/01/05 18:52:08 INFO mapred.JobClient:  map 100% reduce 29%
14/01/05 18:52:14 INFO mapred.JobClient:  map 100% reduce 33%
14/01/05 18:52:23 INFO mapred.JobClient:  map 100% reduce 67%
14/01/05 18:52:26 INFO mapred.JobClient:  map 100% reduce 70%
14/01/05 18:52:29 INFO mapred.JobClient:  map 100% reduce 75%
14/01/05 18:52:32 INFO mapred.JobClient:  map 100% reduce 80%
14/01/05 18:52:35 INFO mapred.JobClient:  map 100% reduce 85%
14/01/05 18:52:38 INFO mapred.JobClient:  map 100% reduce 90%
14/01/05 18:52:46 INFO mapred.JobClient:  map 100% reduce 100%
14/01/05 18:52:48 INFO mapred.JobClient: Job complete: job_201401051828_0004
14/01/05 18:52:48 INFO mapred.JobClient: Counters: 18
14/01/05 18:52:48 INFO mapred.JobClient:   Job Counters
14/01/05 18:52:48 INFO mapred.JobClient:     Launched reduce tasks=1
14/01/05 18:52:48 INFO mapred.JobClient:     Launched map tasks=16
14/01/05 18:52:48 INFO mapred.JobClient:     Data-local map tasks=16
14/01/05 18:52:48 INFO mapred.JobClient:   FileSystemCounters
14/01/05 18:52:48 INFO mapred.JobClient:     FILE_BYTES_READ=2382257412
14/01/05 18:52:48 INFO mapred.JobClient:     HDFS_BYTES_READ=1000057358
14/01/05 18:52:48 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3402255956
14/01/05 18:52:48 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
14/01/05 18:52:48 INFO mapred.JobClient:   Map-Reduce Framework
14/01/05 18:52:48 INFO mapred.JobClient:     Reduce input groups=10000000
14/01/05 18:52:48 INFO mapred.JobClient:     Combine output records=0
14/01/05 18:52:48 INFO mapred.JobClient:     Map input records=10000000
14/01/05 18:52:48 INFO mapred.JobClient:     Reduce shuffle bytes=951549012
14/01/05 18:52:48 INFO mapred.JobClient:     Reduce output records=10000000
14/01/05 18:52:48 INFO mapred.JobClient:     Spilled Records=33355441
14/01/05 18:52:48 INFO mapred.JobClient:     Map output bytes=1000000000
14/01/05 18:52:48 INFO mapred.JobClient:     Map input bytes=1000000000
14/01/05 18:52:48 INFO mapred.JobClient:     Combine input records=0
14/01/05 18:52:48 INFO mapred.JobClient:     Map output records=10000000
14/01/05 18:52:48 INFO mapred.JobClient:     Reduce input records=10000000
14/01/05 18:52:48 INFO terasort.TeraSort: done
}}}
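
One way to do that inspection; a sketch (with a single reducer the sorted output lands in one part file, so the part-00000 name is an assumption based on standard Hadoop output naming):

{{{
# Sketch: check output size and peek at the first few sorted records.
hadoop fs -ls sorted.data.1G
hadoop fs -cat sorted.data.1G/part-00000 | head -5
}}}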


== 6. Advanced Example ==

Re-do the tutorial with a different number of workers, amount of bandwidth, and/or worker instance types.  Warning: be courteous to other users and do not use more resources than you need.

=== A. Time the performance of runs with different resources. ===
=== B. Observe the largest file you can create with different resources. ===
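
For part A, the shell's time builtin is sufficient; a sketch using the sort from above (terasort refuses to run if the output directory already exists, so each timed run needs a fresh output name):

{{{
# Sketch: time one sort run so different configurations can be compared.
time hadoop jar /usr/local/hadoop-0.20.2/hadoop-0.20.2-examples.jar \
    terasort random.data.1G sorted.data.1G.run2
}}}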