ExecuteExperiment

Timestamp: 09/17/15 09:20:11
Author: nriga@bbn.com
Comment: --

GENIExperimenter/Tutorials/jacks/HadoopInASlice/ExecuteExperiment

-                      v6
+                      v7
  a. Create a 1 GB random data set
  {{{
+#  hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar teragen 10000000 /input
+#  hadoop jar \
+/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
+teragen 10000000 /input
 /01/05 18:47:58 INFO mapred.JobClient: Running job: job_201401051828_0003
 /01/05 18:47:59 INFO mapred.JobClient:  map 0% reduce 0%
 …
  a. Sort the dataset:
  {{{
+# hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar terasort /input /output
+# hadoop jar \
+/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
+terasort  /input /output
 /01/05 18:50:49 INFO terasort.TeraSort: starting
 /01/05 18:50:49 INFO mapred.FileInputFormat: Total input paths to process : 2
 …
  Try some or all of these commands.  Does the output make sense to you?
  {{{
-hadoop fs -ls random.data.1G
-hadoop fs -ls sorted.data.1G
-hadoop fs -cat random.data.1G/part-00000 | less
-hadoop fs -cat sorted.data.1G/part-00000 | less
-}}}
+hdfs dfs -ls /input/
+hdfs dfs -ls /output/
+hdfs dfs -cat /input/part-m-00000 | less
+hdfs dfs -cat /output/part-r-00000 | less
+}}}
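When reading the `-ls` output above, it can help to know roughly how large the input should be. The sketch below assumes TeraGen's fixed row size of 100 bytes; `expected_bytes` is a hypothetical helper name used only for illustration.

```shell
# TeraGen writes fixed-size rows; 100 bytes per row is assumed here.
ROW_BYTES=100

# expected_bytes ROWS: print the total data size in bytes for ROWS rows.
expected_bytes() {
  echo $(( $1 * ROW_BYTES ))
}

# 10000000 is the row count passed to teragen in the step above.
expected_bytes 10000000
```

The result (about 1 GB) can be compared with the total that `hdfs dfs -du -s /input` reports; the exact columns of that report vary between Hadoop versions.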
+ a. Use hexdump to view the sorted file. Because the files are binary, the sorted output is hard to read as ASCII text:
+ {{{
+hdfs dfs -get /output/part-r-00000 /tmp/part-r-00000
+}}}
+ {{{
+hexdump /tmp/part-r-00000 | less
+}}}
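Scanning a raw hexdump for order by eye is tedious, so one possible shortcut is to print only the sort keys. The sketch below assumes the TeraGen record layout of fixed 100-byte rows whose first 10 bytes are the key; `show_keys` is a hypothetical helper, not a Hadoop command, and it relies on `head -c`, which is available on common Linux systems.

```shell
# show_keys FILE COUNT: print the first 10 bytes (assumed to be the key)
# of each of the first COUNT 100-byte records, as one hex string per line.
show_keys() {
  file=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # Read record number i, keep its first 10 bytes, and hex-encode them.
    dd if="$file" bs=100 skip="$i" count=1 2>/dev/null \
      | head -c 10 | od -An -v -tx1 | tr -d ' \n'
    echo
    i=$((i + 1))
  done
}
```

For example, `show_keys /tmp/part-r-00000 1000 | LC_ALL=C sort -c && echo sorted` checks the first 1000 keys of the downloaded file. Fixed-length lowercase hex strings compare in the same order as the underlying bytes, so `sort -c` stays silent only when the keys are already in order. This covers a single part file, not the ordering across part files.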
 == 3.   Advanced Example ==