Changes between Version 6 and Version 7 of GENIExperimenter/Tutorials/jacks/HadoopInASlice/ExecuteExperiment
Timestamp: 09/17/15 09:20:11
--- GENIExperimenter/Tutorials/jacks/HadoopInASlice/ExecuteExperiment (v6)
+++ GENIExperimenter/Tutorials/jacks/HadoopInASlice/ExecuteExperiment (v7)
@@ -235,5 +235,7 @@
 a. Create a 1 GB random data set
 {{{
-# hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar teragen 10000000 /input
+# hadoop jar \
+/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
+teragen 10000000 /input
 14/01/05 18:47:58 INFO mapred.JobClient: Running job: job_201401051828_0003
 14/01/05 18:47:59 INFO mapred.JobClient: map 0% reduce 0%
@@ -257,5 +259,8 @@
 a. Sort the dataset:
 {{{
-# hadoop jar /home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar terasort /input /output
+# hadoop jar \
+/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar \
+terasort /input /output
+
 14/01/05 18:50:49 INFO terasort.TeraSort: starting
 14/01/05 18:50:49 INFO mapred.FileInputFormat: Total input paths to process : 2
@@ -319,9 +324,15 @@
 Try some or all of these commands. Does the output make sense to you?
 {{{
-hadoop fs -ls random.data.1G
-hadoop fs -ls sorted.data.1G
-hadoop fs -cat random.data.1G/part-00000 | less
-hadoop fs -cat sorted.data.1G/part-00000 | less
-}}}
-
+hdfs dfs -ls /input/
+hdfs dfs -ls /output/
+hdfs dfs -cat /input/part-m-00000 | less
+hdfs dfs -cat /output/part-r-00000 | less
+}}}
+a. Use hexdump to see the sorted file. Because the files are binary, it is hard to see the sorted output in ASCII. Use hexdump:
+{{{
+hdfs dfs -get /output/part-r-00000 /tmp/part-r-00000
+}}}
+{{{
+hexdump /tmp/part-r-00000 | less
+}}}
 == 3. Advanced Example ==