
Metaphorical Dream

Hadoop #1

I somehow got it working far enough to run wordcount, lol.

Now, how should I go about creating a 1 GB file?
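A rough sketch of one way to do it, assuming plain sh with cat, mv, and wc available: start from a single line of words and keep doubling the file until it reaches the target size. The function name make_big_file and the path input/big.txt are made up for illustration, and the seed line is just the one from the small test file.

```shell
#!/bin/sh
# Sketch only: make_big_file and its arguments are hypothetical names.
# Builds a text file of at least the requested size by repeated doubling.
make_big_file() {
    out=$1     # destination file
    size=$2    # minimum size in bytes
    line="tempest tempest ludwig ludwig ludwig wolfgang wolfgang wolfgang wolfgang"

    # Seed the file with one line (72 characters plus a newline).
    printf '%s\n' "$line" > "$out"

    # Double the file until it is at least as large as requested.
    # Only whole lines are ever written, so no word gets cut in half.
    while [ "$(( $(wc -c < "$out") ))" -lt "$size" ]; do
        cat "$out" "$out" > "$out.tmp" && mv "$out.tmp" "$out"
    done
}

# Example (not run here): 1 GB is 1073741824 bytes.
# make_big_file input/big.txt 1073741824
```

Because the size doubles each round, the result lands at or above the target rather than exactly on it, and the last round briefly needs room for both the old and the new copy. The file could then go into HDFS with the same `./bin/hadoop dfs -copyFromLocal` command used for the small input.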

It's not that I plan to become a Java engineer,
but is it really that fast? I want to see it with my own eyes.

That said, keeping up this pace starting on a Monday is a bit rough.


[hadoop@CentOS5 hadoop]$ cat > input/file3
tempest tempest ludwig ludwig ludwig wolfgang wolfgang wolfgang wolfgang

[hadoop@CentOS5 hadoop]$ ./bin/hadoop dfs -copyFromLocal input inputs

[hadoop@CentOS5 hadoop]$ ./bin/hadoop jar hadoop-0.20.1-examples.jar wordcount inputs outputs
09/11/17 01:34:39 INFO input.FileInputFormat: Total input paths to process : 1
09/11/17 01:34:39 INFO mapred.JobClient: Running job: job_200911170127_0002
09/11/17 01:34:40 INFO mapred.JobClient: map 0% reduce 0%
09/11/17 01:34:49 INFO mapred.JobClient: map 100% reduce 0%
09/11/17 01:35:01 INFO mapred.JobClient: map 100% reduce 100%
09/11/17 01:35:03 INFO mapred.JobClient: Job complete: job_200911170127_0002
09/11/17 01:35:03 INFO mapred.JobClient: Counters: 17
09/11/17 01:35:03 INFO mapred.JobClient: Job Counters
09/11/17 01:35:03 INFO mapred.JobClient: Launched reduce tasks=1
09/11/17 01:35:03 INFO mapred.JobClient: Launched map tasks=1
09/11/17 01:35:03 INFO mapred.JobClient: Data-local map tasks=1
09/11/17 01:35:03 INFO mapred.JobClient: FileSystemCounters
09/11/17 01:35:03 INFO mapred.JobClient: FILE_BYTES_READ=48
09/11/17 01:35:03 INFO mapred.JobClient: HDFS_BYTES_READ=73
09/11/17 01:35:03 INFO mapred.JobClient: FILE_BYTES_WRITTEN=128
09/11/17 01:35:03 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=30
09/11/17 01:35:03 INFO mapred.JobClient: Map-Reduce Framework
09/11/17 01:35:03 INFO mapred.JobClient: Reduce input groups=0
09/11/17 01:35:03 INFO mapred.JobClient: Combine output records=3
09/11/17 01:35:03 INFO mapred.JobClient: Map input records=1
09/11/17 01:35:03 INFO mapred.JobClient: Reduce shuffle bytes=48
09/11/17 01:35:03 INFO mapred.JobClient: Reduce output records=0
09/11/17 01:35:03 INFO mapred.JobClient: Spilled Records=6
09/11/17 01:35:03 INFO mapred.JobClient: Map output bytes=109
09/11/17 01:35:03 INFO mapred.JobClient: Combine input records=9
09/11/17 01:35:03 INFO mapred.JobClient: Map output records=9
09/11/17 01:35:03 INFO mapred.JobClient: Reduce input records=3

[hadoop@CentOS5 hadoop]$ ./bin/hadoop dfs -ls outputs
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2009-11-17 01:34 /user/hadoop/outputs/_logs
-rw-r--r-- 3 hadoop supergroup 30 2009-11-17 01:34 /user/hadoop/outputs/part-r-00000

[hadoop@CentOS5 hadoop]$ ./bin/hadoop dfs -cat outputs/part-r-00000
ludwig 3
tempest 2
wolfgang 4
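
As a sanity check, the same counts can be reproduced locally without Hadoop. This is only a sketch with standard awk and sort, not what the wordcount example does internally; count_words is a made-up helper name. It splits each line on whitespace, tallies every word, and prints word and count separated by a tab, sorted by word.

```shell
#!/bin/sh
# Sketch only: count_words is a hypothetical helper, not part of Hadoop.
# Prints "word<TAB>count" for every whitespace-separated word in a file.
count_words() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print w "\t" count[w] }' "$1" | sort
}

# Example (not run here), against the local copy of the input:
# count_words input/file3
```

Comparing its output against `./bin/hadoop dfs -cat outputs/part-r-00000` is a quick way to confirm that the job counted what it should have.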

by mdesign21 | 2009-11-17 01:44 | IT