怎么样在Ubuntu上使用Hadoop?
怎么样在Ubuntu上使用Hadoop?
namenode管理站点
首先namenode有一个web站点,默认端口号是50070, 下面是我的截屏:
至少说明namenode服务启动正常了。
日志
网站上Utilities->Log里面可以看到namenode的日志信息。包括启动的时候Java的版本,参数等等。
也可以看到复制文件t.txt的操作:
2014-03-10 02:43:59,676 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /test/t.txt._COPYING_. BP-186292269-192.168.1.71-1394367313744 blk_1073741825_1001{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-4255c8d4-3ca1-4c5a-8184-83310513fb73:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-f988d67f-cdb2-4d57-bd72-0e652527f022:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-5f64aac5-bbff-480a-80b8-fee21719cd31:NORMAL|RBW]]} 2014-03-10 02:44:00,147 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.1.75:50010 is added to blk_1073741825_1001{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-4255c8d4-3ca1-4c5a-8184-83310513fb73:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-f988d67f-cdb2-4d57-bd72-0e652527f022:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-5f64aac5-bbff-480a-80b8-fee21719cd31:NORMAL|RBW]]} size 0 2014-03-10 02:44:00,150 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.1.73:50010 is added to blk_1073741825_1001{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-4255c8d4-3ca1-4c5a-8184-83310513fb73:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-f988d67f-cdb2-4d57-bd72-0e652527f022:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-5f64aac5-bbff-480a-80b8-fee21719cd31:NORMAL|RBW]]} size 0 2014-03-10 02:44:00,153 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 192.168.1.74:50010 is added to blk_1073741825_1001{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-4255c8d4-3ca1-4c5a-8184-83310513fb73:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-f988d67f-cdb2-4d57-bd72-0e652527f022:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-5f64aac5-bbff-480a-80b8-fee21719cd31:NORMAL|RBW]]} size 0
为什么只看到datanode1?
原因是datanode2和datanode3中的/etc/hosts中忘记添加配置,加完后再次启动服务,很快网站上看到3个datanode了。
File System Shell
直接在namenode server上运行下面的命令:
hduser@namenode:~$ hdfs dfs -ls hdfs://namenode:9000/ Found 1 items drwxr-xr-x - hduser supergroup 0 2014-03-10 02:30 hdfs://namenode:9000/test
可以看到,已经成功创建了test目录。
官方文档参考:http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
查找帮助的方式
hdfs --help 可以查看总的命令
查看dfs下面具体的命令用法,看下面的例子:
hduser@namenode:~$ hdfs dfs -help cp -cp [-f] [-p] <src> ... <dst>:Copy files that match the file pattern <src> to a destination. When copying multiple files, the destination must be a directory. Passing -p preserves access and modification times, ownership and the mode. Passing -f overwrites the destination if it already exists.复制本地文件到hdfs中,注意src要用file:///前缀
hdfs dfs -cp file:///home/hduser/t.txt hdfs://namenode:9000/test/这里t.txt是一个本地文件。
hduser@namenode:~$ hdfs dfs -ls hdfs://namenode:9000/test Found 1 items -rw-r--r-- 3 hduser supergroup 4 2014-03-10 02:44 hdfs://namenode:9000/test/t.txt