Subject: Re: problem about starting datanode
From: Harsh J
Date: Fri, 26 Jul 2013 11:18:24 +0530
To: user@hadoop.apache.org

You have one DataNode data volume (dfs.datanode.data.dir) configured, but you've set the tolerated failures (dfs.datanode.failed.volumes.tolerated) to 3. Tolerating 3 failed volumes when only 1 volume exists is invalid, hence the error. You cannot enable disk failure toleration on a DN with just one volume, so remove the toleration config and your problem will be resolved.

On Fri, Jul 26, 2013 at 11:01 AM, ch huang wrote:
> I configured NameNode HA, but when I start the DataNode I find the error
> below in its log. Here is my hdfs-site.xml file:
>
> <configuration>
>   <property>
>     <name>dfs.permissions.superusergroup</name>
>     <value>hadoop</value>
>   </property>
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>/data/hadoopdataspace</value>
>   </property>
>   <property>
>     <name>dfs.datanode.failed.volumes.tolerated</name>
>     <value>3</value>
>   </property>
>   <property>
>     <name>dfs.nameservices</name>
>     <value>mycluster</value>
>   </property>
>   <property>
>     <name>dfs.ha.namenodes.mycluster</name>
>     <value>nn1,nn2</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>     <value>node1:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>     <value>node2:8020</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>     <value>node1:50070</value>
>   </property>
>   <property>
>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>     <value>node2:50070</value>
>   </property>
>   <property>
>     <name>dfs.namenode.shared.edits.dir</name>
>     <value>qjournal://node1:8485;node2:8485;node3:8485/mycluster</value>
>   </property>
>   <property>
>     <name>dfs.journalnode.edits.dir</name>
>     <value>/data/1/dfs/jn</value>
>   </property>
>   <property>
>     <name>dfs.client.failover.proxy.provider.mycluster</name>
>     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>   </property>
>   <property>
>     <name>dfs.ha.fencing.methods</name>
>     <value>sshfence</value>
>   </property>
>   <property>
>     <name>dfs.ha.fencing.ssh.private-key-files</name>
>     <value>/home/nodefence/.ssh/id_rsa</value>
>   </property>
>   <property>
>     <name>dfs.ha.fencing.ssh.connect-timeout</name>
>     <value>30000</value>
>     <description>
>       SSH connection timeout, in milliseconds, to use with the builtin
>       sshfence fencer.
>     </description>
>   </property>
>   <property>
>     <name>dfs.webhdfs.enabled</name>
>     <value>true</value>
>   </property>
> </configuration>
>
> And the DataNode log:
>
> 2013-07-26 21:20:18,850 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Setting up storage: nsid=291409768;bpid=BP-771660648-192.168.142.129-1374837820241;lv=-40;nsInfo=lv=-40;cid=CID-28365f0e-e4f1-45b0-a86a-bb37794b6672;nsid=291409768;c=0;bpid=BP-771660648-192.168.142.129-1374837820241
> 2013-07-26 21:20:18,870 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-771660648-192.168.142.129-1374837820241 (storage id DS-713465905-192.168.142.131-50010-1374844418641) service to node1/192.168.142.129:8020 beginning handshake with NN
> 2013-07-26 21:20:18,873 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-771660648-192.168.142.129-1374837820241 (storage id DS-713465905-192.168.142.131-50010-1374844418641) service to node2/192.168.142.130:8020
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Invalid volume failure config value: 3
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:183)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:920)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:882)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:308)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
>         at java.lang.Thread.run(Thread.java:722)
> 2013-07-26 21:20:18,874 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-771660648-192.168.142.129-1374837820241 (storage id DS-713465905-192.168.142.131-50010-1374844418641) service to node2/192.168.142.130:8020
> 2013-07-26 21:20:18,886 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool Block pool BP-771660648-192.168.142.129-1374837820241 (storage id DS-713465905-192.168.142.131-50010-1374844418641) service to node1/192.168.142.129:8020 successfully registered with NN
> 2013-07-26 21:20:18,887 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: For namenode node1/192.168.142.129:8020 using DELETEREPORT_INTERVAL of 300000 msec BLOCKREPORT_INTERVAL of 21600000msec Initial delay: 0msec; heartBeatInterval=3000
> 2013-07-26 21:20:18,887 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-771660648-192.168.142.129-1374837820241 (storage id DS-713465905-192.168.142.131-50010-1374844418641) service to node1/192.168.142.129:8020
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:435)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:521)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:673)
>         at java.lang.Thread.run(Thread.java:722)

--
Harsh J
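Following the advice above, a corrected hdfs-site.xml fragment would either drop dfs.datanode.failed.volumes.tolerated entirely, or keep its value strictly below the number of configured data volumes. This is a sketch: the single-volume path comes from the original post, while the four-volume layout in the second option is a hypothetical example.

```xml
<!-- Option 1: single data volume - remove the toleration property entirely.
     The default of 0 means any volume failure shuts the DataNode down. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/hadoopdataspace</value>
</property>

<!-- Option 2 (hypothetical layout): with four data volumes, tolerating
     up to three failures is valid because 3 is less than 4. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>
</property>
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>3</value>
</property>
```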
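The FATAL entry's stack trace points at the FsDatasetImpl constructor, which rejects the configuration during startup. The rule it enforces can be paraphrased as follows; this is a sketch of the check, not the Hadoop source, and the function name is mine:

```python
def validate_volume_failure_config(num_volumes: int, tolerated: int) -> None:
    """Paraphrase of the DataNode startup check behind
    'Invalid volume failure config value': the tolerated count must be
    non-negative and strictly less than the number of configured volumes,
    otherwise block-pool initialization aborts."""
    if tolerated < 0 or tolerated >= num_volumes:
        raise ValueError("Invalid volume failure config value: %d" % tolerated)

# The poster's setup (1 volume, tolerated = 3) fails this check;
# a 4-volume DataNode with tolerated = 3 would pass.
```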