From: Jeff Bean <jwfbean@cloudera.com>
To: hdfs-user@hadoop.apache.org
Date: Wed, 11 May 2011 14:02:36 -0700
Subject: Re: Any fix for this?

If I understand correctly, the datanode reports its blocks based on the
contents of dfs.data.dir. When you cloned the datanode, you cloned all of
its blocks as well. When you add a "fresh" datanode to the cluster, you add
one that has an empty dfs.data.dir.

Try clearing out dfs.data.dir before adding the new node (rough steps are
sketched below the quoted message).

Jeff

On Wed, May 11, 2011 at 1:59 PM, Steve Cohen <mail4steve@gmail.com> wrote:

> Hello,
>
> We are running an HDFS cluster and we decided we wanted to add a new
> datanode. Since we are using virtual machines, we just cloned an existing
> datanode, added it to the slaves list, and started up the cluster. We then
> started getting log messages like this in the namenode log:
>
> 2011-05-11 15:59:44,148 ERROR hdfs.StateChange - BLOCK*
> NameSystem.getDatanode: Data node 10.104.211.58:50010 is attempting to
> report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node
> 10.104.211.57:50010 is expected to serve this storage.
> 2011-05-11 15:59:46,975 ERROR hdfs.StateChange - BLOCK*
> NameSystem.getDatanode: Data node 10.104.211.57:50010 is attempting to
> report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node
> 10.104.211.58:50010 is expected to serve this storage.
>
> I understand that this is because the two datanodes have exactly the same
> storage information, so the first datanode that connects takes precedence.
>
> Is it possible to just wipe one of the datanodes so it is blank, or do we
> have to format the entire HDFS filesystem from the namenode to add the new
> datanode?
>
> Thanks,
> Steve Cohen
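A rough sketch of the "clear out dfs.data.dir" step on the cloned node, assuming
dfs.data.dir in hdfs-site.xml points at /data/dfs/dn (that path and the
hadoop-daemon.sh invocations are placeholders for a stock 0.20-style layout;
substitute whatever your config and init scripts actually use):

    # on the cloned datanode only
    bin/hadoop-daemon.sh stop datanode    # make sure the datanode is down
    rm -rf /data/dfs/dn/*                 # removes the cloned blocks and current/VERSION,
                                          # which is where the duplicated storage ID lives
    bin/hadoop-daemon.sh start datanode   # comes back up with empty storage

With dfs.data.dir empty, the datanode initializes fresh storage on startup and
gets a new storage ID when it registers, so the "is expected to serve this
storage" errors should stop. There is no need to reformat HDFS from the namenode.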