Date: Wed, 25 Mar 2009 20:34:17 +0800
Subject: Re: HDFS unbalance issue. (HBase over HDFS)
From: schubert zhang
To: hbase-user@hadoop.apache.org
Cc: core-user@hadoop.apache.org
Reply-To: hbase-user@hadoop.apache.org
References: <78568af10903250214t1cd1cba2q67ee314a27101624@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=00163645730ad9f07f0465f0b706

--00163645730ad9f07f0465f0b706
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Then I stopped my application (the application writes to and reads from
HBase). An hour later, when I came back to check the status of HDFS, some
blocks had been deleted. The current status is as follows.

[schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
2956
[schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
2962

node1: 464518
node2: 42495
node3: 7505
node4: 7205
node5: 7636

On each node, the datanode process is busy (according to top).
I would like to know what causes this behavior.

Thanks.
Schubert

On Wed, Mar 25, 2009 at 6:37 PM, schubert zhang wrote:

> From another point of view, I think HBase cannot control which node
> blocks are deleted on; it just deletes files, and HDFS deletes the blocks
> wherever they are located.
>
> Schubert
>
> On Wed, Mar 25, 2009 at 6:28 PM, schubert zhang wrote:
>
>> Thanks Ryan. Balancer may take a long time.
>>
>> The block counts differ too much. But maybe that is caused by HBase not
>> deleting garbage blocks on regionserver1 and regionserver2, and maybe others.
>>
>> We grepped the hadoop logs and found no "deleting block" entries at all
>> for node1 and node2.
>>
>> Following are the results of grep (grep -c "ask 10.24.1.1?:50010 to delete")
>> on the hadoop logs:
>>
>> namenode:
>>
>> -----grep -c "ask 10.24.1.12:50010 to delete"-----node1
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 4754
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 1062
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.12:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 0
>>
>> -----grep -c "ask 10.24.1.14:50010 to delete"-----node2
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 1494
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 3305
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 3385
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.14:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 1494
>>
>> -----grep -c "ask 10.24.1.16:50010 to delete"-----node3
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 8022
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 8238
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.16:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 4302
>>
>> -----grep -c "ask 10.24.1.18:50010 to delete"-----node4
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 8591
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 9111
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.18:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 5038
>>
>> -----grep -c "ask 10.24.1.20:50010 to delete"-----node5
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-23
>> 3794
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log.2009-03-24
>> 3946
>> [schubert@nd0-rack0-cloud logs]$ grep -c "ask 10.24.1.20:50010 to delete" hadoop-schubert-namenode-nd0-rack0-cloud.log
>> 2989
>>
>> So, I think it may be caused by HBase.
>> I also grepped the datanode log on the node with zero "delete block"
>> requests, and found:
>> [schubert@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log.2009-03-24
>> 104739
>> [schubert@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log.2009-03-23
>> 465927
>> [schubert@nd1-rack0-cloud logs]$ grep -c "Deleting block" hadoop-schubert-datanode-nd1-rack0-cloud.log
>> 0
>>
>> On Wed, Mar 25, 2009 at 5:14 PM, Ryan Rawson wrote:
>>
>>> Try
>>> hadoop/bin/start-balancer.sh
>>>
>>> HDFS doesn't auto-balance. Balancing in HDFS requires moving data around,
>>> whereas balancing in HBase just means opening a file on a different
>>> machine.
>>>
>>> On Wed, Mar 25, 2009 at 2:12 AM, schubert zhang
>>> wrote:
>>>
>>> > Hi all,
>>> > I am using hbase-0.19.1 and hadoop-0.19.
>>> > My cluster has 5+1 nodes, and there are about 512 regions in HBase
>>> > (256MB per region).
>>> >
>>> > But I found that the blocks in HDFS are very unbalanced. Following is
>>> > the status from the HDFS web GUI.
>>> >
>>> > (Note: I don't know if this mailing list can display html!)
>>> >
>>> > HDFS blocks:
>>> > node1 509036 blocks
>>> > node2 157937 blocks
>>> > node3 15783 blocks
>>> > node4 15117 blocks
>>> > node5 20158 blocks
>>> >
>>> > But my HBase regions are very balanced.
>>> > node1 88 regions
>>> > node2 108 regions
>>> > node3 111 regions
>>> > node4 102 regions
>>> > node5 105 regions
>>> >
>>> > Node             Last     Admin       Configured     Used    Non DFS    Remaining  Used   Remaining  Blocks
>>> >                  Contact  State       Capacity (GB)  (GB)    Used (GB)  (GB)       (%)    (%)
>>> > nd1-rack0-cloud  0        In Service  822.8          578.67  43.28      200.86     70.33  24.41      509036
>>> > nd2-rack0-cloud  0        In Service  822.8          190.02  42.96      589.82     23.09  71.68      157937
>>> > nd3-rack0-cloud  0        In Service  822.8          51.95   42.61      728.24     6.31   88.51      15783
>>> > nd4-rack0-cloud  6        In Service  822.8          46.19   42.84      733.77     5.61   89.18      15117
>>> > nd5-rack0-cloud  1        In Service  1215.61        52.37   62.91      1100.32    4.31   90.52      20158
>>> >
>>> > nd1-rack0-cloud <http://nd1-rack0-cloud:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F>
>>> > nd2-rack0-cloud <http://nd2-rack0-cloud:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F>
>>> > nd3-rack0-cloud <http://nd3-rack0-cloud:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F>
>>> > nd4-rack0-cloud <http://nd4-rack0-cloud:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F>
>>> > nd5-rack0-cloud <http://nd5-rack0-cloud:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F>
>>> >
>>> > But my HBase regions are very balanced.
>>> >
>>> > Address                Start Code     Load
>>> > nd1-rack0-cloud:60020  1237967027050  requests=383, regions=88, usedHeap=978, maxHeap=1991
>>> > nd2-rack0-cloud:60020  1237788871362  requests=422, regions=108, usedHeap=1433, maxHeap=1991
>>> > nd3-rack0-cloud:60020  1237788881667  requests=962, regions=111, usedHeap=1534, maxHeap=1991
>>> > nd4-rack0-cloud:60020  1237788859541  requests=369, regions=102, usedHeap=1059, maxHeap=1991
>>> > nd5-rack0-cloud:60020  1237788899331  requests=384, regions=105, usedHeap=1535, maxHeap=1983
>>> > Total: servers: 5      requests=2520, regions=514
>>> >
>>> > nd1-rack0-cloud:60020 <http://nd1-rack0-cloud:60030/>
>>> >

--00163645730ad9f07f0465f0b706--
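
The per-node grep counts in this thread were collected one command at a time. As a rough sketch only, the same tally could be gathered in one pass with a small shell function. The function name count_delete_requests is made up for illustration; the search string "ask <address> to delete" is the one used in the thread, and grep -F is an added assumption so that the dots in the address are matched literally rather than as regex wildcards.

```shell
#!/bin/sh
# count_delete_requests LOGFILE ADDRESS...
# Hypothetical helper (not part of Hadoop): for each datanode address given,
# print "<address> <count>", where <count> is the number of lines in LOGFILE
# that contain the fixed string "ask <address> to delete".
count_delete_requests() {
    logfile=$1
    shift
    for node in "$@"; do
        # grep -c prints the number of matching lines; -F treats the pattern
        # as a fixed string. grep exits non-zero when the count is 0, so
        # "|| true" keeps the loop going under "set -e".
        count=$(grep -cF "ask $node to delete" "$logfile" || true)
        printf '%s %s\n' "$node" "$count"
    done
}
```

With the log name and addresses quoted in the thread, a call might look like this: count_delete_requests hadoop-schubert-namenode-nd0-rack0-cloud.log 10.24.1.12:50010 10.24.1.14:50010 10.24.1.16:50010 10.24.1.18:50010 10.24.1.20:50010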