Date: Mon, 14 Mar 2011 16:34:40 +0200
Subject: hadoop fs -du & hbase table size
From: Alex Baranau
To: user@hbase.apache.org, hdfs-user@hadoop.apache.org

Hello,

As far as I understand, since the "hadoop fs -du" command uses Linux's "du" internally, the number of replicas (at the moment the command is run) affects the result. Is that correct?

I have the following case. I have a small test HBase cluster (1 master + 5 slaves, each running a DN, TT & RS) with replication set to 2. The size of the tables' data is monitored with the "hadoop fs -du" command. There is a table which is constantly written to: data is only ever added to it. At some point I decided to reconfigure one of the slaves and shut it down. After reconfiguration (HBase had already marked it as dead) I brought it up again. Things went smoothly. However, on the table size graph (drawn from data fetched with the "hadoop fs -du" command) I noticed a small upward spike in data size, after which it went back down to the normal/expected values.

Could it be that at some point during the take out / reconfigure / add back procedure the blocks were over-replicated? I'd expect them to be under-replicated for some time (while the DN is down), so I'd expect to see the inverted spike: a small decrease in data amount and then a return to the "expected" rate (after all blocks have been replicated again).

Any ideas?

Thank you,
Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase