Reply-To: user@hbase.apache.org
MIME-Version: 1.0
Sender: saint.ack@gmail.com
In-Reply-To: <C8734F80.A5B5%vidhyash@yahoo-inc.com>
References: <C8734F80.A5B5%vidhyash@yahoo-inc.com>
Date: Mon, 26 Jul 2010 23:23:03 -0700
Message-ID: <AANLkTinsqJMSChZa7RR5oh8MWxWjGFXqW=kVCW5oohfR@mail.gmail.com>
Subject: Re: MR sharded Scans giving poor performance..
From: Stack <stack@duboce.net>
To: user@hbase.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Mon, Jul 26, 2010 at 2:43 PM, Vidhyashankar Venkataraman
<vidhyash@yahoo-inc.com> wrote:
> I am trying to assess the performance of Scans on a 100TB db on 180 nodes
> running Hbase 0.20.5..
>
> I run a sharded scan (each Map task runs a scan on a specific range:
> speculative execution is turned false so that there is no duplication in
> tasks) on a fully compacted table...
>

How big is the range, V?  How many rows do you scan in each map task?
They are contiguous, right?
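
To make the contiguity question concrete, here is a minimal sketch of
splitting a key space into contiguous, non-overlapping ranges, one per map
task. It assumes integer keys and a hypothetical helper name
(shard_ranges); real HBase row keys are byte arrays, so treat this as an
illustration of the idea only, not of the job described above.

```python
def shard_ranges(start, stop, num_shards):
    """Split the half-open range [start, stop) into num_shards contiguous,
    non-overlapping half-open ranges whose sizes differ by at most one."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    if stop < start:
        raise ValueError("stop must not be smaller than start")
    base, extra = divmod(stop - start, num_shards)
    ranges = []
    lo = start
    for i in range(num_shards):
        # The first `extra` shards take one additional key each.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


if __name__ == "__main__":
    for lo, hi in shard_ranges(0, 10, 3):
        print(lo, hi)
```

Each range ends exactly where the next one begins, so no row is scanned
twice and none is skipped.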


> 1 MB block size, Block cache enabled.. Max of 2 tasks per node..  Each row
> is 30 KB in size: 1 big column family with just one field..
> Region lease timeout is set to an hour.. And I don't get any socket timeout
> exceptions so I have not reassigned the write socket timeout...
>


Did you try with defaults first?


> I ran experiments on the following cases:
>
>  1.  The client level cache is set to 1 (default: got the number using
>      getCaching): The MR tasks take around 13 hours to finish on average..
>      Which gives around 13.17 MBps per node. The worst case is 34 hours (to
>      finish the entire job)...
>  2.  Client cache set to 20 rows: this is much worse than the previous
>      case: we get around a super low 1MBps per node...
>
>         Question: Should I set it to a value such that the block size is a
>         multiple of the above said cache size? Or the cache size to a much
>         lower value?
>
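
For reference, the sizes in that question work out as follows. This is a
rough sketch using only the figures quoted above (30 KB rows, 1 MB blocks,
caching counted in rows); it assumes binary units, and the function names
are made up for the illustration.

```python
ROW_BYTES = 30 * 1024        # 30 KB per row, as quoted above
BLOCK_BYTES = 1024 * 1024    # 1 MB HFile block size, as quoted above


def rows_per_block(block_bytes=BLOCK_BYTES, row_bytes=ROW_BYTES):
    """Number of whole rows that fit in one block."""
    return block_bytes // row_bytes


def batch_bytes(caching, row_bytes=ROW_BYTES):
    """Approximate payload of one scanner batch of `caching` rows."""
    return caching * row_bytes


if __name__ == "__main__":
    print("rows per block:", rows_per_block())
    for caching in (1, 20):
        print("caching", caching, "->", batch_bytes(caching) // 1024, "KB per batch")
```

So a batch of 20 rows is about 600 KB, a bit more than half a block, and
roughly 34 rows fit in one 1 MB block.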
> I find that these numbers are much less than the ones I get when it's
> running with just a few nodes..
>


What numbers do you see on a smaller cluster?

> Oh and forgot to add, 4 gig regions and 8 gig heap size..

So 4G to HBase and 8G on these machines in total?  You are running
TaskTrackers on the same machines?  2 Mappers, 1 DN, and 1 RS on all 180
machines?

You are using Hadoop streaming?  How's that work?  Streaming does text
only?  I didn't think you could write HBase out of Streaming.

> does the Hfile block size influence only the size of the index and the
> efficiency of random reads?

Generally, yes.  I'd think though that with a bigger block size, especially
if you are using scanner caching to cut down on the number of RPCs, you
should be approaching the scan speeds you'd see going against HDFS.
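
As a back-of-the-envelope sketch of the RPC point: if each fetch returns up
to `caching` rows, the number of fetches for a scan is the row count divided
by caching, rounded up. The row count below is a made-up figure for
illustration, and the model ignores scanner open/close calls and everything
else that affects throughput.

```python
def fetch_rpcs(total_rows, caching):
    """Approximate number of fetch round trips needed to scan total_rows
    when each round trip returns up to `caching` rows."""
    if caching <= 0:
        raise ValueError("caching must be positive")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    return -(-total_rows // caching)  # ceiling division


if __name__ == "__main__":
    rows = 1_000_000  # hypothetical rows per map task
    for caching in (1, 20, 100):
        print("caching", caching, "->", fetch_rpcs(rows, caching), "round trips")
```

The round-trip count drops in proportion to caching, which is why the much
lower throughput reported with caching at 20 is surprising and needs another
explanation.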

>  Just to make sure: the client uses zookeeper only for obtaining ROOT right
> whenever it performs scans, isnt it? So scans shouldn't face any master/zk
> bottlenecks when we scale up wrt number of nodes, am I right?

That's right.

St.Ack