hbase-user mailing list archives

From "Jonathan Gray" <jl...@streamy.com>
Subject RE: Question about recommended heap sizes
Date Wed, 24 Sep 2008 23:56:07 GMT
One thing to be aware of... 

Currently the HBase client serializes RPC calls for a process, so you are
not getting true insert parallelism if all inserts are coming from a single
java process despite the threading.
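As a rough illustration of why extra threads do not help in that situation, here is a toy Python model (not HBase client code). The `SerializedClient` class, its `insert` method, and the per-process lock are all hypothetical stand-ins for the serialization described above. The sketch only shows that with one in-flight call per process, 20 threads never overlap their calls.

```python
import threading
import time


class SerializedClient:
    """Toy stand-in for a client that allows one RPC at a time per process.

    Hypothetical: the lock models the serialization described in the
    message; it is not how the real HBase client is implemented.
    """

    def __init__(self):
        self._rpc_lock = threading.Lock()    # one in-flight call per process
        self._stats_lock = threading.Lock()  # protects the counters below
        self._in_flight = 0
        self.max_in_flight = 0
        self.calls = 0

    def insert(self, row):
        with self._rpc_lock:
            with self._stats_lock:
                self._in_flight += 1
                self.max_in_flight = max(self.max_in_flight, self._in_flight)
                self.calls += 1
            time.sleep(0.001)  # pretend network round trip
            with self._stats_lock:
                self._in_flight -= 1


def run_inserts(client, num_threads, rows_per_thread):
    """Start num_threads writers and return the peak number of overlapping calls."""

    def worker(thread_id):
        for i in range(rows_per_thread):
            client.insert((thread_id, i))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return client.max_in_flight


client = SerializedClient()
peak = run_inserts(client, num_threads=20, rows_per_thread=5)
print(peak, client.calls)  # prints "1 100"
```

Under this model, spreading the same writers across several client processes is the only way to get more than one call in flight at once.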

Since you are also experiencing this, there must be something going on here.
In 0.1.3 I had been importing far more and never had to increase the heap,
up to hundreds of regions per server.

We will be investigating this issue further... I have filed an issue here:

Stay tuned there for progress.

Let us know how your further testing goes.



-----Original Message-----
From: Daniel Ploeg [mailto:dploeg@gmail.com] 
Sent: Wednesday, September 24, 2008 4:35 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Question about recommended heap sizes


Thanks for your quick responses!

I'm using HBase 0.18.0.

I restarted the hbase cluster and it's telling me on the master's web page
that I have a total of 39 regions.

I was using 100 threads to push data into HBase, so I might try reducing
that to, say, 20 on the next run. I'll also try with the heap at 2GB. If
that fails again I'll reduce the batch size to 1K and try again.

I should note that I've tuned the configurations of hadoop with the
following based on the troubleshooting guide and the related jiras:


On Thu, Sep 25, 2008 at 9:19 AM, Jonathan Gray <jlist@streamy.com> wrote:

> Daniel,
>
> I have seen similar issues during large scale imports.  For now, we have
> gotten around the issue by increasing the regionserver heap size to 2GB.
> My slave machines also have 4GB of memory.
>
> How many total regions did you have when you received the OOME?
>
> Jonathan Gray
> -----Original Message-----
> From: Daniel Ploeg [mailto:dploeg@gmail.com]
> Sent: Wednesday, September 24, 2008 3:55 PM
> To: hbase-user@hadoop.apache.org
> Subject: Question about recommended heap sizes
> Hi all,
> I was running a test on our local hbase cluster (1 master node, 4 region
> servers) and I ran into some OutOfMemory exceptions. Basically, one of the
> region servers went down first, then the master node followed (ouch!) as I
> was inserting the data for the test.
> I was still using the default heap size and I would like to get some
> recommendations as to what I should raise it to. My regionservers each have
> 4GB and the master node has 8GB. It may be useful if I describe the tests
> that I was trying to do, so here goes:
> The tests were to ramp up the number of rows to determine the query
> performance of my particular usage pattern. Each level of testing has a
> different number of rows (1K, 10K and 100K). My exception occurred on the
> 10K row data population (about 3300 rows in).
> My data is a table with a single column family with 10K column instances
> per row. Each column contains approx 500-1000 bytes of data.
> I should note that the first level of testing with 1K rows was returning
> average query responses of approx 240ms.
> Could someone please advise on how large you think I should set my heap
> space (and if you think I should make any mods to hadoop heap as well).
> Thanks,
> Daniel
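For a sense of scale, the figures in the original question (10K columns per row, roughly 500-1000 bytes per column, failure about 3300 rows into the 10K-row load) can be turned into a back-of-envelope estimate. The sketch below counts raw value bytes only. It assumes decimal units (1 GB = 10^9 bytes) and ignores row keys, column names, HDFS replication, and any compression, so the real footprint will differ.

```python
# Figures taken from the original question in this thread.
COLUMNS_PER_ROW = 10_000
BYTES_PER_COLUMN_RANGE = (500, 1000)  # approximate low and high value sizes
ROWS_AT_FAILURE = 3_300
ROWS_TARGET = 10_000


def raw_value_bytes(rows, bytes_per_column):
    """Raw value bytes for `rows` rows; ignores keys and storage overhead."""
    return rows * COLUMNS_PER_ROW * bytes_per_column


def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (assumption: 1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


low, high = BYTES_PER_COLUMN_RANGE

row_mb = (raw_value_bytes(1, low) / 1e6, raw_value_bytes(1, high) / 1e6)
at_failure_gb = (to_gb(raw_value_bytes(ROWS_AT_FAILURE, low)),
                 to_gb(raw_value_bytes(ROWS_AT_FAILURE, high)))
target_gb = (to_gb(raw_value_bytes(ROWS_TARGET, low)),
             to_gb(raw_value_bytes(ROWS_TARGET, high)))

print(f"per row:       {row_mb[0]:.1f} to {row_mb[1]:.1f} MB")
print(f"at 3300 rows:  {at_failure_gb[0]:.1f} to {at_failure_gb[1]:.1f} GB")
print(f"at 10K rows:   {target_gb[0]:.1f} to {target_gb[1]:.1f} GB")
```

Each row works out to about 5 to 10 MB of values on its own, so the batch size matters at least as much as the thread count when estimating how much data a client or regionserver holds in memory at once.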
