hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: BucketCache Configuration
Date Sat, 19 Jul 2014 00:11:42 GMT
On Fri, Jul 18, 2014 at 4:46 PM, Jane Tao <jiao.tao@oracle.com> wrote:

> Hi there,
>
> Our goal is to fully utilize the free RAM on each node/region server for
> HBase. At the same time, we do not want to incur too much pressure from GC
> (garbage collection). Based on Ted's suggestion, we are trying to use the
> bucket cache.
>
> However, we are not sure:
>

Sorry.  Config is a little complicated at the moment.  It has had some
cleanup in trunk.  Meantime...



> - The relation between -XX:MaxDirectMemorySize and the Java heap size. Is
> MaxDirectMemorySize part of the Java heap size?
>


No.  It is separate from the Java heap; it is a cap on how much OFFHEAP
(direct) memory the JVM may allocate.  Here is a bit of a note I just added
to the refguide:


                 <para>The default maximum direct memory varies by JVM.  Traditionally it is
                     64M, some relation to the allocated heap size (-Xmx), or no limit at
                     all (JDK7, apparently).  HBase servers use direct memory; in particular,
                     with short-circuit reading enabled, the hosted DFSClient will allocate
                     direct memory buffers.  If you do offheap block caching, you'll also be
                     making use of direct memory.  When starting your JVM, make sure the
                     <varname>-XX:MaxDirectMemorySize</varname> setting in
                     <filename>conf/hbase-env.sh</filename> is set to some value that is
                     higher than what you have allocated to your offheap blockcache
                     (<varname>hbase.bucketcache.size</varname>).  It should be larger than
                     your offheap block cache, and then some more for DFSClient usage (how
                     much the DFSClient uses is not easy to quantify; it is the number of
                     open hfiles * <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname>,
                     where hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in
                     HBase -- see <filename>hbase-default.xml</filename> default
                     configurations).
                 </para>
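
To make that concrete, here is a minimal sketch of the hbase-env.sh side.
The 7g figure is purely illustrative (your 6144 MB bucket cache plus ~1 GB
of DFSClient headroom), not a recommendation:

  # conf/hbase-env.sh -- illustrative size, tune for your own cache:
  # MaxDirectMemorySize > hbase.bucketcache.size + DFSClient direct buffers
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=7g"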



> - The relation between -XX:MaxDirectMemorySize and hbase.bucketcache.size.
> Are they equal?
>

-XX:MaxDirectMemorySize should be larger than hbase.bucketcache.size; they
should not be equal.  See the note above for why.
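
To put a rough number on "and then some": say a region server has 2,000
hfiles open (a made-up count; yours will differ), then the DFSClient side
alone wants about

  2000 open hfiles * 128 KB per short-circuit buffer = 256,000 KB, i.e. ~250 MB

of direct memory on top of whatever you give the offheap block cache.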



> - How to adjust hbase.bucketcache.percentage.in.combinedcache?
>
>
Or just leave it as is.  To adjust it, set it to something other than the
default, which is 0.9 (i.e. 0.9 of hbase.bucketcache.size).  This
configuration has been removed from trunk because it is confusing.
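
For example, with the 6144 MB hbase.bucketcache.size from your mail and the
default 0.9, the split works out to roughly:

  6144 MB * 0.9 = ~5530 MB for the offheap bucket cache
  6144 MB * 0.1 = ~614 MB  left for the onheap LRU cache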



> Right now, we have the following configuration. Does it make sense?
>
> - Java heap size of each HBase region server set to 12 GB
> - -XX:MaxDirectMemorySize set to 6 GB
>

Why not set it to 48G since you have the RAM?



> - hbase-site.xml :
>   <property>
>     <name>hbase.offheapcache.percentage</name>
>     <value>0</value>
>   </property>
>

This setting is not needed.  0 is the default.


>   <property>
>     <name>hbase.bucketcache.ioengine</name>
>     <value>offheap</value>
>   </property>
>   <property>
>     <name>hbase.bucketcache.percentage.in.combinedcache</name>
>     <value>0.8</value>
>   </property>
>

Or you could just remove this setting and go with the default, which is 0.9.


>   <property>
>     <name>hbase.bucketcache.size</name>
>     <value>6144</value>
>   </property>
>
>
Adjust this to be 40000? (smile).
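
Pulling the above together, something like the following is what I would try
on your 64 GB nodes (the sizes are just my suggested starting points; tune
from there):

  # conf/hbase-env.sh
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=48g"

  # hbase-site.xml
  <property>
    <name>hbase.bucketcache.ioengine</name>
    <value>offheap</value>
  </property>
  <property>
    <name>hbase.bucketcache.size</name>
    <value>40000</value>
  </property>
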
Let us know how it goes.

What version of HBase are you running?  Thanks.

St.Ack



> Thanks,
> Jane
>
>
> On 7/17/2014 3:05 PM, Ted Yu wrote:
>
>> Have you considered using BucketCache?
>>
>> Please read 9.6.4.1 under
>> http://hbase.apache.org/book.html#regionserver.arch
>>
>> Note: remember to verify the config values against the HBase release
>> you're using.
>>
>> Cheers
>>
>>
>> On Thu, Jul 17, 2014 at 2:53 PM, Jane Tao <jiao.tao@oracle.com> wrote:
>>
>>> Hi Ted,
>>>
>>> In my case, there is a 6-node HBase cluster setup (running on Oracle BDA).
>>> Each node has plenty of RAM (64GB) and CPU cores. Several articles seem to
>>> suggest that it is not a good idea to allocate too much RAM to the region
>>> server's heap setting.
>>>
>>> If each region server has a 10GB heap and there is only one region server
>>> per node, then I have 10x6=60GB for the whole HBase cluster. This setting
>>> is good for ~100M rows but starts to incur lots of GC activity on region
>>> servers when loading billions of rows.
>>>
>>> Basically, I need a configuration that can fully utilize the free RAM on
>>> each node for HBase.
>>>
>>> Thanks,
>>> Jane
>>> On 7/16/2014 4:17 PM, Ted Yu wrote:
>>>
>>>  Jane:
>>>> Can you briefly describe the use case where multiple region servers are
>>>> needed on the same host ?
>>>>
>>>> Cheers
>>>>
>>>>
>>>>
>>>> On Wed, Jul 16, 2014 at 3:14 PM, Dhaval Shah <prince_mithibai@yahoo.co.in>
>>>> wrote:
>>>>
>>>>> It's certainly possible (at least from the command line) but probably
>>>>> very messy. You will need different ports, different log files,
>>>>> different pid files, possibly even different configs on the same
>>>>> machine.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Dhaval
>>>>>
>>>>>
>>>>> ________________________________
>>>>>    From: Jane Tao <jiao.tao@oracle.com>
>>>>> To: user@hbase.apache.org
>>>>> Sent: Wednesday, 16 July 2014 6:06 PM
>>>>> Subject: multiple region servers at one machine
>>>>>
>>>>>
>>>>> Hi there,
>>>>>
>>>>> Is it possible to run multiple region servers on one machine/node? If
>>>>> it is possible, how do I start multiple region servers from the command
>>>>> line or with Cloudera Manager?
>>>>>
>>>>> Thanks,
>>>>> Jane
