hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: BucketCache Configuration
Date Mon, 21 Jul 2014 21:05:38 GMT
No. You need 0.96.x HBase at least.
St.Ack


On Mon, Jul 21, 2014 at 9:42 AM, Jane Tao <jiao.tao@oracle.com> wrote:

> Hi Stack,
>
> Does what you suggested apply to HBase 0.94.6?
>
> Thanks,
> Jane
>
>
> On 7/18/2014 5:11 PM, Stack wrote:
>
>> On Fri, Jul 18, 2014 at 4:46 PM, Jane Tao <jiao.tao@oracle.com> wrote:
>>
>>> Hi there,
>>>
>>> Our goal is to fully utilize the free RAM on each node/region server for
>>> HBase. At the same time, we do not want to incur too much pressure from
>>> GC (garbage collection). Based on Ted's suggestion, we are trying to use
>>> bucket cache.
>>>
>>> However, we are not sure:
>>>
>> Sorry.  Config is a little complicated at the moment.  It has had some
>> cleanup in trunk.  Meantime...
>>
>>> - The relation between XX:MaxDirectMemorySize and java heap size. Is
>>> MaxDirectMemorySize part of java heap size?
>>>
>> No.  It is the maximum for how much the JVM should use OFFHEAP.  Here is a
>> bit of a note I just added to the refguide:
>>
>>
>>   <para>The default maximum direct memory varies by JVM. Traditionally it
>>   is 64M or some relation to allocated heap size (-Xmx) or no limit at all
>>   (JDK7 apparently). HBase servers use direct memory, in particular
>>   short-circuit reading, the hosted DFSClient will allocate direct memory
>>   buffers. If you do offheap block caching, you'll be making use of direct
>>   memory. Starting your JVM, make sure the
>>   <varname>-XX:MaxDirectMemorySize</varname> setting in
>>   <filename>conf/hbase-env.sh</filename> is set to some value that is
>>   higher than what you have allocated to your offheap blockcache
>>   (<varname>hbase.bucketcache.size</varname>). It should be larger than
>>   your offheap block cache and then some for DFSClient usage (How much the
>>   DFSClient uses is not easy to quantify; it is the number of open hfiles *
>>   <varname>hbase.dfs.client.read.shortcircuit.buffer.size</varname> where
>>   hbase.dfs.client.read.shortcircuit.buffer.size is set to 128k in HBase --
>>   see <filename>hbase-default.xml</filename> default configurations).
>>   </para>
>>
>>
>>
>>> - The relation between XX:MaxDirectMemorySize and hbase.bucketcache.size.
>>> Are they equal?
>>>
>> XX:MaxDirectMemorySize should be larger than hbase.bucketcache.size.  They
>> should not be equal.  See note above for why.
>>
>>
>>
>>> - How to adjust hbase.bucketcache.percentage.in.combinedcache?
>>>
>> Or just leave it as is.  To adjust, just set it to something other than the
>> default, which is 0.9 (0.9 of hbase.bucketcache.size).  This configuration
>> has been removed from trunk because it is confusing.
>>
>>
>>
>>> Right now, we have the following configuration. Does it make sense?
>>>
>>> - java heap size of each hbase region server to 12 GB
>>> - -XX:MaxDirectMemorySize to be 6GB
>>>
>> Why not set it to 48G since you have the RAM?
>>
>>
>>
>>> - hbase-site.xml :
>>>    <property>
>>>      <name>hbase.offheapcache.percentage</name>
>>>      <value>0</value>
>>>    </property>
>>>
>> This setting is not needed.  0 is the default.
>>
>>
>>>    <property>
>>>      <name>hbase.bucketcache.ioengine</name>
>>>      <value>offheap</value>
>>>    </property>
>>>    <property>
>>>      <name>hbase.bucketcache.percentage.in.combinedcache</name>
>>>      <value>0.8</value>
>>>    </property>
>>>
>> Or you could just undo this setting and go with the default which is 0.9.
>>
>>
>>>    <property>
>>>      <name>hbase.bucketcache.size</name>
>>>      <value>6144</value>
>>>    </property>
>>>
>> Adjust this to be 40000? (smile).
>> Let us know how it goes.
>>
>> What version of HBase are you running?  Thanks.
>>
>> St.Ack
>>
>>
>>
>>> Thanks,
>>> Jane
>>>
>>>
>>> On 7/17/2014 3:05 PM, Ted Yu wrote:
>>>
>>>> Have you considered using BucketCache?
>>>>
>>>> Please read 9.6.4.1 under
>>>> http://hbase.apache.org/book.html#regionserver.arch
>>>>
>>>> Note: remember to verify the config values against the hbase release
>>>> you're using.
>>>>
>>>> Cheers
>>>>
>>>>
>>>> On Thu, Jul 17, 2014 at 2:53 PM, Jane Tao <jiao.tao@oracle.com> wrote:
>>>>
>>>>> Hi Ted,
>>>>>
>>>>> In my case, there is a 6 Node HBase cluster setup (running on Oracle BDA).
>>>>> Each node has plenty of RAM (64GB) and CPU cores. Several articles seem to
>>>>> suggest that it is not a good idea to allocate too much RAM to region
>>>>> server's heap setting.
>>>>>
>>>>> If each region server has 10GB heap and there is only one region server
>>>>> per node, then I have 10x6=60GB for the whole HBase. This setting is good
>>>>> for ~100M rows but starts to incur lots of GC activities on region servers
>>>>> when loading billions of rows.
>>>>>
>>>>> Basically, I need a configuration that can fully utilize the free RAM on
>>>>> each node for HBase.
>>>>>
>>>>> Thanks,
>>>>> Jane
>>>>> On 7/16/2014 4:17 PM, Ted Yu wrote:
>>>>>
>>>>>> Jane:
>>>>>>
>>>>>> Can you briefly describe the use case where multiple region servers are
>>>>>> needed on the same host?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 16, 2014 at 3:14 PM, Dhaval Shah
>>>>>> <prince_mithibai@yahoo.co.in> wrote:
>>>>>>
>>>>>>> It's certainly possible (at least with command line) but probably very
>>>>>>> messy. You will need to have different ports, different log files,
>>>>>>> different pid files, possibly even different configs on the same
>>>>>>> machine.
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dhaval
>>>>>>>
>>>>>>>
>>>>>>> ________________________________
>>>>>>>     From: Jane Tao <jiao.tao@oracle.com>
>>>>>>> To: user@hbase.apache.org
>>>>>>> Sent: Wednesday, 16 July 2014 6:06 PM
>>>>>>> Subject: multiple region servers at one machine
>>>>>>>
>>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> Is it possible to run multiple region servers at one machine/node? If
>>>>>>> this is possible, how to start multiple region servers with command
>>>>>>> lines or cloudera manager?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jane
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>>   --
>>>>>>>
>>>>>>
>>>>>
>>>>>  --
>>>
>>>
>>>
> --
>
>
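
The sizing rule from the refguide note quoted in this thread (direct memory must cover hbase.bucketcache.size plus roughly "open hfiles * 128k" for the DFSClient) can be worked through as arithmetic. What follows is only a rough sketch, not anything from HBase itself: the function names are made up for illustration, the count of 2000 open hfiles is a hypothetical figure, and hbase.bucketcache.size is assumed to be in megabytes, as Jane's 6144 next to a 6GB limit suggests.

```python
import math

KIB = 1024
MIB = 1024 * KIB

# 128k is the short-circuit read buffer size quoted in the refguide note.
DEFAULT_SC_BUFFER_BYTES = 128 * KIB


def dfs_client_direct_mb(open_hfiles, buffer_bytes=DEFAULT_SC_BUFFER_BYTES):
    """Rough DFSClient direct-memory estimate in MB, rounded up."""
    return math.ceil(open_hfiles * buffer_bytes / MIB)


def min_direct_memory_mb(bucketcache_mb, open_hfiles,
                         buffer_bytes=DEFAULT_SC_BUFFER_BYTES):
    """Bucket cache size plus the DFSClient estimate, in MB."""
    return bucketcache_mb + dfs_client_direct_mb(open_hfiles, buffer_bytes)


def is_large_enough(max_direct_mb, bucketcache_mb, open_hfiles):
    """True if the direct-memory limit exceeds the cache and covers the estimate."""
    needed = min_direct_memory_mb(bucketcache_mb, open_hfiles)
    return max_direct_mb > bucketcache_mb and max_direct_mb >= needed


if __name__ == "__main__":
    open_hfiles = 2000  # hypothetical; measure this on a real region server
    # Original configuration: 6GB direct memory, 6144 MB bucket cache.
    print(min_direct_memory_mb(6144, open_hfiles))
    print(is_large_enough(6 * 1024, 6144, open_hfiles))
    # Suggested configuration: 48G direct memory, 40000 MB bucket cache.
    print(is_large_enough(48 * 1024, 40000, open_hfiles))
```

Under these assumptions the original 6GB / 6144 pairing leaves no headroom for the DFSClient, while the 48G / 40000 pairing does. The result would typically be applied through the JVM options in conf/hbase-env.sh; the exact variable to use depends on the HBase release.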
