hbase-user mailing list archives

From "donhoff_h" <165612...@qq.com>
Subject Re: BucketCache Configuration Problem
Date Tue, 03 Mar 2015 09:02:46 GMT
Hi, Stack

Thanks again for your quick reply.

The reason we don't shrink the heap and allocate the savings to the offheap is that we
want to cache as many datablocks as possible. Memory is limited; no matter how much we
shrink the heap, it cannot hold that many datablocks. So we want to try "FILE" instead of "offheap".
And yes, in this situation we are considering using SSD.

As to my configuration, I have attached it to this mail. Thanks very much for trying my config.

I have not yet read the post you recommended. I will read it carefully; perhaps I can make
my decision based on it.

By the way, I also asked my colleagues to try HBase 1.0, but we found that we could not
start the master and regionserver on the same node. (Because we are testing, we deploy
HBase on a very small cluster, and we hope the nodes that run the master and backup-master
can also run a regionserver.) I read the ref-guide again, and it seems that the master and
regionserver use the same port, 16020. Does that mean there is no way to deploy the master
and a regionserver on the same node and start them with a simple "start-hbase.sh" command?
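In case it helps you spot what we did wrong: if the clash really is on the regionserver
port, I guess an override like the following in hbase-site.xml might separate them. The
property name is from the ref-guide; the value is only a guess and I have not verified
this on 1.0.

```xml
<!-- Hypothetical override; the port value is only an example and
     has not been verified against HBase 1.0 defaults. -->
<property>
  <name>hbase.regionserver.port</name>
  <value>16021</value>
</property>
```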

Thanks!



------------------ Original Message ------------------
From: "Stack" <stack@duboce.net>
Sent: Tuesday, March 3, 2015, 1:30 PM
To: "Hbase-User" <user@hbase.apache.org>

Subject: Re: BucketCache Configuration Problem



On Mon, Mar 2, 2015 at 6:54 PM, donhoff_h <165612158@qq.com> wrote:

> Hi, Stack
>
> Thanks very much for your quick reply.
>
> It's OK to tell you my app scenario. I am not sure if it is very extreme. :)
>
> Recently we have been considering using HBase to store all of our bank's images:
> for example, images from loan applications and contracts, credit card applications,
> drafts, etc. The number of images that business users access daily is not very high,
> but since each image takes a lot of space, the total volume of images accessed daily
> is very large. So we are considering using a cache to improve performance.
>
>
Pictures are compressed, right?



> Since the total space accessed daily is very large and memory may not hold that
> much, we are considering using "FILE" instead of "offheap" for the BucketCache.


Ok. Is your FILE located on an SSD? If not, FILE is probably a suboptimal
option. The only reason I could see for favoring FILE is if you want your
cache to be 'warm' on startup (the FILE can be persisted and reopened
across restarts).
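If you do go with FILE on an SSD, the config is of this shape (the path and the size
are examples only; point the ioengine at a file on the SSD mount):

```xml
<!-- Sketch only; the path is an example SSD mount and the size an
     example value in MB. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>file:/mnt/ssd/hbase-bucketcache.data</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>20480</value>
</property>
```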



> In this situation, if we use the CBC policy, the memory will only cache the
> meta blocks and leave the datablocks stored in "FILE". And it seems the
> memory would not be fully used, because the total count of images accessed
> daily is not very high and we may not need to cache many meta blocks.


Right. Onheap, we'll only have the indices and blooms. Offheap or in FILE,
we'll have the data blocks.

The issue is that you believe your java heap is going to waste? If so,
shrink the JVM heap -- less memory to GC -- and allocate the savings to the
offheap (or to the os and let it worry about caching).



> So we want to know if the RAW L1+L2 is better. Maybe it can take full use
> of memory and meanwhile cache a lot of datablocks. That's the reason why I
> tried to setup a RAW L1+L2 configuration.
>
>
You've seen this post I presume:
https://blogs.apache.org/hbase/entry/comparing_blockcache_deploys

RAW L1+L2 is not tested. Blocks are first cached in L1 and if evicted, they
go to L2. Going first into the java heap (L1) and then out to L2 could make
for some ugly churn if blocks are being flipped frequently between the two
tiers. We used to have a "SlabCache" option and it had a similar policy;
all it seemed to do in testing was run slow and generate GC garbage so we
removed it (and passed on testing L1+L2 RAW).

High-level, it sounds like you cannot cache the dataset in memory and that
you will have some cache churn over the day; in this case, CBC and a
shrunken java heap with the savings given over to offheap or the os cache
would seem to be the way to go?  Your fetches will be slower than if you
could cache it all onheap but you should have a nice GC profile and fairly
predictable latency.
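For concreteness, the CBC-with-offheap deploy in hbase-site.xml would be roughly the
below. The size is only an example value; hbase.bucketcache.combinedcache.enabled
defaults to true so it is not set:

```xml
<!-- Sketch only; the cache size is an example value in MB. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>16384</value>
</property>
```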



> By the way, I tried your advice and set
> hbase.bucketcache.combinedcache.enabled=false. But the WebUI of the
> regionserver said that I did not deploy an L2 cache; you can see it in the
> attachment. Is that still caused by my HBase version? Is RAW L1+L2 only
> applicable in HBase 1.0?
>
Send me your config and I'll try it here.



> At last, you said that the refguide still needs to be maintained. I
> totally understand.  :) It is the same in my bank. We also have a lot of
> docs that need to be updated. And I shall be very glad if my questions can
> help you to find those locations and can help others.
>
>
Smile. Thanks for being understanding.
St.Ack



> Thanks again!
>
>
> ------------------ Original Message ------------------
> *From:* "Stack" <stack@duboce.net>;
> *Sent:* Tuesday, March 3, 2015, 12:13 AM
> *To:* "Hbase-User" <user@hbase.apache.org>;
> *Subject:* Re: BucketCache Configuration Problem
>
> On Mon, Mar 2, 2015 at 6:19 AM, donhoff_h <165612158@qq.com> wrote:
>
> > Hi, Stack
> >
> > Thanks for your reply and also thanks very much for your reply to my
> > previous mail "Questions about BucketCache".
> >
> > The hbase.bucketcache.bucket.sizes property takes effect. I did not
> > notice that in your first mail you had told me the name should be
> > hbase.bucketcache.bucket.sizes instead of hbase.bucketcache.sizes; I did
> > not notice this point until your last mail. It's my fault. Thanks for
> > your patience.
> >
> >
> No harm. Thanks for your patience and for writing to the list.
>
>
>
> > As to the relationship between HBASE_OFFHEAPSIZE and
> > -XX:MaxDirectMemorySize, I followed your advice to look in bin/hbase for
> > the statements that contain HBASE_OFFHEAPSIZE, but I found that there
> > isn't any statement which contains HBASE_OFFHEAPSIZE. I also tried
> > "bin/hbase-daemon.sh" and "bin/hbase-config.sh"; they don't contain
> > HBASE_OFFHEAPSIZE either. So I still don't know their relationship. My
> > HBase version is 0.98.10. Is HBASE_OFFHEAPSIZE not used in this version?
> >
> >
> This is my fault. The above applies to versions beyond 0.98, not your
> version. Please pass MaxDirectMemorySize inside HBASE_OPTS.
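In hbase-env.sh that would be something like the below; 5g is only an example size:

```shell
# hbase-env.sh, 0.98.x: HBASE_OFFHEAPSIZE is not wired up in this version,
# so pass the direct-memory limit to the JVM explicitly. 5g is an example.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=5g"
```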
>
>
>
>
> > As to "the pure secondary cache" or "bypass CBC", I mean use BucketCache
> > as a strict L2 cache to the L1 LruBlockCache, i.e. the Raw L1+L2. The point
> > comes from the reference guide of Apache HBase, which says: "It is
> > possible to deploy an L1+L2 setup where we bypass the CombinedBlockCache
> > policy and have BucketCache working as a strict L2 cache to the L1
> > LruBlockCache. For such a setup, set CacheConfig.BUCKET_CACHE_COMBINED_KEY
> > to false. In this mode, on eviction from L1, blocks go to L2. When a block
> > is cached, it is cached first in L1. When we go to look for a cached block,
> > we look first in L1 and if none found, then search L2. Let us call this
> > deploy format, Raw L1+L2."
> > I want to configure this kind of cache not because the CBC policy is not
> > good, but because I am a tech leader in a bank. I need to compare these two
> > kinds of cache to make a decision for our different kinds of apps. The
> > reference guide says I can configure it via
> > "CacheConfig.BUCKET_CACHE_COMBINED_KEY", but is there any way I can
> > configure it in hbase-site.xml?
> >
> >
> I see.
>
> BUCKET_CACHE_COMBINED_KEY == hbase.bucketcache.combinedcache.enabled.  Set
> it to false in your hbase-site.xml.
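In hbase-site.xml form, that is:

```xml
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>false</value>
</property>
```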
>
> But let's back up and let me save you some work. Such an 'option' should be
> struck from the refguide as an exotic permutation that would perform better
> than CBC only in the most extreme of cases. Do you have a loading where you
> think this combination would be better suited? If so, would you mind telling
> us about it?
>
> Meantime, we need to work on posting different versions of our doc so others
> don't have the difficult time you have had above. We've been lazy about this
> up to now because the doc was generally applicable, but bucketcache is one
> of the places where version matters.
>
> Yours,
> St.Ack
>
>
>
>
>
>
> > Many Thanks!
> >
> >
> >
> >
> > ------------------ Original Message ------------------
> > From: "Stack" <stack@duboce.net>
> > Sent: Monday, March 2, 2015, 2:30 PM
> > To: "Hbase-User" <user@hbase.apache.org>
> >
> > Subject: Re: BucketCache Configuration Problem
> >
> >
> >
> > On Sun, Mar 1, 2015 at 12:57 AM, donhoff_h <165612158@qq.com> wrote:
> >
> > > Hi, experts
> > >
> > > I am using HBase 0.98.10 and have three questions about BucketCache
> > > configuration.
> > >
> > > First:
> > > I read the reference guide of Apache HBase to learn how to configure
> > > BucketCache. I find that when using the offheap BucketCache, the reference
> > > guide says that I should configure HBASE_OFFHEAPSIZE, and it also says that
> > > I should configure -XX:MaxDirectMemorySize. Since these two parameters are
> > > both related to DirectMemory, I am confused about which one I should
> > > configure.
> > >
> > >
> > See bin/hbase for how HBASE_OFFHEAPSIZE gets interpolated as the value of
> > the -XX:MaxDirectMemorySize passed to java (so set HBASE_OFFHEAPSIZE).
> > (Will fix the doc so this is more clear.)
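The interpolation is roughly of this shape; this is a paraphrase, not the literal
script text, and the 8g default here is only an example:

```shell
# Paraphrase of how bin/hbase (in versions after 0.98) wires HBASE_OFFHEAPSIZE
# into a JVM flag; not the literal script text. 8g is an example value.
HBASE_OFFHEAPSIZE="${HBASE_OFFHEAPSIZE:-8g}"
if [ -n "$HBASE_OFFHEAPSIZE" ]; then
  HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=$HBASE_OFFHEAPSIZE"
fi
```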
> >
> >
> >
> > > Second:
> > > I want to know how to configure the BucketCache as a pure secondary
> > > cache, by which I mean bypassing the CombinedBlockCache policy. I tried
> > > the following configuration, but when I go to the regionserver's webUI,
> > > I found it says "No L2 deployed":
> > >
> > > hbase.bucketcache.ioengine=offheap
> > > hbase.bucketcache.size=200
> > > hbase.bucketcache.combinedcache.enabled=false
> > >
> > >
> > What do you mean by pure secondary cache? Which block types do you want
> > in the bucketcache?
> >
> > Why bypass CBC? We've been trying to simplify bucketcache deploy. Part of
> > this streamlining has been removing the myriad options, because they tend
> > to confuse, and giving users a few simple choices. Do the options not work
> > for you?
> >
> >
> >
> > > Third:
> > > I made the following configuration to set the bucket sizes. But from the
> > > regionserver's WebUI, I found that the (4+1)K and (8+1)K sizes are used
> > > while the (64+1)K size is not. What's wrong with my configuration?
> > > hbase.bucketcache.ioengine=offheap
> > > hbase.bucketcache.size=200
> > > hbase.bucketcache.combinedcache.enabled=true
> > > hbase.bucketcache.sizes=65536
> > > hfile.block.cache.sizes=65536
> > > I configured both of these because I don't know which one is in use now.
> > >
> > As per previous mail (and HBASE-13125), hfile.block.cache.sizes has no
> > effect in 0.98.x.  Also per our previous mail "Questions about
> > BucketCache", isn't it hbase.bucketcache.bucket.sizes that you want?
> >
> > You have tried the defaults and they do not fit your access pattern?
> >
> > Yours,
> > St.Ack
> >
> >
> > > Many Thanks!
> >
>
>