hbase-user mailing list archives

From Tianying Chang <tych...@gmail.com>
Subject Re: Will BloomFilter still be cached if setCacheBlocks(false) per Get()?
Date Fri, 18 Apr 2014 06:55:27 GMT
Ted, thanks. I am convinced that Bloom filter blocks are cached even when the
block cache is turned off per-family or per-query, because of the code in
CompoundBloomFilter.java below. The first hard-coded "true" (the cacheBlock
argument) ensures that caching is always on for Bloom chunks:

reader.readBlock(index.getRootBlockOffset(block),
            index.getRootBlockDataSize(block), true, true, false,
            BlockType.BLOOM_CHUNK);
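
To make the two code paths discussed in this thread concrete, here is a minimal, self-contained sketch that models the caching decisions without depending on HBase. It is an illustration, not the HBase implementation: the class name BloomCacheSketch, the helper method names, the trimmed-down BlockType enum, and the choice of searchTreeLevel = 3 are all hypothetical. It also assumes that the three boolean arguments to readBlock() are, in order, cacheBlock, pread and isCompaction.

```java
public class BloomCacheSketch {

    // Hypothetical stand-in for HBase's BlockType; only the values named in this thread.
    enum BlockType { INTERMEDIATE_INDEX, LEAF_INDEX, DATA, BLOOM_CHUNK }

    // Mirrors the quoted index-walk expression: index levels are cached
    // regardless of the per-query cacheBlocks flag.
    static boolean shouldCacheDuringIndexWalk(boolean cacheBlocks,
                                              int lookupLevel,
                                              int searchTreeLevel) {
        return cacheBlocks || (lookupLevel < searchTreeLevel);
    }

    // Mirrors the quoted if/else chain that picks the expected block type.
    static BlockType expectedBlockType(int lookupLevel, int searchTreeLevel) {
        if (lookupLevel < searchTreeLevel - 1) {
            return BlockType.INTERMEDIATE_INDEX;
        } else if (lookupLevel == searchTreeLevel - 1) {
            return BlockType.LEAF_INDEX;
        } else {
            return BlockType.DATA; // in HBase this also covers ENCODED_DATA
        }
    }

    // Models the Bloom chunk read: the caller passes a literal true for
    // cacheBlock (assumed argument order), so the per-query flag is ignored.
    static boolean cacheFlagForBloomChunk(boolean perQueryCacheBlocks) {
        return true;
    }

    public static void main(String[] args) {
        boolean cacheBlocks = false; // as if the client called setCacheBlocks(false)
        int searchTreeLevel = 3;     // hypothetical index depth, for illustration only

        // Assumption: lookup levels are counted from 1 down to the data block.
        for (int level = 1; level <= searchTreeLevel; level++) {
            System.out.println("level " + level
                + " type=" + expectedBlockType(level, searchTreeLevel)
                + " shouldCache=" + shouldCacheDuringIndexWalk(cacheBlocks, level, searchTreeLevel));
        }
        System.out.println("type=" + BlockType.BLOOM_CHUNK
            + " cacheBlock=" + cacheFlagForBloomChunk(cacheBlocks));
    }
}
```

In this model, with cacheBlocks set to false, only the final data-block read asks not to be cached; the index levels and the Bloom chunk still request caching. Whether a block actually lands in the cache also depends on the server-side check Ted points to below (CacheConfig#shouldCacheBlockOnRead()), which this sketch does not model.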


On Thu, Apr 17, 2014 at 3:39 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Tianying:
> Please take a look at CacheConfig#shouldCacheBlockOnRead() which is called
> by HFileReaderV2#readBlock()
>
> Cheers
>
>
> On Wed, Apr 16, 2014 at 5:39 PM, Tianying Chang <tychang@gmail.com> wrote:
>
> > Cool. Thanks!
> >
> > Just to dig deeper, is this because the BloomFilter is part of Meta, and
> > Meta blocks are always cached no matter what?
> >
> > Or is it because the BloomFilter is in the upper levels of the searchTree
> > in the code path I pasted? I guess that code path is actually for data
> > blocks, not meta blocks?
> >
> > // Call HFile's caching block reader API. We always cache index
> > // blocks, otherwise we might get terrible performance.
> > boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
> > BlockType expectedBlockType;
> > if (lookupLevel < searchTreeLevel - 1) {
> >   expectedBlockType = BlockType.INTERMEDIATE_INDEX;
> > } else if (lookupLevel == searchTreeLevel - 1) {
> >   expectedBlockType = BlockType.LEAF_INDEX;
> > } else {
> >   // this also accounts for ENCODED_DATA
> >   expectedBlockType = BlockType.DATA;
> > }
> >
> >
> > On Wed, Apr 16, 2014 at 4:59 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > bq. it is always cached on read even when per-family/per-query cacheBlocks
> > > is turned off.
> > >
> > > True.
> > >
> > >
> > > On Wed, Apr 16, 2014 at 4:41 PM, Tianying Chang <tychang@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We have a use case where some data is read mostly at random, so it
> > > > pollutes the cache and causes heavy GC. It is better to turn off the
> > > > block cache for that data, so we are going to call setCacheBlocks(false)
> > > > for those Get()s. We know that the index will still be cached, based on
> > > > the code path below, so we are safe there. But it is not clear whether
> > > > the BloomFilter belongs to a level < searchTreeLevel and therefore also
> > > > gets cached.
> > > >
> > > > // Call HFile's caching block reader API. We always cache index
> > > > // blocks, otherwise we might get terrible performance.
> > > > boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
> > > > BlockType expectedBlockType;
> > > > if (lookupLevel < searchTreeLevel - 1) {
> > > >   expectedBlockType = BlockType.INTERMEDIATE_INDEX;
> > > > } else if (lookupLevel == searchTreeLevel - 1) {
> > > >   expectedBlockType = BlockType.LEAF_INDEX;
> > > > } else {
> > > >   // this also accounts for ENCODED_DATA
> > > >   expectedBlockType = BlockType.DATA;
> > > > }
> > > >
> > > > Or I think that because the BloomFilter is part of the metadata, it is
> > > > always cached on read even when per-family/per-query cacheBlocks is
> > > > turned off. Am I right?
> > > >
> > > > Thanks
> > > > Tian-Ying
> > > >
> > >
> >
>
