incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Block-Cache and usage
Date Thu, 27 Mar 2014 14:17:21 GMT
On Thu, Mar 27, 2014 at 8:27 AM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> Aaron, I have another doubt regarding block-cache V1 & V2.
>
> V1 uses slabs/blocks approach, while V2 uses a file-extn->byte[] mapping.
>

It's not exactly a byte[].  The default configuration is an UnsafeCacheValue
wrapped by a DetachableCacheValue.  The UnsafeCacheValue allocates off-heap
memory.  The DetachableCacheValue handles the case where the CacheValue is
still in use by some reference but needs to be evicted from the block cache.
When this occurs, it copies the data from the UnsafeCacheValue (off heap) to
a ByteArrayCacheValue (on heap), and the JVM handles things from there.
However, this situation doesn't happen very often.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=blur-store/src/main/java/org/apache/blur/store/blockcache_v2/cachevalue/UnsafeCacheValue.java;h=6f230fd94f37de994841f0df96086906bf7ea1ae;hb=9de8a93aef77aa6bd8f91e8bfafc647a3a806ed0

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=blur-store/src/main/java/org/apache/blur/store/blockcache_v2/cachevalue/DetachableCacheValue.java;h=959e747b5d857daeee2b35fbb151e7a1581774e6;hb=9de8a93aef77aa6bd8f91e8bfafc647a3a806ed0
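
As a rough illustration of the detach-on-eviction idea described above, here
is a minimal Java sketch.  It is not Blur's code: the class name, the
reference counting, and the method names are all assumptions made for the
example, and a direct ByteBuffer stands in for the off-heap allocation.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch only; names and the ref-counting scheme are
// illustrative assumptions, not Blur's DetachableCacheValue.
final class DetachableValueSketch {
    private ByteBuffer offHeap; // stands in for the off-heap value
    private byte[] onHeap;      // filled only if evicted while still in use
    private int refCount;       // assumed way of tracking live readers

    DetachableValueSketch(byte[] data) {
        offHeap = ByteBuffer.allocateDirect(data.length);
        offHeap.put(data);
        offHeap.flip();
    }

    synchronized void acquire() { refCount++; }

    synchronized void release() { refCount--; }

    // Called by the cache when this value must leave the block cache.
    synchronized void evict() {
        if (offHeap == null) {
            return; // already evicted
        }
        if (refCount > 0) {
            // Still referenced: copy to the heap so readers keep working.
            onHeap = new byte[offHeap.remaining()];
            offHeap.duplicate().get(onHeap);
        }
        offHeap = null; // a real cache would free or pool this memory
    }

    synchronized byte read(int position) {
        if (offHeap != null) {
            return offHeap.get(position);
        }
        if (onHeap != null) {
            return onHeap[position];
        }
        throw new IllegalStateException("evicted with no live references");
    }

    synchronized boolean isDetached() { return onHeap != null; }

    public static void main(String[] args) {
        DetachableValueSketch v = new DetachableValueSketch(new byte[] {1, 2, 3});
        v.acquire();  // a reader still holds the value
        v.evict();    // the cache evicts it anyway
        System.out.println("detached=" + v.isDetached() + ", byte=" + v.read(1));
    }
}
```

In this sketch the on-heap copy is made only when a reader still holds the
value at eviction time, which mirrors the point that the copy is the
uncommon path.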


>
> I looked at CacheIndexInput, where if the byte[] is fully filled, we
> de-alloc it and alloc again. [releaseCache()/fillCache() methods]
>

Actually we don't.  If you look at this code:

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=blur-store/src/main/java/org/apache/blur/store/blockcache_v2/BaseCache.java;h=4805d40e74759c30ca9ae2016ecd3343eedc6b9e;hb=9de8a93aef77aa6bd8f91e8bfafc647a3a806ed0#l65

You will see that we return the allocated memory to a pool once it has been
evicted from the cache.  The references in the CacheIndexInput don't
actually manage the memory; they only hold references to the cache.
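
To make the "return to a pool on eviction" pattern concrete, here is a small
single-threaded Java sketch.  It is only an illustration: PooledBlockCache
and its methods are hypothetical names, and java.util.LinkedHashMap is used
as a stand-in LRU map, which is an assumption and not what BaseCache does
internally.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, single-threaded sketch; not Blur's BaseCache.
final class PooledBlockCache {
    private final int blockSize;
    private final int maxBlocks;
    private final ArrayDeque<ByteBuffer> pool = new ArrayDeque<>();
    private final LinkedHashMap<Long, ByteBuffer> cache;
    private int allocations;

    PooledBlockCache(int blockSize, int maxBlocks) {
        this.blockSize = blockSize;
        this.maxBlocks = maxBlocks;
        // Access-ordered map: the eldest entry is the least recently used.
        this.cache = new LinkedHashMap<Long, ByteBuffer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, ByteBuffer> eldest) {
                if (size() > PooledBlockCache.this.maxBlocks) {
                    pool.push(eldest.getValue()); // recycle instead of freeing
                    return true;
                }
                return false;
            }
        };
    }

    // Caches a block, reusing a pooled buffer when one is available.
    ByteBuffer put(long blockId) {
        ByteBuffer buf = pool.isEmpty() ? allocate() : pool.pop();
        buf.clear();
        ByteBuffer old = cache.put(blockId, buf);
        if (old != null) {
            pool.push(old); // replaced entry goes back to the pool too
        }
        return buf;
    }

    private ByteBuffer allocate() {
        allocations++;
        return ByteBuffer.allocateDirect(blockSize);
    }

    int allocations() { return allocations; }

    int pooled() { return pool.size(); }

    public static void main(String[] args) {
        PooledBlockCache c = new PooledBlockCache(8 * 1024, 2);
        for (long id = 0; id < 100; id++) {
            c.put(id);
        }
        System.out.println("allocations=" + c.allocations());
    }
}
```

With a capacity of two blocks, the loop in main keeps recycling the same few
buffers rather than allocating one per block, so the cost of churn stays
small however many blocks pass through the cache.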


>
> This de-alloc/alloc will be pretty heavy when the configured cache-size
> becomes large right? Ex: 300 MB of cache for FDT file getting filled, then
> de-alloc/alloc of 300 MB again...
>

Well, we never cache 300 MB at once.  Since Lucene is heavy on random
access, it typically accesses small portions of the files rapidly.  And the
cache system caches 8 KB (by default) of file data at a time.  So if a file
has 300 MB present in the cache, it is very likely not in contiguous
memory.
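
For a sense of scale, the block arithmetic can be sketched in a few lines of
Java.  The names here (BlockMath, blockId, blockOffset, blocksFor) are
hypothetical and only the 8 KB default comes from the discussion above.

```java
// Hypothetical helper; illustrates the arithmetic only, not Blur's API.
final class BlockMath {
    static final int BLOCK_SIZE = 8 * 1024; // 8 KB default block size

    // Which cache block a file position falls into.
    static long blockId(long filePosition) {
        return filePosition / BLOCK_SIZE;
    }

    // Offset of the position inside its block.
    static int blockOffset(long filePosition) {
        return (int) (filePosition % BLOCK_SIZE);
    }

    // Number of blocks needed to hold the given number of bytes.
    static long blocksFor(long bytes) {
        return (bytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        long pos = 1_000_000L;
        System.out.println("block " + blockId(pos) + ", offset " + blockOffset(pos));
        System.out.println("300 MB = " + blocksFor(300L * 1024 * 1024) + " blocks");
    }
}
```

Under these assumptions, 300 MB of cached data for one file corresponds to
38400 separate 8 KB entries, each of which is cached and evicted on its own.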

Hope this helps.

Aaron


>
> I am not sure I understand the cache-logic correctly, so need your help
> here...
>
> --
> Ravi
>
>
>
> On Sun, Mar 23, 2014 at 8:24 PM, Ravikumar Govindarajan <
> ravikumar.govindarajan@gmail.com> wrote:
>
> > No.  Typically the hit to miss ratio is very high, it's a metric that is
> >> recorded in Blur
> >
> >
> > This is such a handy feature. Thanks for providing such detailed metrics
> >
> > Just to add to the benefits of block-cache, I just found out that
> > readFully or sync(seek+read) in FSDataInputStream occurs entirely in a
> > synchronized method in hadoop that could limit throughput/QPS when
> multiple
> > IndexInputs are open for same lucene file.
> >
> > Block-cache should shine in such scenarios...
> >
> > Thanks a lot for your inputs.
> >
> > --
> > Ravi
> >
> >
> > On Thu, Mar 20, 2014 at 5:45 PM, Aaron McCurry <amccurry@gmail.com>
> wrote:
> >
> >> On Wed, Mar 19, 2014 at 1:57 PM, Ravikumar Govindarajan <
> >> ravikumar.govindarajan@gmail.com> wrote:
> >>
> >> > One obvious case is a cache-hit scenario, where instead of using the
> >> > block-cache, there is a fairly heavy round-trip to data-node. It is
> also
> >> > highly likely that the data-node might have evicted the hot-pages due
> to
> >> > other active reads.
> >>
> >>
> >> Or writes.  The normal behavior in the Linux filesystem cache is to
> cache
> >> newly written data and evict the oldest data from memory.  So during
> >> merges
> >> (or any other writes from other Hadoop processes) the Linux filesystem
> >> will
> >> unload pages that you might be using.
> >>
> >>
> >> >
> >>
> >>
> >> > How much of cache-hit happens in Blur? Will I be correct in saying
> that
> >> > repeated terms occurring in search only will benefit block-cache?
> >> >
> >>
> >> No.  Typically the hit to miss ratio is very high; it's a metric that is
> >> recorded in Blur (you can access it via the Blur shell by running the top
> >> command).  It's not unusual to see hits in the 5000-10000/s range with a
> >> block size of 64KB and misses at the same time in the 10-20/s range.
> >> This has a lot to do with how Lucene stores its indexes; they are highly
> >> compressed files (although not compressed with a generic compression
> >> scheme).
> >>
> >>
> >> Let me know if you have any other questions.
> >>
> >> Aaron
> >>
> >> >
> >> > --
> >> > Ravi
> >> >
> >> >
> >> > On Wed, Mar 19, 2014 at 11:06 PM, Ravikumar Govindarajan <
> >> > ravikumar.govindarajan@gmail.com> wrote:
> >> >
> >> > > I was looking at block-cache code and trying to understand why we
> need
> >> > it.
> >> > >
> >> > > We divide the file into blocks of 8KB and write to hadoop. While
> >> reading,
> >> > > we only read in batches of 8KB and store in block-cache
> >> > >
> >> > > This is a form of read-ahead caching on the
> >> client-side[shard-server]. Am
> >> > > I correct in understanding?
> >> > >
> >> > > Recent releases of hadoop have a notion of read-ahead caching in
> >> > data-node
> >> > > itself. The default value is 4MB but I believe it can also be
> >> configured
> >> > to
> >> > > whatever is needed.
> >> > >
> >> > > What are the advantages of a block-cache vis-a-vis data-node
> >> read-ahead
> >> > > cache?
> >> > >
> >> > > I also am not familiar with hadoop IO sub-system as to whether it's
> >> > > correct and performant to do read-aheads in data-nodes for a
> use-case
> >> > like
> >> > > lucene.
> >> > >
> >> > > Can someone help me?
> >> > >
> >> > > --
> >> > > Ravi
> >> > >
> >> > >
> >> > >
> >> >
> >>
> >
> >
>
