incubator-blur-commits mailing list archives

From Apache Wiki <>
Subject [Blur Wiki] Update of "BlockCacheConfiguration" by AaronMcCurry
Date Sun, 30 Sep 2012 23:25:29 GMT

The "BlockCacheConfiguration" page has been changed by AaronMcCurry:

New page:
== Why ==

HDFS is a great filesystem for streaming large amounts of data across large-scale clusters.
However, its random-access latency is roughly what you would see reading from a local drive
when the data you are trying to access is not in the operating system's file cache. In other
words, every access to HDFS behaves like a local read with a cache miss. There have been great
performance boosts in HDFS over the past few years, but it still can't perform at the level
a search engine needs.

Now you might be thinking that Lucene reads from the local hard drive and performs great,
so why wouldn't HDFS perform fairly well on its own?  The difference is that most of the time
the Lucene index files are cached by the operating system's filesystem cache.  So Blur has its
own filesystem cache that allows it to perform low-latency data look-ups against HDFS.

== How ==

On shard server start-up Blur creates one or more block cache slabs (`blur.shard.blockcache.slab.count`),
each 128 MB in size.  These slabs can be allocated on or off the heap ``.
Each slab is broken up into 16,384 blocks, with each block being 8 KB in size.  Then on the heap
there is a concurrent LRU cache that tracks which blocks of which files are in which slab(s)
at what offset.  So the more slabs of cache you create, the more entries there will be in the
LRU, and thus the more heap it consumes.
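The slab geometry above can be checked with a bit of shell arithmetic; the 128 MB slab size falls directly out of the block size and block count just given:

```shell
# Each slab holds 16,384 blocks of 8 KB each.
block_size_bytes=$((8 * 1024))
blocks_per_slab=16384
slab_bytes=$((block_size_bytes * blocks_per_slab))
echo "slab size: $((slab_bytes / 1024 / 1024)) MB"   # prints "slab size: 128 MB"
```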

== Configuration ==


Say the shard server(s) you are planning to run Blur on have 32 GB of RAM.  These machines
are probably also running HDFS data nodes with a very high xceiver count (`dfs.datanode.max.xcievers`
in `hdfs-site.xml`), say 8K.  If the data nodes are configured with 1 GB of heap, they may still
consume up to 4 GB of memory due to the high thread count the xceivers create.  Next, let's
say you configure Blur with 4 GB of heap as well, and you want to use 12 GB of off-heap cache.
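As a sanity check on that budget (the numbers below are just the worked example above, not recommendations for your cluster):

```shell
total_gb=32
datanode_gb=4        # ~1 GB data node heap plus thread overhead from the high xceiver count
blur_heap_gb=4       # Blur shard server heap
offheap_cache_gb=12  # Blur off-heap block cache
os_cache_gb=$((total_gb - datanode_gb - blur_heap_gb - offheap_cache_gb))
echo "left for the OS filesystem cache: ${os_cache_gb} GB"   # prints "left for the OS filesystem cache: 12 GB"
```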

In the `` file you will need to change `BLUR_SHARD_JVM_OPTIONS` to include `"-XX:MaxDirectMemorySize=13g"`
and possibly `"-XX:+UseLargePages"`, depending on your Linux setup.  I set MaxDirectMemorySize
to more than 12 GB to make sure we don't hit the limit and cause an OOM exception; this
does not reserve 13 GB, it is only a ceiling on how much direct memory may be allocated.  Below
is a working example that also contains GC logging and GC configuration:

     export BLUR_SHARD_JVM_OPTIONS="-XX:MaxDirectMemorySize=13g \
                                    -XX:+UseLargePages \
                                    -Xms4g \
                                    -Xmx4g \
                                    -Xmn512m \
                                    -XX:+UseCompressedOops \
                                    -XX:+UseConcMarkSweepGC \
                                    -XX:+CMSIncrementalMode \
                                    -XX:CMSIncrementalDutyCycleMin=10 \
                                    -XX:CMSIncrementalDutyCycle=50 \
                                    -XX:ParallelGCThreads=8 \
                                    -XX:+UseParNewGC \
                                    -XX:MaxGCPauseMillis=200 \
                                    -XX:GCTimeRatio=10 \
                                    -XX:+DisableExplicitGC \
                                    -verbose:gc \
                                    -XX:+PrintGCDetails \
                                    -XX:+PrintGCDateStamps \
                                    -Xloggc:$BLUR_HOME/logs/gc-blur-shard-server_`date +%Y%m%d_%H%M%S`.log"

Next you will need to edit `` and change `blur.shard.blockcache.slab.count`
to 96.  This tells Blur to allocate 96 slabs of 128 MB each at shard server start-up.
 Note that the first time you do this the shard servers may take a long time to allocate
the memory.  This is because the OS could be using most of that memory for its own filesystem
caching, and unloading it may cause some I/O as the cache syncs to disk.
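The slab count of 96 follows directly from the 12 GB cache target in the example above and the 128 MB slab size:

```shell
cache_mb=$((12 * 1024))   # 12 GB off-heap cache target, in MB
slab_mb=128               # fixed slab size
echo "blur.shard.blockcache.slab.count=$((cache_mb / slab_mb))"   # prints "blur.shard.blockcache.slab.count=96"
```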

Also, the `` property is set to true by default, which tells
the JVM to try to allocate the memory off heap.  If you want to run the slabs in
the heap (which is not recommended), set this value to false.

== What To Cache ==

The TableDescriptor in the Thrift API contains two properties, `blockCaching` (boolean) and
`blockCachingFileTypes` (Set<String>).  You may disable block caching for a given table
by setting `blockCaching` to false; by default it is set to true.  To control `blockCachingFileTypes`,
create a set containing the Lucene file type extensions you wish to cache.  If you leave
this null, the default is to cache ALL Lucene file types except the FDT and FDX file types,
which are used only for data retrieval and are not accessed during the search itself.  This
has proven to provide excellent search performance while balancing memory constraints.
