incubator-blur-commits mailing list archives

From amccu...@apache.org
Subject [2/2] git commit: Updating the documentation.
Date Wed, 06 Nov 2013 15:34:34 GMT
Updating the documentation.


Project: http://git-wip-us.apache.org/repos/asf/incubator-blur/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-blur/commit/c128f091
Tree: http://git-wip-us.apache.org/repos/asf/incubator-blur/tree/c128f091
Diff: http://git-wip-us.apache.org/repos/asf/incubator-blur/diff/c128f091

Branch: refs/heads/apache-blur-0.2
Commit: c128f0917693e30d7f68dc0523675121f35f4112
Parents: a6aede3
Author: Aaron McCurry <amccurry@gmail.com>
Authored: Wed Nov 6 10:33:33 2013 -0500
Committer: Aaron McCurry <amccurry@gmail.com>
Committed: Wed Nov 6 10:33:33 2013 -0500

----------------------------------------------------------------------
 docs/cluster-setup.html | 67 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 60 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-blur/blob/c128f091/docs/cluster-setup.html
----------------------------------------------------------------------
diff --git a/docs/cluster-setup.html b/docs/cluster-setup.html
index 4dfeedc..6912e73 100644
--- a/docs/cluster-setup.html
+++ b/docs/cluster-setup.html
@@ -72,7 +72,12 @@
                 <ul class="nav">
                   <li><a href="#shard-blur-site">blur-site.properties</a></li>
                   <li><a href="#shard-blur-env">blur-env.sh</a></li>
-                  <li><a href="#block-cache">Block Cache Configuration</a></li>
+				  <li><a href="#block-cache">Block Cache</a>
+                    <ul class="nav">
+                      <li><a href="#block-cache-v2">&nbsp;&nbsp;V2 Block Cache Configuration</a></li>
+                      <li><a href="#block-cache-v1">&nbsp;&nbsp;V1 Block Cache Configuration</a></li>
+                    </ul>
+                  </li>
                 </ul>
               </li>
               <li>
@@ -152,7 +157,7 @@ export HADOOP_HOME=&lt;path to your Hadoop install directory&gt;</code>
 <h4>Default Properties</h4>
 <table class="table-bordered table-striped table-condensed">
 <tr><td>Property</td><td>Description</td></tr>
-<tr><td>blur.controller.hostname</td><td>Sets the hostname for the
controller, if blank the hostname is automatically detected</td></tr><tr><td>blur.controller.bind.address
(0.0.0.0)</td><td>The binding address of the controller</td></tr><tr><td>blur.controller.bind.port
(40010)</td><td>The default binding port of the controller server</td></tr><tr><td>blur.controller.shard.connection.timeout
(60000)</td><td>The connection timeout, NOTE: this will be the maximum amount
of time you can wait for a query.</td></tr><tr><td>blur.controller.server.thrift.thread.count
(32)</td><td>The number of threads used for thrift requests</td></tr><tr><td>blur.controller.server.remote.thread.count
(64)</td><td>The number of threads used for remote thrift requests to the shards
server.  This should be a large number.</td></tr><tr><td>blur.controller.thrift.selector.threads
(2)</td><td>The number of threads used for selector processing inside the thrift
server.</td></tr><tr><td>blur.controller.thrift.max.read.buffer.bytes (9223372036854775807)</td><td>The maximum number of bytes
used for reading requests in the thrift server.</td></tr><tr><td>blur.controller.thrift.accept.queue.size.per.thread
(4)</td><td>The size of the blocking queue per selector thread for passing accepted
connections to the selector thread.</td></tr><tr><td>blur.controller.remote.fetch.count
(100)</td><td>The number of hits to fetch per request to the shard servers</td></tr><tr><td>blur.controller.retry.max.fetch.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during fetch</td></tr><tr><td>blur.controller.retry.max.mutate.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during mutate</td></tr><tr><td>blur.controller.retry.max.default.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during all other request</td></tr><tr><td>blur.controller.retry.fetch.delay
(500)</td><td>The starting backoff 
 delay for the first retry for a fetch errors</td></tr><tr><td>blur.controller.retry.mutate.delay
(500)</td><td>The starting backoff delay for the first retry for a mutate errors</td></tr><tr><td>blur.controller.retry.default.delay
(500)</td><td>The starting backoff delay for the first retry for a all other request
errors</td></tr><tr><td>blur.controller.retry.max.fetch.delay (2000)</td><td>The
ending backoff delay for the last retry for a fetch errors</td></tr><tr><td>blur.controller.retry.max.mutate.delay
(2000)</td><td>The ending backoff delay for the last retry for a mutate errors</td></tr><tr><td>blur.controller.retry.max.default.delay
(2000)</td><td>The ending backoff delay for the last retry for a all other request
errors</td></tr><tr><td>blur.gui.controller.port (40080)</td><td>The
http status page port for the controller server</td></tr>
+<tr><td>blur.controller.hostname</td><td>Sets the hostname for the
controller, if blank the hostname is automatically detected</td></tr><tr><td>blur.controller.bind.address
(0.0.0.0)</td><td>The binding address of the controller</td></tr><tr><td>blur.controller.bind.port
(40010)</td><td>The default binding port of the controller server</td></tr><tr><td>blur.controller.shard.connection.timeout
(60000)</td><td>The connection timeout, NOTE: this will be the maximum amount
of time you can wait for a query.</td></tr><tr><td>blur.controller.server.thrift.thread.count
(32)</td><td>The number of threads used for thrift requests</td></tr><tr><td>blur.controller.server.remote.thread.count
(64)</td><td>The number of threads used for remote thrift requests to the shards
server.  This should be a large number.</td></tr><tr><td>blur.controller.thrift.selector.threads
(2)</td><td>The number of threads used for selector processing inside the thrift
server.</td></tr><tr><td>blur.controller.thrift.max.read.buffer.bytes (9223372036854775807)</td><td>The maximum number of bytes
used for reading requests in the thrift server.</td></tr><tr><td>blur.controller.thrift.accept.queue.size.per.thread
(4)</td><td>The size of the blocking queue per selector thread for passing accepted
connections to the selector thread.</td></tr><tr><td>blur.controller.remote.fetch.count
(100)</td><td>The number of hits to fetch per request to the shard servers</td></tr><tr><td>blur.controller.retry.max.fetch.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during fetch</td></tr><tr><td>blur.controller.retry.max.mutate.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during mutate</td></tr><tr><td>blur.controller.retry.max.default.retries
(3)</td><td>The max number of retries to the shard server when there is an error
during all other request</td></tr><tr><td>blur.controller.retry.fetch.delay
(500)</td><td>The starting backoff 
 delay for the first retry for a fetch errors</td></tr><tr><td>blur.controller.retry.mutate.delay
(500)</td><td>The starting backoff delay for the first retry for a mutate errors</td></tr><tr><td>blur.controller.retry.default.delay
(500)</td><td>The starting backoff delay for the first retry for a all other request
errors</td></tr><tr><td>blur.controller.retry.max.fetch.delay (2000)</td><td>The
ending backoff delay for the last retry for a fetch errors</td></tr><tr><td>blur.controller.retry.max.mutate.delay
(2000)</td><td>The ending backoff delay for the last retry for a mutate errors</td></tr><tr><td>blur.controller.retry.max.default.delay
(2000)</td><td>The ending backoff delay for the last retry for a all other request
errors</td></tr><tr><td>blur.gui.controller.port (40080)</td><td>The
http status page port for the controller server</td></tr><tr><td>blur.controller.filtered.server.class</td><td>To
intercept the calls made to the controller server and perform server side changes to the calls
extend org.apache.blur.server.FilteredBlurServer.</td></tr>
 </table>
             <h3 id="controller-blur-env">blur-env.sh</h3>
             <pre><code class="bash"># JAVA JVM OPTIONS for the controller servers,
jvm tuning parameters are placed here.
@@ -199,7 +204,7 @@ Swap can kill java perform, you may want to consider disabling swap.</div>
 			<h4>Default Properties</h4>
 			<table class="table-bordered table-striped table-condensed">
 			<tr><td>Property</td><td>Description</td></tr>
-<tr><td>blur.shard.hostname</td><td>The hostname for the shard, if
blank the hostname is automatically detected</td></tr><tr><td>blur.shard.bind.address
(0.0.0.0)</td><td>The binding address of the shard</td></tr><tr><td>blur.shard.bind.port
(40020)</td><td>The default binding port of the shard server</td></tr><tr><td>blur.shard.data.fetch.thread.count
(8)</td><td>The number of fetcher threads</td></tr><tr><td>blur.shard.server.thrift.thread.count
(8)</td><td>The number of the thrift threads</td></tr><tr><td>blur.shard.thrift.selector.threads
(2)</td><td>The number of threads used for selector processing inside the thrift
server.</td></tr><tr><td>blur.shard.thrift.max.read.buffer.bytes (9223372036854775807)</td><td>The
maximum number of bytes used for reading requests in the thrift server.</td></tr><tr><td>blur.shard.thrift.accept.queue.size.per.thread
(4)</td><td>The size of the blocking queue per selector thread for passing accepted
connections to the selector thread.</td></tr><tr
 ><td>blur.shard.opener.thread.count (8)</td><td>The number of threads
that are used for opening indexes</td></tr><tr><td>blur.shard.cache.max.querycache.elements
(128)</td><td>The number of cached queries</td></tr><tr><td>blur.shard.cache.max.timetolive
(60000)</td><td>The time to live for the cache query</td></tr><tr><td>blur.shard.filter.cache.class
(org.apache.blur.manager.DefaultBlurFilterCache)</td><td>Default implementation
of the blur cache filter, which is a pass through filter that does nothing</td></tr><tr><td>blur.shard.index.warmup.class
(org.apache.blur.manager.indexserver.DefaultBlurIndexWarmup)</td><td>Default Blur
index warmup class that warms the fields provided in the table descriptor</td></tr><tr><td>blur.shard.index.warmup.throttle
(30000000)</td><td>Throttles the warmup to 30MB/s across all the warmup threads</td></tr><tr><td>blur.shard.block.cache.version
(v2)</td><td>By default the v2 version of the block cache is enabled</td></tr>
<tr><td>blur.shard.block.cache.total.size</td><td>By default the total amount of memory block cache will use
is -XX:MaxDirectMemorySize - 64 MiB</td></tr><tr><td>blur.shard.blockcache.direct.memory.allocation
(true)</td><td>v1 version of block cache only. By default the block cache using
off heap memory</td></tr><tr><td>blur.shard.blockcache.slab.count
(-1)</td><td>v1 version of block cache only. The slabs in the blockcache are automatically
configured by default (-1) otherwise 1 slab equals 128MB.  The auto config is detected through
the MaxDirectoryMemorySize provided to the JVM</td></tr><tr><td>blur.shard.block.cache.v2.fileBufferSize
(8192)</td><td>v2 version of block cache only. File buffer size, this is the buffer
size used to read and write to data to HDFS.  For production this will likely be increased.</td></tr><tr><td>blur.shard.block.cache.v2.cacheBlockSize
(8192)</td><td>v2 version of block cache only. The is the size of the blocks in
the off heap cache, it is good practice to have this match 'blur.shard.block.cache.v2.fileBufferSize'.
For production this will likely be increased.</td></tr><tr><td>blur.shard.block.cache.v2.cacheBlockSize.filter
(33554432)</td><td>blur.shard.block.cache.v2.cacheBlockSize.<ext>=</td></tr><tr><td>blur.shard.block.cache.v2.store
(OFF_HEAP)</td><td>v2 version of block cache only. This is used to control if
the block are created on or off heap.  Values are OFF_HEAP | ON_HEAP</td></tr><tr><td>blur.shard.block.cache.v2.read.cache.ext</td><td>v2
version of block cache only. This specifies what file types should be cached during reads.
 Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.read.nocache.ext
(fdt)</td><td>v2 version of block cache only. This specifies what file types should
NOT be cached during reads.  Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.read.default
(true)</td><td>v2 version of block cache only. This specifies the default behavior
if a file type is not specified in the cache or nocache lists during
  reads.  Values true | false</td></tr><tr><td>blur.shard.block.cache.v2.write.cache.ext</td><td>v2
version of block cache only. This specifies what file types should be cached during writes.
 Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.write.nocache.ext
(fdt)</td><td>v2 version of block cache only. This specifies what file types should
NOT be cached during writes.  Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.write.default
(true)</td><td>v2 version of block cache only. This specifies the default behavior
if a file type is not specified in the cache or nocache lists during writes.  Values true
| false</td></tr><tr><td>blur.shard.buffercache.1024 (8192)</td><td>The
number of 1K byte buffers</td></tr><tr><td>blur.shard.buffercache.8192
(8192)</td><td>The number of 8K byte buffers</td></tr><tr><td>blur.shard.safemodedelay
(5000)</td><td>The number of milliseconds to wait for the cluster to settle once
changes have ceased</td></tr><tr><td>blur.shard.time.between.commits
(30000)</td><td>The default time between index commits</td></tr><tr><td>blur.shard.time.between.refreshs
(3000)</td><td>The default time between index refreshs</td></tr><tr><td>blur.shard.merge.thread.count
(3)</td><td>The max number of threads used during index merges</td></tr><tr><td>blur.max.clause.count
(1024)</td><td>The maximum number of clauses in a BooleanQuery</td></tr><tr><td>blur.indexmanager.search.thread.count
(8)</td><td>The number of thread used for parallel searching in the index manager</td></tr><tr><td>blur.indexmanager.mutate.thread.count
(8)</td><td>The number of thread used for parallel mutating in the index manager</td></tr><tr><td>blur.shard.internal.search.thread.count
(8)</td><td>The number of threads used for parallel searching in the index searchers</td></tr><tr><td>blur.shard.warmup.thread.count
(8)</td><td>Number of threads used for warming up the index</td></tr><tr><td>blur.shard.fetchcount
(100)</td><td>The fetch count per Lucene search, this fetches pointers to hits</td></tr><tr><td>blur.max.heap.per.row.fetch
(10000000)</td><td>Heap limit on row fetch, once this limit has been reached the
request will return</td></tr><tr><td>blur.max.records.per.row.fetch.request
(1000)</td><td>The maximum number of records in a single row fetch</td></tr><tr><td>blur.gui.shard.port
(40090)</td><td>The http status page port for the shard server</td></tr>
+<tr><td>blur.shard.hostname</td><td>The hostname for the shard, if
blank the hostname is automatically detected</td></tr><tr><td>blur.shard.bind.address
(0.0.0.0)</td><td>The binding address of the shard</td></tr><tr><td>blur.shard.bind.port
(40020)</td><td>The default binding port of the shard server</td></tr><tr><td>blur.shard.data.fetch.thread.count
(8)</td><td>The number of fetcher threads</td></tr><tr><td>blur.shard.server.thrift.thread.count
(8)</td><td>The number of the thrift threads</td></tr><tr><td>blur.shard.thrift.selector.threads
(2)</td><td>The number of threads used for selector processing inside the thrift
server.</td></tr><tr><td>blur.shard.thrift.max.read.buffer.bytes (9223372036854775807)</td><td>The
maximum number of bytes used for reading requests in the thrift server.</td></tr><tr><td>blur.shard.thrift.accept.queue.size.per.thread
(4)</td><td>The size of the blocking queue per selector thread for passing accepted
connections to the selector thread.</td></tr><tr
 ><td>blur.shard.opener.thread.count (8)</td><td>The number of threads
that are used for opening indexes</td></tr><tr><td>blur.shard.cache.max.querycache.elements
(128)</td><td>The number of cached queries</td></tr><tr><td>blur.shard.cache.max.timetolive
(60000)</td><td>The time to live for the cache query</td></tr><tr><td>blur.shard.filter.cache.class
(org.apache.blur.manager.DefaultBlurFilterCache)</td><td>Default implementation
of the blur cache filter, which is a pass through filter that does nothing</td></tr><tr><td>blur.shard.index.warmup.class
(org.apache.blur.manager.indexserver.DefaultBlurIndexWarmup)</td><td>Default Blur
index warmup class that warms the fields provided in the table descriptor</td></tr><tr><td>blur.shard.index.warmup.throttle
(30000000)</td><td>Throttles the warmup to 30MB/s across all the warmup threads</td></tr><tr><td>blur.shard.block.cache.version
(v2)</td><td>By default the v2 version of the block cache is enabled</td></tr>
<tr><td>blur.shard.block.cache.total.size</td><td>By default the total amount of memory block cache will use
is -XX:MaxDirectMemorySize - 64 MiB</td></tr><tr><td>blur.shard.blockcache.direct.memory.allocation
(true)</td><td>v1 version of block cache only. By default the block cache using
off heap memory</td></tr><tr><td>blur.shard.blockcache.slab.count
(-1)</td><td>v1 version of block cache only. The slabs in the blockcache are automatically
configured by default (-1) otherwise 1 slab equals 128MB.  The auto config is detected through
the MaxDirectMemorySize provided to the JVM</td></tr><tr><td>blur.shard.block.cache.v2.fileBufferSize
(8192)</td><td>v2 version of block cache only. File buffer size, this is the buffer
size used to read and write to data to HDFS.  For production this will likely be increased.</td></tr><tr><td>blur.shard.block.cache.v2.cacheBlockSize
(8192)</td><td>v2 version of block cache only. This is the size of the blocks in
the off heap cache; it is good practice to have this match 'blur.shard.block.cache.v2.fileBufferSize'.
For production this will likely be increased.</td></tr><tr><td>blur.shard.block.cache.v2.cacheBlockSize.filter
(33554432)</td><td>blur.shard.block.cache.v2.cacheBlockSize.<ext>=</td></tr><tr><td>blur.shard.block.cache.v2.store
(OFF_HEAP)</td><td>v2 version of block cache only. This is used to control if
the block are created on or off heap.  Values are OFF_HEAP | ON_HEAP</td></tr><tr><td>blur.shard.block.cache.v2.read.cache.ext</td><td>v2
version of block cache only. This specifies what file types should be cached during reads.
 Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.read.nocache.ext
(fdt)</td><td>v2 version of block cache only. This specifies what file types should
NOT be cached during reads.  Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.read.default
(true)</td><td>v2 version of block cache only. This specifies the default behavior
if a file type is not specified in the cache or nocache lists during
  reads.  Values true | false</td></tr><tr><td>blur.shard.block.cache.v2.write.cache.ext</td><td>v2
version of block cache only. This specifies what file types should be cached during writes.
 Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.write.nocache.ext
(fdt)</td><td>v2 version of block cache only. This specifies what file types should
NOT be cached during writes.  Comma delimited list.</td></tr><tr><td>blur.shard.block.cache.v2.write.default
(true)</td><td>v2 version of block cache only. This specifies the default behavior
if a file type is not specified in the cache or nocache lists during writes.  Values true
| false</td></tr><tr><td>blur.shard.buffercache.8192 (67108864)</td><td>The
amount of memory to be used by 8K byte buffers.  Note if you change the "blur.shard.block.cache.v2.cacheBlockSize"
or "blur.shard.block.cache.v2.fileBufferSize" you should adjust the buffer sizes as well as
the total memory allocated.  For example if you increased the "blur.shard.block.cache.v2.fileBufferSize"
to 64K (65536) then this property should change to "blur.shard.buffercache.65536".
 You can also define as many of these properties as needed.</td></tr><tr><td>blur.shard.buffercache.1024
(8388608)</td><td>The amount of memory to be used by 1K byte buffers.  Note if
you change the "blur.shard.block.cache.v2.cacheBlockSize" or "blur.shard.block.cache.v2.fileBufferSize"
you should adjust the buffer sizes as well as the total memory allocated.</td></tr><tr><td>blur.shard.safemodedelay
(5000)</td><td>The number of milliseconds to wait for the cluster to settle once
changes have ceased</td></tr><tr><td>blur.shard.time.between.commits
(30000)</td><td>The default time between index commits</td></tr><tr><td>blur.shard.time.between.refreshs
(3000)</td><td>The default time between index refreshs</td></tr><tr><td>blur.shard.merge.thread.count
(3)</td><td>The max number of threads used during index merges</td></tr><tr><td>blur.max.clause.count
(1024)</td><td>The maximum
  number of clauses in a BooleanQuery</td></tr><tr><td>blur.indexmanager.search.thread.count
(8)</td><td>The number of thread used for parallel searching in the index manager</td></tr><tr><td>blur.indexmanager.mutate.thread.count
(8)</td><td>The number of thread used for parallel mutating in the index manager</td></tr><tr><td>blur.shard.internal.search.thread.count
(8)</td><td>The number of threads used for parallel searching in the index searchers</td></tr><tr><td>blur.shard.warmup.thread.count
(8)</td><td>Number of threads used for warming up the index</td></tr><tr><td>blur.shard.fetchcount
(100)</td><td>The fetch count per Lucene search, this fetches pointers to hits</td></tr><tr><td>blur.max.heap.per.row.fetch
(10000000)</td><td>Heap limit on row fetch, once this limit has been reached the
request will return</td></tr><tr><td>blur.max.records.per.row.fetch.request
(1000)</td><td>The maximum number of records in a single row fetch</td></tr><tr><td>blur.gui.shard.port
(40090)</td><td>The http status page port for the shard server</td></tr><tr><td>blur.shard.filtered.server.class</td><td>To
intercept the calls made to the shard server and perform server side changes to the calls
extend org.apache.blur.server.FilteredBlurServer.</td></tr>
 			</table>
 
             <h3 id="shard-blur-env">blur-env.sh</h3>
@@ -212,10 +217,58 @@ export BLUR_SHARD_SLEEP=0.1
 # The of shard servers to spawn per machine.
 export BLUR_NUMBER_OF_SHARD_SERVER_INSTANCES_PER_MACHINE=1</code></pre>
 
-            <h3 id="block-cache">Block Cache Configuration</h3>
-            <h4>Why</h4>
-            <p>HDFS is a great filesystem for streaming large amounts data across large
scale clusters. However the random access latency is typically the same performance you would
get in reading from a local drive if the data you are trying to access is not in the operating
systems file cache. In other words every access to HDFS is similar to a local read with a
cache miss. There have been great performance boosts in HDFS over the past few years but it
still can't perform at the level that a search engine needs.</p>
-            <p>Now you might be thinking that Lucene reads from the local hard drive
and performs great, so why wouldn't HDFS perform fairly well on it's own? However most of
time the Lucene index files are cached by the operating system's file system cache. So Blur
has it's own file system cache allows it to perform low latency data look-ups against HDFS.</p>
+<h3 id="block-cache">Block Cache</h3>
+<h4>Why</h4>
+<p>HDFS is a great filesystem for streaming large amounts of data across large scale clusters.
However, its random access latency is typically what you would get when reading from a local
drive when the data you are trying to access is not in the operating system's file cache. In
other words, every access to HDFS is similar to a local read with a cache miss. There have been
great performance boosts in HDFS over the past few years, but it still can't perform at the
level that a search engine needs.</p>
+<p>Now you might be thinking that Lucene reads from the local hard drive and performs
great, so why wouldn't HDFS perform fairly well on its own? However, most of the time the Lucene
index files are cached by the operating system's file system cache. So Blur has its own file
system cache, which allows it to perform low latency data look-ups against HDFS.</p>
+
+<h3 id="block-cache-v2">V2 Block Cache Configuration</h3>
+<h4>How</h4>
+<p>The Google <a href="http://code.google.com/p/concurrentlinkedhashmap/">concurrentlinkedhashmap</a>
library is at the center of the block cache in the shard servers.  In version 2, which is
enabled by default, the slab allocation is no longer used.  <a href="http://mail-archives.apache.org/mod_mbox/incubator-blur-dev/201310.mbox/%3CCAB6tTr0Nr2aDLc4kkHoeqiO-utwzBAhb=Ru==GMhQry4aXPjug@mail.gmail.com%3E">Here</a>
is a discussion of the motivations behind the rewrite.</p>
+
+<p>Below are the properties related to V2 of the block cache.</p>
+
+<table class="table-bordered table-striped table-condensed">
+<tr><td nowrap="1">blur.shard.block.cache.total.size</td><td>
+<p>This is used to limit the size of the off heap cache.  By default the cache is
64MB less than the -XX:MaxDirectMemorySize,
+so if you want the block cache to use less than that amount then set this value.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.fileBufferSize</td><td>
+<p>This is the size of the buffer used when accessing HDFS; by default it is set to 8K.
 However, in most systems this should probably be increased to something closer to 64K.  Use
the &quot;fstune&quot; command in the shell to help figure out what the best buffer
size should be in your system.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.cacheBlockSize</td><td>
+<p>This is the size of the cache entry for any file that is NOT explicitly defined.
 Most of the time you are going to want this value to equal the &quot;blur.shard.block.cache.v2.fileBufferSize&quot;
value.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.cacheBlockSize.&lt;ext&gt;</td><td>
+<p>This is the size of the cache entry for any file that has the given extension.
By default &quot;filter&quot; is the only file type that has a non-default cache block
size; its current value is 32MB.  This means that unless a file is larger than 32MB in size,
it will be stored as a single value in the cache.  For cached filters this is required for
performance during the traversal of the logical bitset stored in the file.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.store</td><td>
+<p>This property defines how the cache will be stored; by default it's off heap.  This
means that it is not accounted for in the used heap section that you can find in jconsole
or visualvm.  However you can track its size through the &quot;top&quot; command
in the shell, MBeans in jconsole, or the metrics call via the Blur thrift API.<br/><br/>Unless
you are using a specialized JVM or are debugging a problem this should remain off heap; however,
if you would like to use the cache as on heap allocated blocks change this value to ON_HEAP.</p></td></tr>
+
+blur.shard.block.cache.v2.write.cache.ext=
+blur.shard.block.cache.v2.write.nocache.ext=fdt
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.read.default</td><td>
+<p>This property defines the default action to cache or not to cache the data during
a read operation.  By default this is true.  This will be the action taken if the file extension
is not found in either the &quot;blur.shard.block.cache.v2.read.cache.ext&quot; property
or the &quot;blur.shard.block.cache.v2.read.nocache.ext&quot; property.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.read.cache.ext</td><td>
+<p>This property defines a comma separated list of file extensions that are to be cached
during read operations.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.read.nocache.ext</td><td>
+<p>This property defines a comma separated list of file extensions that are NOT to
be cached during read operations.  If the file extension is also in the &quot;blur.shard.block.cache.v2.read.cache.ext&quot;
property, its entry in this list has no effect.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.write.default</td><td>
+<p>This property defines the default action to cache or not to cache the data during
a write operation.  By default this is true. This will be the action taken if the file extension
is not found in either the &quot;blur.shard.block.cache.v2.write.cache.ext&quot; property
or the &quot;blur.shard.block.cache.v2.write.nocache.ext&quot; property.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.write.cache.ext</td><td>
+<p>This property defines a comma separated list of file extensions that are to be cached
during write operations.</p></td></tr>
+
+<tr><td nowrap="1">blur.shard.block.cache.v2.write.nocache.ext</td><td>
+<p>This property defines a comma separated list of file extensions that are NOT to
be cached during write operations.  If the file extension is also in the &quot;blur.shard.block.cache.v2.write.cache.ext&quot;
property, its entry in this list has no effect.</p></td></tr>
+
+</table>
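
Reviewer note: the read and write rules in the table above amount to a three-step lookup: the cache list wins, then the nocache list, then the default. The following is a minimal Python sketch of that reading, not Blur's implementation (Blur itself is Java). The function names `parse_ext_list` and `should_cache` are hypothetical, and the way the extension is taken from the file name is an assumption.

```python
def parse_ext_list(value):
    """Split a comma-delimited property value into a set of extensions."""
    return {ext.strip() for ext in value.split(",") if ext.strip()}


def should_cache(file_name, cache_ext, nocache_ext, default):
    """Decide whether a file's blocks are cached (hypothetical helper).

    cache_ext   -- parsed value of ...v2.read.cache.ext (or write.cache.ext)
    nocache_ext -- parsed value of ...v2.read.nocache.ext (or write.nocache.ext)
    default     -- value of ...v2.read.default (or write.default)
    """
    # Assumption: the extension is whatever follows the last dot.
    ext = file_name.rsplit(".", 1)[-1] if "." in file_name else ""
    if ext in cache_ext:
        # The cache list takes precedence over the nocache list.
        return True
    if ext in nocache_ext:
        return False
    # Extension is in neither list, so fall back to the default action.
    return default


# Defaults from the property table: empty cache list, nocache list of "fdt",
# default action true.
cache = parse_ext_list("")
nocache = parse_ext_list("fdt")
stored_fields_cached = should_cache("_0.fdt", cache, nocache, True)
other_file_cached = should_cache("_0.tim", cache, nocache, True)
```

With the documented defaults, only files ending in `fdt` are skipped and every other file type is cached on both reads and writes.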
+
+            <h3 id="block-cache-v1">V1 Block Cache Configuration</h3>
             <h4>How</h4>
             <p>On shard server start-up Blur creates 1 or more block cache slabs (blur.shard.blockcache.slab.count)
that are each 128 MB in size. These slabs can be allocated on or off the heap (blur.shard.blockcache.direct.memory.allocation).
Each slab is broken up into 16,384 blocks with each block size being 8K. Then on the heap
there is a concurrent LRU cache that tracks what blocks of what files are in which slab(s)
at what offset. So the more slabs of cache you create, the more entries there will be in the
LRU and thus the more heap is used.</p>
             <h4>Configuration</h4>
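
Reviewer note: the sizes quoted in this change can be checked with simple arithmetic. The sketch below is Python used purely as a calculator. It assumes binary units (1 MB = 1024 * 1024 bytes, 8K = 8192 bytes) and assumes that a buffer pool's buffer count is its byte budget divided by the buffer size. The function names are hypothetical and are not part of Blur.

```python
MIB = 1024 * 1024


def v1_blocks_per_slab(slab_bytes=128 * MIB, block_bytes=8 * 1024):
    """Number of blocks in one V1 slab (128 MB slab, 8K blocks)."""
    return slab_bytes // block_bytes


def v2_default_total_size(max_direct_memory_bytes, reserve_bytes=64 * MIB):
    """Default V2 cache size: -XX:MaxDirectMemorySize minus 64 MiB."""
    return max_direct_memory_bytes - reserve_bytes


def buffer_count(pool_bytes, buffer_bytes):
    """Assumed relation: buffers in a pool = pool bytes // buffer size."""
    return pool_bytes // buffer_bytes


blocks = v1_blocks_per_slab()                        # matches the 16,384 in the text
v2_size_mib = v2_default_total_size(1024 * MIB) // MIB  # for a 1 GiB direct memory limit
buffers_8k = buffer_count(67108864, 8192)            # blur.shard.buffercache.8192 default
buffers_1k = buffer_count(8388608, 1024)             # blur.shard.buffercache.1024 default
```

Under these assumptions the new byte-valued buffercache defaults work out to the same 8192 buffers per pool that the removed rows listed as counts.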

