accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: 0% Data Cache Hit Rate
Date Tue, 12 Mar 2013 18:03:36 GMT
On Tue, Mar 12, 2013 at 11:46 AM, Slater, David M.
<David.Slater@jhuapl.edu> wrote:
> Thanks Keith,
>
> I checked, and the values were all default. (cache block disabled)
>
> Turning them on, however, turned the data cache hit rate down to single digits for all
> of the data nodes. I'm guessing that the queries I am running, since they need to go through
> so much data, cannot be cached well, and that the high percentages I was getting before were
> due to use of the metadata table's data cache (since that is enabled by default).

That sounds correct.  Did you up the cache size?
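
To illustrate why a full scan caches poorly, here is a minimal sketch. It assumes a plain LRU eviction policy, which simplifies whatever the tserver block cache actually does. The block counts, the cache capacity, and the helper name lru_hit_rate are made up for illustration and are not Accumulo defaults or APIs.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Return the fraction of accesses served from an LRU cache of `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


# Hypothetical workload 1: three full passes over 1000 blocks, cache holds 100.
full_scan = list(range(1000)) * 3
# Hypothetical workload 2: repeated reads of a 50-block working set that fits in the cache.
hot_set = list(range(50)) * 60

scan_rate = lru_hit_rate(full_scan, capacity=100)
hot_rate = lru_hit_rate(hot_set, capacity=100)
print(f"full scan: {scan_rate:.0%}, hot set: {hot_rate:.0%}")
```

In this toy model the sequential scan evicts every block before it is read again, so a larger cache only helps once it approaches the size of the data being scanned. The small working set is served almost entirely from cache.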

>
> Since data caching is disabled by default, I assume that there are downsides to using
> it. Is this primarily memory footprint?

Yeah, primarily memory.  Being able to enable or disable it per table
lets you decide which tables get to use that memory.

>
> Regards,
> David
>
> -----Original Message-----
> From: Keith Turner [mailto:keith@deenlo.com]
> Sent: Monday, March 11, 2013 3:02 PM
> To: user@accumulo.apache.org
> Subject: Re: 0% Data Cache Hit Rate
>
> You may need to set the following property to true for the table.
> This enables caching data for a table.  It defaults to false.
>
> table.cache.block.enable
>
> Also take a look at the following props.  These determine how much memory a tserver uses
> for caching.
>
> tserver.cache.data.size
> tserver.cache.index.size
>
> The following prop enables caching rfile indexes for a table.  It defaults to true.
>
> table.cache.index.enable
>
> Keith
>
>
> On Mon, Mar 11, 2013 at 2:48 PM, Slater, David M.
> <David.Slater@jhuapl.edu> wrote:
>> Hi,
>>
>>
>>
>> I have a four-node setup, and I'm running some intensive query
>> operations that need to go through all of the rows (though only one or
>> two column families). While I don't expect this to be fast by any
>> means, I wanted to make sure that I had a decent baseline before
>> comparing this to more indexed versions of querying. Here is the
>> problem: Two of my nodes have very low data cache hit rates, and I
>> assume that this would greatly impact the query efficiency. Is this correct?
>>
>>
>>
>> All four of my nodes have a 99% index cache hit rate, but the data
>> cache hit rates are:
>>
>> Node 1: 96%
>>
>> Node 2: 95%
>>
>> Node 3: 67%
>>
>> Node 4: 0%
>>
>> (All four are data nodes; the name node is #1)
>>
>>
>>
>> I'm not seeing any warnings or errors in the logs, and I couldn't find
>> much online about it, so I thought I would check here. Does anyone
>> have a suggestion as to how to fix it? Could this be related to the
>> system swappiness at all? (I currently have swappiness set to 0.)
>>
>>
>>
>> Thanks for the help,
>> David Slater
