accumulo-dev mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: Concern regarding cache (ACCUMULO-3549 and ACCUMULO-3547) in Apache Accumulo 1.6.2 RC3
Date Sat, 31 Jan 2015 20:41:34 GMT
We're working on ACCUMULO-3549, and a pretty conservative fix will be
committed Monday.


On Sat, Jan 31, 2015 at 12:48 PM, Josh Elser <josh.elser@gmail.com> wrote:

> That's a good point, Ed, and there hasn't been any other discussion (on
> the mailing lists) so you did the right thing bringing this up here.
>
> There is no user administration or monitoring support that would allow
> user intervention (aside from restarting a tserver, which is a no-go). If
> we're going to include it, as appears to be the case, we need to both make
> sure that the cache is bounded in size and have as many people as possible
> look at it (since it's such a late addition to the release -- it's common
> for us to only notice subtleties weeks to months after a change is made
> during normal development cycles).
>
>
> Ed Coleman wrote:
>
>> Eric commented on the vote for RC3:
>>
>> - - - -
>> It would be nice to have ACCUMULO-3547
>> <https://issues.apache.org/jira/browse/ACCUMULO-3549> in 1.6.2.
>>
>> We are running at scale with it at the moment, and it has made a huge
>> improvement.  I hate to hold up 1.6.2, though.  If it doesn't make it,
>> please update the ticket to point to 1.6.3.
>> - - - -
>>
>> I generally agree with this and it seems that ACCUMULO-3547 will make it
>> into 1.6.2 - which I think is the preferable option. My concerns deal with
>> not having ACCUMULO-3549 included in 1.6.2 too.
>>
>> In ACCUMULO-3549 Keith made the assumption that end rows are 10 bytes -
>> I'm not sure this is a good assumption. If end rows are larger than 10
>> bytes, then how much more memory will be required over time? How much
>> faster will it grow?
>>
>> Without ACCUMULO-3549, what are my options for monitoring / correcting
>> the situation if the cache grows too large? Will tablet server performance
>> slowly degrade over time because the cache keeps growing?  What will users
>> need to do to monitor and then correct this? Will we be in a situation
>> where tservers start to run out of memory, we increase the memory
>> allocation if we can, and we just kick the can down the road a little
>> further while performance keeps degrading?
>>
>> Is there a way to trigger the cache to clear short of restarting a
>> tserver? While not optimal, having a utility / script that slowly walks
>> across the tservers and clears the cache so that each tserver cache is
>> cleared every 12, 24, 48,... hours may be a bridge until ACCUMULO-3549 is
>> resolved. If this is the case, it would seem that having the fix in 1.6.3
>> would also be a priority.
>>
>> Maybe this has been discussed and resolved, but I want to bring this up
>> to ensure that the ramifications have been considered and that there is a
>> viable mitigation strategy that is communicated to the users. Sorry for the
>> doom / end-of-the-world tone; I was just trying to emphasize the worst-case
>> scenarios that I could envision. I think ACCUMULO-3547 is an important
>> (even necessary) improvement and I'm not suggesting that it be removed - I
>> just want to make sure that I understand the other side effects and know
>> our options.
>>
>> Ed Coleman
>>
>>
>>
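
For readers following the thread, the requirement raised above that the cache be "bounded in size" can be illustrated with a minimal sketch. This is not the ACCUMULO-3549 fix and does not reflect Accumulo's actual cache code; the class name `BoundedCache` and the `maxEntries` parameter are hypothetical, and the sketch assumes a simple least-recently-used eviction policy built on the standard `java.util.LinkedHashMap`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical size-bounded LRU cache (illustration only, not Accumulo code).
 * Once more than maxEntries entries are present, the least recently
 * accessed entry is evicted, so the entry count cannot grow without limit.
 */
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Assumed tuning knob: the maximum number of cached entries.
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, which is what LRU eviction needs.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // removes the least recently accessed entry.
        return size() > maxEntries;
    }
}
```

Note that this bounds the number of entries, not the number of bytes, so memory use still depends on key size (for example, end rows longer than the assumed 10 bytes). `LinkedHashMap` is also not thread-safe; a shared cache would need external synchronization such as `Collections.synchronizedMap`.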
