accumulo-dev mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Concern regarding cache (ACCUMULO-3549 and ACCUMULO-3547) in Apache Accumulo 1.6.2 RC3
Date Sat, 31 Jan 2015 17:48:10 GMT
That's a good point, Ed, and there hasn't been any other discussion (on 
the mailing lists) so you did the right thing bringing this up here.

There is no administration or monitoring support that would allow user 
intervention (aside from restarting a tserver, which is a no-go). If 
we're going to include it, as appears to be the case, we need to make 
sure both that the cache is bounded in size and that as many people as 
possible look at it (since it's such a late addition to the release -- 
it's common for us to notice subtleties only weeks to months after a 
change is made during normal development cycles).
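
To illustrate what "bounded in size" could mean here, below is a minimal 
sketch of an entry-count-bounded LRU cache built on java.util.LinkedHashMap. 
It is not the ACCUMULO-3549 patch; the class name BoundedCacheSketch, the 
keys, and the limit of 2 are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCacheSketch {

    /** Hypothetical LRU cache that never holds more than maxEntries entries. */
    static class BoundedCache<K, V> extends LinkedHashMap<K, V> {
        private static final long serialVersionUID = 1L;
        private final int maxEntries;

        BoundedCache(int maxEntries) {
            // accessOrder = true: iteration order is least- to most-recently used
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after each insert; returning true evicts the LRU entry
            return size() > maxEntries;
        }
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("row_a", "tablet1");
        cache.put("row_b", "tablet2");
        cache.get("row_a");            // row_a becomes most recently used
        cache.put("row_c", "tablet3"); // over the limit: evicts row_b
        System.out.println(cache.keySet()); // prints [row_a, row_c]
    }
}
```

This bounds the number of entries, not the bytes held, so it would not by 
itself address the concern about end rows larger than 10 bytes; it is also 
not thread-safe without external synchronization.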

Ed Coleman wrote:
> Eric commented on the vote for RC3:
>
> - - - -
> It would be nice to have ACCUMULO-3547<https://issues.apache.org/jira/browse/ACCUMULO-3549> in 1.6.2.
>
> We are running at scale with it at the moment, and it has made a huge
> improvement.  I hate to hold up 1.6.2, though.  If it doesn't make it,
> please update the ticket to point to 1.6.3.
> - - - -
>
> I generally agree with this and it seems that ACCUMULO-3547 will make it
> into 1.6.2 - which I think is the preferable option. My concerns deal with
> not having ACCUMULO-3549 included in 1.6.2 too.
>
> In ACCUMULO-3549 Keith made the assumption that end rows are 10 bytes -
> I'm not sure this is a good assumption. If end rows are larger than 10
> bytes, then how much more memory will be required over time? How much
> faster will it grow?
>
> Without ACCUMULO-3549, what are my options for monitoring / correcting the
> situation if the cache grows too large? Will tablet server performance
> slowly degrade over time because the cache keeps growing?  What will users
> need to do to monitor and then correct this? Will we be in a situation
> where tservers will start to run out of memory, we will increase the memory
> allocation if we can, and just kick the can down the road a little further
> and performance will just keep degrading?
>
> Is there a way to trigger the cache to clear short of restarting a
> tserver? While not optimal, having a utility / script that slowly walks
> across the tservers and clears the cache so that each tserver cache is
> cleared every 12, 24, 48,... hours may be a bridge until ACCUMULO-3549 is
> resolved. If this is the case, it would seem that having the fix in 1.6.3
> would also be a priority.
>
> Maybe this has been discussed and resolved, but I want to bring this up to
> ensure that the ramifications have been considered and that there is a
> viable mitigation strategy that is communicated to the users. Sorry for
> the doom / end-of-the-world tone; I was just trying to emphasize the worst
> case scenarios that I could envision. I think ACCUMULO-3547 is an
> important (even necessary) improvement and I'm not suggesting that it be
> removed - I just want to make sure that I understand the other side
> effects and know our options.
>
> Ed Coleman
>
>
