accumulo-dev mailing list archives

From ke...@deenlo.com
Subject Re: Review Request 24855: ACCUMULO-1454 design doc
Date Tue, 19 Aug 2014 19:41:22 GMT


> On Aug. 19, 2014, 6:31 p.m., Josh Elser wrote:
> > One big design concern I have is what gains the final solution would actually have
over what is currently possible with Accumulo as it stands.
> > 
> > Right now, you can force tablets to migrate by stopping a tserver. This goes back
through the balancer, so you have a bit of churn in however many "rounds" the Balancer takes
to choose where those tablets should go, and then for the master to process the necessary
assignments for each tserver. How I'm seeing it described is that the only piece of the puzzle
that we're making better is removing the migration components in favor of letting the user
control this directly. How far does a "smart" Balancer implementation go toward closing the
performance gap with user-provided migrations? Also, how does removing the Balancer
from the equation change the wall time to get a tablet assigned (is it significant)?
> > 
> > We have to also understand that while we can decompose the problem into some simple
primitives, I believe this approach is still a rather difficult distributed state problem
that I'm worried is being over-architected. My $0.02.
> 
> Josh Elser wrote:
>     For context, I was reading about HBase's support on the subject and found http://hbase.apache.org/book/node.management.html.
Their general approach is to provide a graceful shutdown for regionservers. This still
runs into problems when a large number of servers are stopped at one time. To alleviate some
of this pain, they use ZK to record which servers are currently in a "draining" state and avoid
new assignments to those nodes -- "[...] decommissioning multiple nodes may be non-optimal
because regions that are being drained from one region server may be moved to other regionservers
that are also draining. Marking RegionServers to be in the draining state prevents this from
happening".
> 
> kturner wrote:
>     An alternative to this design is the one Mike mentioned on the issue: temporarily
replace the balancer.  I am thinking that providing these primitives for manipulating tablets
will allow an administrator to quickly script a one-off solution to a problem, in addition
to solving the rolling restart problem.  You do not get this quick flexibility with writing
a new balancer.
>     
>     Killing tablet servers is a solution.  I think it would be nice to have a solution
that avoids log recovery, minimizes down time of individual tablets, preserves locality, and
is easy to use.  It does not have to be this solution.  W/o additional scripts, the primary
use case in 1454 would not be easy to use.   A balancer alone would not be enough to achieve
the goal of migrating tablets between old and new tservers on the same node.  However, a balancer
+ tserver states like you mentioned from HBase may provide enough.  Should probably try to
explore the balancer option a bit more.

One other thing I was thinking about is that we cannot make assumptions about the environment.
Users may not use the Accumulo scripts to start and stop tservers.
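
To make the "balancer + tserver states" option discussed above a bit more concrete, here is a
minimal sketch of how a balancer could skip draining servers when choosing migration targets.
Everything in it is an assumption for illustration: plan_migrations, the server and tablet
names, and the least-loaded selection rule are hypothetical and are not Accumulo or HBase APIs.

```python
def plan_migrations(servers, assignments, draining):
    """Plan moves of tablets off draining servers (hypothetical sketch).

    servers:     list of all live tablet server names
    assignments: dict mapping tablet name -> server currently hosting it
    draining:    set of server names marked as draining
    Returns a list of (tablet, source, target) tuples.
    """
    # Draining servers are never picked as targets, so tablets are not
    # moved from one draining server onto another draining server.
    targets = [s for s in servers if s not in draining]
    if not targets:
        raise ValueError("no non-draining tablet servers available")

    # Count the tablets already hosted on each candidate target.
    load = {s: 0 for s in targets}
    for server in assignments.values():
        if server in load:
            load[server] += 1

    migrations = []
    for tablet in sorted(assignments):
        source = assignments[tablet]
        if source not in draining:
            continue  # tablet is already on a server that is staying up
        # Pick the least-loaded target; ties are broken by server name.
        target = min(targets, key=lambda s: (load[s], s))
        load[target] += 1
        migrations.append((tablet, source, target))
    return migrations


if __name__ == "__main__":
    plan = plan_migrations(
        servers=["ts1", "ts2", "ts3", "ts4"],
        assignments={"t1": "ts1", "t2": "ts1", "t3": "ts2", "t4": "ts3"},
        draining={"ts1", "ts2"},
    )
    for tablet, source, target in plan:
        print(tablet, source, "->", target)
```

Note that the sketch only needs the set of draining servers as input, however that set gets
populated, so it does not depend on which scripts are used to start and stop tservers.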


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24855/#review51006
-----------------------------------------------------------


On Aug. 19, 2014, 5:50 p.m., kturner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24855/
> -----------------------------------------------------------
> 
> (Updated Aug. 19, 2014, 5:50 p.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-1454
>     https://issues.apache.org/jira/browse/ACCUMULO-1454
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> Posting ACCUMULO-1454 design doc for review
> 
> 
> Diffs
> -----
> 
>   docs/src/main/asciidoc/design/ACCUMULO-1454-proposal-01.adoc PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24855/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> kturner
> 
>

