hbase-dev mailing list archives

From: Stack <st...@duboce.net>
Subject: Re: Assigning regions after restart
Date: Wed, 15 Mar 2017 05:14:01 GMT
File a blocker please, Lars. I'm pretty sure the boolean on whether we are
doing a recovery or not has been there a long time, so yeah, a single-server
recovery could throw us off. But you make a practical point: one server
should not destroy locality across the whole cluster.

St.Ack

On Tue, Mar 14, 2017 at 7:09 AM, Lars George <lars.george@gmail.com> wrote:

> Wait, HBASE-15251 is not enough methinks. The checks added help, but
> they do not cover all the possible edge cases. In particular, if a
> node really does fail, why not just reassign the few regions it held
> and leave all the others where they are? Seems insane as it is.
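>
> Just to make it concrete, here is a minimal sketch of what I would
> expect (assuming the usual LoadBalancer retainAssignment() /
> roundRobinAssignment() split; "previousAssignments", "deadServer" and
> "onlineServers" are made-up names for illustration, and the imports
> from org.apache.hadoop.hbase.* and java.util are omitted):
>
>   // Sketch only: reassign just the dead server's regions, keep the rest.
>   void assignAfterSingleServerFailure(LoadBalancer balancer,
>       Map<HRegionInfo, ServerName> previousAssignments,  // layout before the crash
>       ServerName deadServer, List<ServerName> onlineServers)
>       throws HBaseIOException {
>     List<HRegionInfo> orphaned = new ArrayList<HRegionInfo>();
>     Map<HRegionInfo, ServerName> keep =
>         new HashMap<HRegionInfo, ServerName>(previousAssignments);
>     for (Map.Entry<HRegionInfo, ServerName> e : previousAssignments.entrySet()) {
>       if (deadServer.equals(e.getValue())) {
>         orphaned.add(e.getKey());   // only these regions lost their server
>         keep.remove(e.getKey());
>       }
>     }
>     // Spread only the orphaned regions across the surviving servers...
>     Map<ServerName, List<HRegionInfo>> newPlan =
>         balancer.roundRobinAssignment(orphaned, onlineServers);
>     // ...and put everything else back where it was, preserving locality.
>     Map<ServerName, List<HRegionInfo>> retainedPlan =
>         balancer.retainAssignment(keep, onlineServers);
>     // (Actually applying newPlan/retainedPlan via the AssignmentManager
>     // is left out of this sketch.)
>   }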
>
> On Tue, Mar 14, 2017 at 2:24 PM, Lars George <lars.george@gmail.com> wrote:
> > Doh, https://issues.apache.org/jira/browse/HBASE-15251 addresses this
> > (though I am not sure exactly how, see below). This should be
> > backported to all 1.x branches!
> >
> > As for the patch, I see this
> >
> >      if (!failover) {
> >        // Fresh cluster startup.
> > -      LOG.info("Clean cluster startup. Assigning user regions");
> > +      LOG.info("Clean cluster startup. Don't reassign user regions");
> >        assignAllUserRegions(allRegions);
> > +    } else {
> > +      LOG.info("Failover! Reassign user regions");
> >      }
> >
> > It would be interesting to see how that log message alone actually
> > reassigns the user regions. I would have assumed that
> > "assignAllUserRegions()" would have to be called there as well.
> >
> >
> > On Tue, Mar 14, 2017 at 2:21 PM, Lars George <lars.george@gmail.com> wrote:
> >> Looking at the code more... it seems the issue is here
> >>
> >> In AssignmentManager.processDeadServersAndRegionsInTransition():
> >>
> >> ...
> >> failoverCleanupDone();
> >> if (!failover) {
> >>   // Fresh cluster startup.
> >>   LOG.info("Clean cluster startup. Assigning user regions");
> >>   assignAllUserRegions(allRegions);
> >> }
> >> ...
> >>
> >> As soon as a single node has failed, failover is set to true, so no
> >> assignment is done here. Later on the servers are recovered from the
> >> "crash" (which a clean restart counts as, too), and then all unassigned
> >> regions (which is all of them at this point, since they were unassigned
> >> earlier) are assigned round-robin.
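> >>
> >> In other words the two paths end up looking roughly like this (heavily
> >> simplified sketch, not the actual handler code):
> >>
> >>   if (!failover) {
> >>     // Clean startup: regions are assigned with their old locations
> >>     // taken into account, so locality survives.
> >>     assignAllUserRegions(allRegions);
> >>   } else {
> >>     // "Failover": nothing is assigned here. The dead-server recovery
> >>     // later treats *all* regions as unassigned and round-robins them
> >>     // across the cluster, ignoring where they used to live.
> >>   }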
> >>
> >>
> >>
> >> On Tue, Mar 14, 2017 at 1:54 PM, Lars George <lars.george@gmail.com> wrote:
> >>> Hi,
> >>>
> >>> I have had this happen on multiple clusters recently, where after the
> >>> restart the locality dropped from close to (or exactly) 100% down to
> >>> single digits. The reason is that all regions were completely shuffled
> >>> and reassigned to random servers. Upon reading the (yet again
> >>> non-trivial) assignment code, I found that a single missing server
> >>> will trigger a full "recovery" of all servers, which includes dropping
> >>> all previous assignments and assigning everything anew at random.
> >>>
> >>> This is just terrible! In addition, I had also assumed that - at least
> >>> the StochasticLoadBalancer - checks which node holds most of a region's
> >>> data, locality-wise, and picks that server. But that is _not_ the case!
> >>> It just spreads everything seemingly at random across the servers.
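> >>>
> >>> What I would have expected is something along these lines (just a
> >>> sketch; "localityIndex()" is a made-up helper standing in for whatever
> >>> share of a region's HDFS blocks a given server holds locally):
> >>>
> >>>   // Sketch: pick the server holding the largest share of the region's data.
> >>>   ServerName pickMostLocalServer(HRegionInfo region, List<ServerName> servers) {
> >>>     ServerName best = null;
> >>>     float bestLocality = -1f;
> >>>     for (ServerName candidate : servers) {
> >>>       float locality = localityIndex(region, candidate);  // 0.0 .. 1.0, hypothetical
> >>>       if (locality > bestLocality) {
> >>>         bestLocality = locality;
> >>>         best = candidate;
> >>>       }
> >>>     }
> >>>     return best;
> >>>   }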
> >>>
> >>> To me this is a big regression (or an outright bug), given that a single
> >>> server out of, for example, hundreds can trigger this and destroy the
> >>> locality completely. Running a major compaction to rebuild locality is
> >>> not a workable fix, for many reasons.
> >>>
> >>> This used to work better - why the regression?
> >>>
> >>> Lars
>
