hbase-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: DNS mismatch between master and regionserver causes doubly registered regionservers
Date Fri, 22 May 2015 19:23:12 GMT
Thank you Esteban.  I checked two different versions:

- hbase-1.0.0-cdh5.4.0 (this is the version I use)
- hbase-1.1.0.1 (just wanted to check the latest release)

On Fri, May 22, 2015 at 3:13 PM, Esteban Gutierrez <esteban@cloudera.com>
wrote:

> Hi Bryan,
>
> Could you please be more specific about which 1.x version you are using?
> We have HBASE-13481 and HBASE-12954, so the answer depends on which 1.x
> release you are running.
>
> Regarding your account issue, I have created an INFRA JIRA on your behalf
> to look into your account problem.
>
> thanks,
> esteban.
>
>
>
> --
> Cloudera, Inc.
>
>
> On Fri, May 22, 2015 at 10:17 AM, Bryan Beaudreault <
> bbeaudreault@hubspot.com> wrote:
>
> > In our system each server has 2 DNS names associated with it: one always
> > points to a private address, and the other points to a public or private
> > address depending on the context.
> >
> > This issue did not show up in 0.94.x, but is showing up on my new 1.x
> > cluster.  Basically it goes like this:
> >
> > 1. Regionserver starts up and gets its hostname, which returns
> > `hostA.external` due to our /etc/hosts
> > 2. Regionserver registers itself in zookeeper as `hostA.external`
> > 3. Regionserver reports for duty to the HMaster, which re-resolves the DNS
> > and returns `hostA.internal`.
> > 4. HMaster registers server as `hostA.internal`
> > 5. Regionserver receives the RegionServerStartupResponse, which contains
> > `hostA.internal` and uses that for its RPCs
> > 6. HMaster sees a ZNode with `hostA.external`, so thinks it is a
> > regionserver that hasn't checked in yet, and registers it.
> >
> > So I think the problem is that step #2 happens before step #5.  You can
> > clearly see this in the HRegionServer.java run() function.
> >
> > In 0.94, the `createMyEphemeralNode` function was called within
> > `handleReportForDutyResponse`.  In 1.x, it happens within `run()` BEFORE
> > `handleReportForDutyResponse`.
> >
> >
> > I can work around this by handling /etc/hosts specially for my
> > regionservers.  We have our /etc/hosts file set up like this for a
> > reason, but I think I can special case regionservers.
> >
> > However, it seems like a bug that there are mechanisms built in for the
> > HMaster to determine the RegionServer hostname, but that these mechanisms
> > do not account for doubly-registered regionservers caused by a hostname
> > mismatch between ZooKeeper and the HMaster.
> >
> > I tried to create a JIRA for this, but either my username no longer has
> > permissions for creating, or I can't find the place to create them
> > anymore.  Any help?
> > https://issues.apache.org/jira/secure/ViewProfile.jspa?name=bbeaudreault
> >
>
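
The six-step sequence in the quoted report can be modeled with a small sketch. This is a toy Python model, not HBase code: `ToyCluster`, `start_regionserver`, `new_cluster`, the `znode_before_report` flag, and the address `10.0.0.5` are all hypothetical names introduced for illustration. It only contrasts the two orderings described in the thread: creating the ZooKeeper node before the report-for-duty response arrives (the 1.x ordering) versus after it (the 0.94 ordering).

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy stand-in for ZooKeeper plus the HMaster's server list (not HBase code)."""
    reverse: dict                                      # IP -> name, as the master resolves it
    znodes: set = field(default_factory=set)           # ephemeral nodes in ZooKeeper
    master_servers: set = field(default_factory=set)   # servers the master tracks

    def report_for_duty(self, ip):
        """Steps 3-5: the master re-resolves the caller and registers that name."""
        name = self.reverse[ip]
        self.master_servers.add(name)
        return name

    def master_scan_znodes(self):
        """Step 6: any znode the master does not know yet is registered too."""
        self.master_servers |= self.znodes


def start_regionserver(cluster, local_name, ip, znode_before_report):
    """Run the startup sequence and return the names the master ends up tracking."""
    if znode_before_report:
        # 1.x ordering: the znode is created with the locally resolved name.
        cluster.znodes.add(local_name)
    name = cluster.report_for_duty(ip)
    if not znode_before_report:
        # 0.94 ordering: the znode is created with the name the master returned.
        cluster.znodes.add(name)
    cluster.master_scan_znodes()
    return cluster.master_servers


def new_cluster():
    # Assumed mapping: the master resolves the server's address to hostA.internal.
    return ToyCluster(reverse={"10.0.0.5": "hostA.internal"})


print(sorted(start_regionserver(new_cluster(), "hostA.external", "10.0.0.5", True)))
print(sorted(start_regionserver(new_cluster(), "hostA.external", "10.0.0.5", False)))
```

In this model the first call leaves the master tracking both `hostA.external` and `hostA.internal`, while the second leaves only `hostA.internal`, which matches the behavior difference reported between 1.x and 0.94.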
