hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
Date Fri, 21 Jun 2013 13:10:20 GMT
Matteo Bertozzi created HBASE-8783:
--------------------------------------

             Summary: RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
                 Key: HBASE-8783
                 URL: https://issues.apache.org/jira/browse/HBASE-8783
             Project: HBase
          Issue Type: Bug
          Components: snapshots
    Affects Versions: 0.95.1, 0.94.8
            Reporter: Matteo Bertozzi
            Assignee: Matteo Bertozzi
            Priority: Minor
             Fix For: 0.95.2, 0.94.9
         Attachments: HBASE-8783-0.94-v0.patch

The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong
memberName.

{code}
2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager
...
2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}

The region server name is used as the memberName, but the snapshot manager is initialized
before the RS receives the server name assigned by the master, so the ZKProcedureMemberRpcs
registers under the wrong name (0.0.0.0).
This causes the snapshot to fail with a TimeoutException, since the master never sees the
expected RS join the acquired barrier.
{code}
Master:
ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915

RS:
ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk

...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:60000, max:60000 ms
{code}
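
The ordering problem can be illustrated with a minimal, self-contained Java sketch. It is not the attached patch and it does not use any HBase classes: FakeRegionServer, EagerMember, LazyMember and handleMasterReply are hypothetical names, and only the znode layout is taken from the log above. EagerMember captures the server name at construction time, as in the failing case. LazyMember resolves the name on each use, which is one possible way to avoid the stale name (an assumption, not necessarily what the patch does).

```java
import java.util.function.Supplier;

public class MemberNameSketch {

    /** Hypothetical stand-in for a region server whose hostname changes after the master replies. */
    static final class FakeRegionServer {
        private volatile String hostname = "0.0.0.0"; // placeholder until the master answers
        private final int port;
        private final long startCode;

        FakeRegionServer(int port, long startCode) {
            this.port = port;
            this.startCode = startCode;
        }

        /** Simulates the "Master passed us hostname to use" step from the log. */
        void handleMasterReply(String assignedHostname) {
            this.hostname = assignedHostname;
        }

        String getServerName() {
            return hostname + "," + port + "," + startCode;
        }
    }

    /** Captures the member name once, at construction (mirrors the failure described above). */
    static final class EagerMember {
        private final String memberName;

        EagerMember(FakeRegionServer rs) {
            this.memberName = rs.getServerName();
        }

        String acquiredNode(String procedure) {
            return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName;
        }
    }

    /** Looks the member name up on every use, so a later hostname change is picked up. */
    static final class LazyMember {
        private final Supplier<String> memberName;

        LazyMember(FakeRegionServer rs) {
            this.memberName = rs::getServerName;
        }

        String acquiredNode(String procedure) {
            return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName.get();
        }
    }

    public static void main(String[] args) {
        FakeRegionServer rs = new FakeRegionServer(60020, 1371813451915L);

        // Both members are created before the master has assigned a hostname.
        EagerMember eager = new EagerMember(rs);
        LazyMember lazy = new LazyMember(rs);

        rs.handleMasterReply("srv-5.test.cloudera.com");

        System.out.println("eager: " + eager.acquiredNode("foo23"));
        System.out.println("lazy:  " + lazy.acquiredNode("foo23"));
    }
}
```

In this sketch the eager member still builds a path under 0.0.0.0 after the master reply, so a coordinator watching the srv-5.test.cloudera.com node would never see it, while the lazy member builds the path the coordinator expects.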

