Date: Fri, 21 Jun 2013 13:10:20 +0000 (UTC)
From: "Matteo Bertozzi (JIRA)"
To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-8783) RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name

Matteo Bertozzi created HBASE-8783:
--------------------------------------

             Summary: RSSnapshotManager.ZKProcedureMemberRpcs may be initialized with the wrong server name
                 Key: HBASE-8783
                 URL: https://issues.apache.org/jira/browse/HBASE-8783
             Project: HBase
          Issue Type: Bug
          Components: snapshots
    Affects Versions: 0.95.1, 0.94.8
            Reporter: Matteo Bertozzi
            Assignee: Matteo Bertozzi
            Priority: Minor
             Fix For: 0.95.2, 0.94.9
         Attachments: HBASE-8783-0.94-v0.patch

The ZKProcedureMemberRpcs of the RegionServerSnapshotManager may be initialized with the wrong memberName.
{code}
2013-06-21 05:03:41,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: Initialize Snapshot Manager ...
2013-06-21 05:03:41,875 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname to use. Was=0.0.0.0, Now=srv-5.test.cloudera.com
{code}

The Region Server name is used as the memberName, but because the snapshot manager is initialized before the RS receives the server name assigned by the master, the ZKProcedureMemberRpcs will use the wrong name (0.0.0.0).

This causes the snapshot to fail with a TimeoutException, since the master waits for the RS under the name it assigned and never sees that member join:
{code}
Master: ZKProcedureCoordinatorRpcs: Watching for acquire node:/hbase/online-snapshot/acquired/foo23/srv-5.test.cloudera.com,60020,1371813451915
RS: ZKProcedureMemberRpcs: Member: '0.0.0.0,60020,1371814996779' joining acquired barrier for procedure (foo23) in zk
...
org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1371798732141, End:1371798792141, diff:60000, max:60000 ms
{code}
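
The ordering problem described in this report can be illustrated with a small standalone sketch. It is not HBase code and not the attached patch, whose approach is not shown in this message. All class and method names here (MemberNameSketch, FakeServer, EagerMember, LazyMember, onMasterReply, acquiredNode) are hypothetical, and the port and start code are only illustrative values borrowed from the log above. The sketch assumes one possible remedy, resolving the member name when it is used instead of when the member is constructed; initializing the member only after the master has replied would be another.

```java
import java.util.function.Supplier;

public class MemberNameSketch {

    /** Hypothetical stand-in for a region server whose hostname is corrected by the master. */
    static final class FakeServer {
        private volatile String hostname = "0.0.0.0"; // placeholder until the master replies
        private final int port;
        private final long startCode;

        FakeServer(int port, long startCode) {
            this.port = port;
            this.startCode = startCode;
        }

        /** Simulates "Master passed us hostname to use". */
        void onMasterReply(String assignedHostname) {
            this.hostname = assignedHostname;
        }

        String serverName() {
            return hostname + "," + port + "," + startCode;
        }
    }

    /** Captures the name at construction time, which reproduces the reported symptom. */
    static final class EagerMember {
        private final String memberName;

        EagerMember(FakeServer server) {
            this.memberName = server.serverName(); // frozen before the master replies
        }

        String acquiredNode(String procedure) {
            return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName;
        }
    }

    /** Looks the name up on each use, so a later hostname update is picked up. */
    static final class LazyMember {
        private final Supplier<String> memberName;

        LazyMember(Supplier<String> memberName) {
            this.memberName = memberName;
        }

        String acquiredNode(String procedure) {
            return "/hbase/online-snapshot/acquired/" + procedure + "/" + memberName.get();
        }
    }

    public static void main(String[] args) {
        FakeServer rs = new FakeServer(60020, 1371813451915L);

        // Both members are created before the master has supplied the hostname.
        EagerMember eager = new EagerMember(rs);
        LazyMember lazy = new LazyMember(rs::serverName);

        rs.onMasterReply("srv-5.test.cloudera.com");

        System.out.println("eager: " + eager.acquiredNode("foo23"));
        System.out.println("lazy:  " + lazy.acquiredNode("foo23"));
    }
}
```

In this sketch the eager member keeps joining under 0.0.0.0 while a coordinator would be watching the srv-5.test.cloudera.com node, which matches the mismatch visible in the Master and RS log lines above.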