Message-ID: <27533195.29361274479696955.JavaMail.jira@thor>
Date: Fri, 21 May 2010 18:08:16 -0400 (EDT)
From: "Nicolas Spiegelberg (JIRA)"
To: hbase-dev@hadoop.apache.org
Reply-To: dev@hbase.apache.org
Subject: [jira] Resolved: (HBASE-2593) Race Between Log Splitting and Log Writing
In-Reply-To: <30239324.27881274474896518.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/HBASE-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg
resolved HBASE-2593.
----------------------------------------

    Resolution: Duplicate

IIRC

tlipcon: you can't append after you rename, cuz the lease follows through the rename
[2:40pm] tlipcon: it still holds the lease
[2:40pm] tlipcon: changeLease(src, dst, dinfo); // update lease with new filename
nspiegelberg: hmm. I think [we were] of the opinion that this caused a kill lease rpc. looks like that's wrong
[2:46pm] nspiegelberg: or rather, that the rename didn't persist the lease and append would then cause a kill lease
tlipcon: I think most importantly, it means that on a restart or lease expiry, the NN side will have the correct record to find that thing in the directory and finalize it, etc
[2:52pm] nspiegelberg: I think that was [the] intention. Hlog.new = no new files, rename() = no new block, append() = wait for RS to give up
tlipcon: nspiegelberg: what was proposed before was: list files, find last file, open for append. when that succeeds, list again, see if there [are] any new ones
tlipcon: (this relies on writer not closing the prior file until the next one has been created)
tlipcon: but I think we should collapse the two tickets - seems like the same fix will fix all the problems
nspiegelberg: I think they're solving the same problem. Just trying to think if we can make an active lease takeover vs current passive approaches
[3:02pm] tlipcon: nspiegelberg: yea, I don't know that there is any active lease takeover opportunity really

> Race Between Log Splitting and Log Writing
> ------------------------------------------
>
>                 Key: HBASE-2593
>                 URL: https://issues.apache.org/jira/browse/HBASE-2593
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.21.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Critical
>             Fix For: 0.21.0
>
>
> The current method for recovering the lease in HLog.splitLog() is flawed.
> Between the time that the regionserver is marked as dead and fs.append() is issued, the regionserver could exit a GC pause and maintain the lease. In this case, fs.append() would continually fail. The master needs not only to recover the lease in splitLog() but also to break it, so that regionserver writes no longer succeed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
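
The recovery sequence proposed in the IRC discussion above (list files, find the last file, open it for append, then list again and check for new files) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `LogFs`, `tryAppend`, `FakeFs`, and `recoverLogs` are hypothetical names, not HDFS or HBase APIs, and the in-memory fake only imitates a writer that rolls to a new log while the takeover is in progress. As noted in the discussion, the approach relies on the writer not closing the prior file until the next one has been created.

```java
import java.util.ArrayList;
import java.util.List;

class LogRecoverySketch {

    /** Hypothetical stand-in for the file system; this is not the HDFS or HBase API. */
    interface LogFs {
        /** Log files of the dead regionserver, oldest first. */
        List<String> listLogs();

        /** Returns true once the file could be opened for append (the old writer gave up the lease). */
        boolean tryAppend(String path);
    }

    /**
     * List, open the last file for append, list again; repeat until no new file
     * shows up between the two listings. Returns the final, stable list of logs.
     */
    static List<String> recoverLogs(LogFs fs, int maxAttempts) {
        List<String> before = fs.listLogs();
        while (!before.isEmpty()) {
            String last = before.get(before.size() - 1);
            int attempts = 0;
            // A real implementation would back off between attempts; omitted here.
            while (!fs.tryAppend(last)) {
                if (++attempts >= maxAttempts) {
                    throw new IllegalStateException("could not open " + last + " for append");
                }
            }
            List<String> after = fs.listLogs();
            if (after.equals(before)) {
                return after; // no new log appeared, so 'last' really was the last one
            }
            before = after;   // the writer rolled in the meantime; take over the newer file too
        }
        return before;
    }

    /** In-memory fake: the still-running writer rolls to a new log a fixed number of times. */
    static final class FakeFs implements LogFs {
        private final List<String> logs = new ArrayList<>();
        private int pendingRolls;

        FakeFs(int pendingRolls) {
            this.pendingRolls = pendingRolls;
            logs.add("hlog.0");
        }

        @Override
        public List<String> listLogs() {
            return new ArrayList<>(logs);
        }

        @Override
        public boolean tryAppend(String path) {
            // Simulate a writer that wakes up and creates the next log just before losing the old one.
            if (pendingRolls > 0) {
                logs.add("hlog." + logs.size());
                pendingRolls--;
            }
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(recoverLogs(new FakeFs(1), 5));
    }
}
```

The second listing is what catches a log created after the first listing; without it, a splitter that only looked at the first listing would miss that file.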