Date: Thu, 6 Oct 2011 21:11:30 +0000 (UTC)
From: "stack (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1325623101.5241.1317935490195.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1202197790.5010.1314652417727.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4282) Potential data loss in retries of WAL close introduced in HBASE-4222

    [ 
https://issues.apache.org/jira/browse/HBASE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122297#comment-13122297 ] 

stack commented on HBASE-4282:
------------------------------

On v3, the txids are pretty useless, at least out in logs?  No harm logging them I suppose, but there is nothing I can infer given a txid?  Is that so?

Why this:

{code}
-      if (unflushedEntries.get() <= syncedTillHere) {
-        Thread.sleep(this.optionalFlushInterval);
-      }
+      Thread.sleep(this.optionalFlushInterval);
{code}

Swap these lines on commit?

{code}
+    TEST_UTIL.cleanupTestDir();
+    TEST_UTIL.shutdownMiniCluster();
{code}

This is a good thing to assert:

{code}
+    assertTrue("Need HDFS-826 for this test", log.canGetCurReplicas());
{code}

A similar assertion over in TestLogRolling found an issue in 205 RC1.

Nice test.

> Potential data loss in retries of WAL close introduced in HBASE-4222
> --------------------------------------------------------------------
>
>                 Key: HBASE-4282
>                 URL: https://issues.apache.org/jira/browse/HBASE-4282
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0, 0.90.5
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>            Priority: Blocker
>             Fix For: 0.92.0, 0.90.5
>
>         Attachments: HBASE-4282_0.90_2.patch, HBASE-4282_trunk_2.patch, HBASE-4282_trunk_3.patch, HBASE-4282_trunk_prelim.patch
>
>
> The ability to ride over WAL close errors on log rolling added in HBASE-4222 could lead to missing HLog entries if:
> * A table has DEFERRED_LOG_FLUSH=true
> * There are unflushed WALEdit entries for that table in the current SequenceFile writer buffer
> Since the writes were already acknowledged to the client, just ignoring the close error to allow for another log roll doesn't seem like the right thing to do here.
> We could easily flag this state and only ride over the close error if there aren't unflushed entries.  This would bring the above condition back to the previous behavior of aborting the region server.
> However, aborting the region server in this state is still guaranteeing data loss.  Is there anything we can do better in this case?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
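
The guard proposed in the issue description (ride over a WAL close error only when there are no unflushed entries) could be sketched roughly as below. This is an illustration, not the HBase code: the class `WalCloseGuard`, the `Writer` interface, and the methods `append`, `markSynced`, and `closeForRoll` are hypothetical names, and the two counters only borrow the names `unflushedEntries` and `syncedTillHere` from the patch excerpt above, with assumed semantics.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the proposed guard; not the actual HLog implementation. */
class WalCloseGuard {

    /** Minimal stand-in for a WAL writer (assumed interface, for illustration only). */
    interface Writer {
        void close() throws IOException;
    }

    // Assumed semantics: count of edits appended so far.
    private final AtomicLong unflushedEntries = new AtomicLong();
    // Assumed semantics: highest append count known to be synced.
    private volatile long syncedTillHere = 0;

    /** Records one appended edit that has not yet been synced. */
    void append() {
        unflushedEntries.incrementAndGet();
    }

    /** Marks every edit appended so far as synced. */
    void markSynced() {
        syncedTillHere = unflushedEntries.get();
    }

    /** True if at least one appended edit has not been synced. */
    boolean hasUnflushedEntries() {
        return unflushedEntries.get() > syncedTillHere;
    }

    /**
     * Closes the writer as part of a log roll.
     *
     * @return true if a close error was ridden over, false if the close succeeded
     * @throws IOException if the close failed while unsynced edits were pending;
     *         the caller would then fall back to the previous abort behavior
     */
    boolean closeForRoll(Writer writer) throws IOException {
        try {
            writer.close();
            return false;
        } catch (IOException e) {
            if (hasUnflushedEntries()) {
                // Acknowledged edits may be lost, so do not swallow the error.
                throw e;
            }
            // Nothing pending: safe to ignore the failure and roll again.
            return true;
        }
    }

    public static void main(String[] args) {
        WalCloseGuard guard = new WalCloseGuard();
        Writer failing = () -> { throw new IOException("close failed"); };
        guard.append();
        guard.markSynced();
        try {
            System.out.println("rode over close error: " + guard.closeForRoll(failing));
        } catch (IOException e) {
            System.out.println("would abort: " + e.getMessage());
        }
    }
}
```

As the description notes, rethrowing only restores the earlier abort behavior for the deferred-flush case; it does not recover the unsynced edits.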