Date: Fri, 6 Apr 2012 04:09:20 +0000 (UTC)
From: "Zhihong Yu (Updated) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1006374931.21570.1333685360931.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1801746982.178.1333169923283.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HBASE-5689) Skipping RecoveredEdits may cause data loss

     [ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5689:
------------------------------
    Attachment: 5689-v4.txt

Patch v4 removes the Math.abs() call.

> Skipping RecoveredEdits may cause data loss
> -------------------------------------------
>
>                 Key: HBASE-5689
>                 URL: https://issues.apache.org/jira/browse/HBASE-5689
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.94.0
>
>         Attachments: 5689-testcase.patch, 5689-v4.txt, HBASE-5689.patch, HBASE-5689.patch, HBASE-5689v2.patch, HBASE-5689v3.patch
>
>
> Consider the following scenario:
> 1. The region is on server A.
> 2. Put KV(r1->v1) to the region.
> 3. Move the region from server A to server B.
> 4. Put KV(r2->v2) to the region.
> 5. Move the region from server B to server A.
> 6. Put KV(r3->v3) to the region.
> 7. kill -9 server B and start it.
> 8. kill -9 server A and start it.
> 9. Scan the region: we get only two KVs (r1->v1, r2->v2); the third KV (r3->v3) is lost.
>
> Let's analyze the scenario above from the code:
> 1. The edit logs of KV(r1->v1) and KV(r3->v3) are both recorded in the same hlog file on server A.
> 2. When we split server B's hlog file in ServerShutdownHandler, we create one RecoveredEdits file, f1, for the region.
> 3. When we split server A's hlog file in ServerShutdownHandler, we create another RecoveredEdits file, f2, for the region.
> 4. However, RecoveredEdits file f2 will be skipped when the region is initialized.
>
> HRegion#replayRecoveredEditsIfAny
> {code}
>   for (Path edits: files) {
>     if (edits == null || !this.fs.exists(edits)) {
>       LOG.warn("Null or non-existent edits file: " + edits);
>       continue;
>     }
>     if (isZeroLengthThenDelete(this.fs, edits)) continue;
>     if (checkSafeToSkip) {
>       Path higher = files.higher(edits);
>       long maxSeqId = Long.MAX_VALUE;
>       if (higher != null) {
>         // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
>         String fileName = higher.getName();
>         maxSeqId = Math.abs(Long.parseLong(fileName));
>       }
>       if (maxSeqId <= minSeqId) {
>         String msg = "Maximum possible sequenceid for this log is " + maxSeqId
>             + ", skipped the whole file, path=" + edits;
>         LOG.debug(msg);
>         continue;
>       } else {
>         checkSafeToSkip = false;
>       }
>     }
> {code}
>
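
To make the failure mode in the quoted analysis concrete, here is a minimal, self-contained sketch of the skip heuristic. It is not HBase code. It assumes each RecoveredEdits file is named after the first sequence id it contains, and the sequence ids (1, 2, 3) and the class and method names (`SkipHeuristicDemo`, `lostEdits`) are hypothetical, chosen only to mirror the r1/r2/r3 scenario.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SkipHeuristicDemo {

    /**
     * Simplified model of the skip check quoted above.
     * Keys are RecoveredEdits file names (assumed to be the first sequence id
     * in the file); values are the sequence ids stored in that file.
     * Returns the sequence ids newer than minSeqId that sit in skipped files,
     * i.e. edits that would never be replayed.
     */
    public static List<Long> lostEdits(TreeMap<Long, List<Long>> files, long minSeqId) {
        List<Long> lost = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Map.Entry<Long, List<Long>> file : files.entrySet()) {
            if (checkSafeToSkip) {
                // The name of the next file is treated as an upper bound
                // on every sequence id in the current file.
                Long higher = files.higherKey(file.getKey());
                long maxSeqId = (higher == null) ? Long.MAX_VALUE : higher;
                if (maxSeqId <= minSeqId) {
                    // Whole file skipped: any edit above minSeqId is dropped.
                    for (long seqId : file.getValue()) {
                        if (seqId > minSeqId) {
                            lost.add(seqId);
                        }
                    }
                    continue;
                } else {
                    checkSafeToSkip = false;
                }
            }
            // Otherwise the file would be replayed; nothing is lost here.
        }
        return lost;
    }

    public static void main(String[] args) {
        TreeMap<Long, List<Long>> files = new TreeMap<>();
        // f2, split from server A's hlog: holds r1 (seq 1) and r3 (seq 3).
        files.put(1L, Arrays.asList(1L, 3L));
        // f1, split from server B's hlog: holds r2 (seq 2).
        files.put(2L, Arrays.asList(2L));
        // Hypothetical: r1 and r2 were flushed when the region moved, so minSeqId is 2.
        System.out.println(lostEdits(files, 2L)); // prints [3]
    }
}
```

In this model the upper bound fails because a single server's hlog can contain edits for the same region from before and after it was hosted elsewhere, so f2 spans a range that overlaps f1's name.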