Date: Tue, 26 Apr 2011 03:24:03 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Message-ID: <103401245.1350.1303788243124.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <239238470.1167.1303780563338.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-3820) Splitlog() executed while the namenode was in safemode may cause data-loss

[ 
https://issues.apache.org/jira/browse/HBASE-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025039#comment-13025039 ]

stack commented on HBASE-3820:
------------------------------

@Jieshan Bean Please add a patch (see 'attach file' above) that contains only your change. See http://wiki.apache.org/hadoop/Hbase/HowToContribute for how to make a patch if you are unsure. Thank you.

> Splitlog() executed while the namenode was in safemode may cause data-loss
> --------------------------------------------------------------------------
>
>                 Key: HBASE-3820
>                 URL: https://issues.apache.org/jira/browse/HBASE-3820
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.2
>            Reporter: Jieshan Bean
>
> I found this problem when the namenode went into safemode for unclear reasons.
> There's one patch about this problem:
>    try {
>       HLogSplitter splitter = HLogSplitter.createLogSplitter(
>         conf, rootdir, logDir, oldLogDir, this.fs);
>       try {
>         splitter.splitLog();
>       } catch (OrphanHLogAfterSplitException e) {
>         LOG.warn("Retrying splitting because of:", e);
>         // An HLogSplitter instance can only be used once.  Get new instance.
>         splitter = HLogSplitter.createLogSplitter(conf, rootdir, logDir,
>           oldLogDir, this.fs);
>         splitter.splitLog();
>       }
>       splitTime = splitter.getTime();
>       splitLogSize = splitter.getSize();
>     } catch (IOException e) {
>       checkFileSystem();
>       LOG.error("Failed splitting " + logDir.toString(), e);
>       master.abort("Shutting down HBase cluster: Failed splitting hlog files...", e);
>     } finally {
>       this.splitLogLock.unlock();
>     }
> It does help to some extent when the namenode process has exited or been killed, but it does not take the namenode safemode exception into account.
> I think the root cause is the checkFileSystem() method.
> It provides a way to check whether HDFS is working normally (both reads and writes succeed), and that may be the original purpose of this method. This is how the method is implemented:
>     DistributedFileSystem dfs = (DistributedFileSystem) fs;
>     try {
>       if (dfs.exists(new Path("/"))) {
>         return;
>       }
>     } catch (IOException e) {
>       exception = RemoteExceptionHandler.checkIOException(e);
>     }
>
> I have checked the HDFS code and learned that while the namenode is in safemode, dfs.exists(new Path("/")) returns true, because the file system can still provide read-only service. So this method only checks whether the DFS can be read. I don't think that is reasonable.
>
>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
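
The gap described in the report can be shown with a small self-contained sketch. It does not use the Hadoop API: `FsProbe`, `isInSafeMode()`, `readOnlyCheck`, `readWriteCheck` and `fake` are hypothetical names made up for this illustration, and the fake file system only imitates the behaviour the reporter describes (reads succeed while safemode is on). It is meant as an illustration of why a read-only probe passes in safemode, not as the fix for HBASE-3820.

```java
import java.io.IOException;

public class SafeModeCheckSketch {

    /** Hypothetical stand-in for the small part of a file system client used here. */
    interface FsProbe {
        boolean exists(String path) throws IOException; // read-only operation
        boolean isInSafeMode() throws IOException;      // hypothetical safemode query
    }

    /** Mirrors the quoted check: it passes as soon as a read succeeds. */
    static boolean readOnlyCheck(FsProbe fs) {
        try {
            return fs.exists("/");
        } catch (IOException e) {
            return false;
        }
    }

    /** Stricter variant: a readable file system that is in safemode is not treated as available. */
    static boolean readWriteCheck(FsProbe fs) {
        try {
            return fs.exists("/") && !fs.isInSafeMode();
        } catch (IOException e) {
            return false;
        }
    }

    /** Fake file system: reads always succeed, the safemode flag is fixed at creation. */
    static FsProbe fake(final boolean safeMode) {
        return new FsProbe() {
            public boolean exists(String path) { return true; }
            public boolean isInSafeMode() { return safeMode; }
        };
    }

    public static void main(String[] args) {
        FsProbe inSafeMode = fake(true);
        System.out.println("read-only check:  " + readOnlyCheck(inSafeMode));
        System.out.println("read-write check: " + readWriteCheck(inSafeMode));
    }
}
```

With the fake in safemode, the read-only check reports the file system as healthy while the stricter check does not, which is the situation in which the split failure would go unnoticed by checkFileSystem(). How a real check should query safemode depends on the Hadoop version in use and is left open here.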