Return-Path: X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 54576DB06 for ; Tue, 11 Dec 2012 19:47:24 +0000 (UTC) Received: (qmail 78225 invoked by uid 500); 11 Dec 2012 19:47:23 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 78175 invoked by uid 500); 11 Dec 2012 19:47:23 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 78127 invoked by uid 99); 11 Dec 2012 19:47:23 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 11 Dec 2012 19:47:23 +0000 Date: Tue, 11 Dec 2012 19:47:23 +0000 (UTC) From: "Eugene Koontz (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HDFS-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated HDFS-4302: -------------------------------- Status: Patch Available (was: Open) > Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception > -------------------------------------------------------------------------------------------------------------------------- > > Key: HDFS-4302 > URL: https://issues.apache.org/jira/browse/HDFS-4302 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, namenode > Reporter: Eugene Koontz > > When bringing up a namenode in standby mode, where DEBUG is enabled for namenode, the namenode will hit the following code in {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}: > {code} > if (LOG.isDebugEnabled()) { > LOG.debug("edit log length: " + in.length() + ", start txid: " > + expectedStartingTxId + ", last txid: " + lastTxId); > } > {code}. > However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, this code is hit before the {{EditLogFileInputStream}}'s {{advertizedSize}} is initialized (before the HTTP client connects to the remote edit log server (i.e. the journal node)). This causes the following precondition to fail in {{EditLogFileInputStream:length()}}: > {code} > Preconditions.checkState(advertisedSize != -1, > "must get input stream before length is available"); > {code} > which shuts down the namenode with the following log messages and stack trace: > {code} > 2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) - Call: getEditLogManifest took 88ms > 2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager (QuorumJournalManager.java:selectInputStreams(459)) - selectInputStream manifests: > 172.16.175.1:8485: [[1,3]] > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(605)) - Planning to load image : > FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000, cpktTxId=0000000000000000000) > 2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(607)) - Planning to load edit log stream: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(168)) - Loading image file /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000 using no compression > 2012-12-11 10:45:33,355 INFO namenode.FSImage (FSImageFormat.java:load(171)) - Number of files = 1 > 2012-12-11 10:45:33,356 INFO namenode.FSImage (FSImageFormat.java:loadFilesUnderConstruction(383)) - Number of files under construction = 0 > 2012-12-11 10:45:33,357 INFO namenode.FSImage (FSImageFormat.java:load(193)) - Image file of size 119 loaded in 0 seconds. > 2012-12-11 10:45:33,357 INFO namenode.FSImage (FSImage.java:loadFSImage(753)) - Loaded image for txid 0 from /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000 > 2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) - About to load edits: > org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 > 2012-12-11 10:45:33,359 INFO namenode.FSImage (FSImage.java:loadEdits(694)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9 expecting start txid #1 > 2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping client > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM: closed > 2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client (1462017562) connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM: stopped, remaining connections 0 > 2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - Exception in namenode join > java.lang.IllegalStateException: must get input stream before length is available > at com.google.common.base.Preconditions.checkState(Preconditions.java:145) > at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.length(EditLogFileInputStream.java:405) > at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.length(EditLogFileInputStream.java:258) > at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256) > at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256) > at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:125) > at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:88) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:697) > at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:642) > at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:259) > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:604) > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:447) > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:409) > at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:400) > at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:606) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:591) > at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153) > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1219) > 2012-12-11 10:45:33,470 INFO util.ExitUtil (ExitUtil.java:terminate(84)) - Exiting with status 1 > 2012-12-11 10:45:33,471 INFO namenode.NameNode (StringUtils.java:run(620)) - SHUTDOWN_MSG: > /************************************************************ > SHUTDOWN_MSG: Shutting down NameNode at Eugenes-MacBook-Pro.local/172.16.175.1 > ************************************************************/ > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira