hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Eugene Koontz (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-4302) Precondition in EditLogFileInputStream's length() method is checked too early in NameNode startup, causing fatal exception
Date Tue, 11 Dec 2012 19:45:21 GMT
Eugene Koontz created HDFS-4302:
-----------------------------------

             Summary: Precondition in EditLogFileInputStream's length() method is checked
too early in NameNode startup, causing fatal exception
                 Key: HDFS-4302
                 URL: https://issues.apache.org/jira/browse/HDFS-4302
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha, namenode
            Reporter: Eugene Koontz


When bringing up a namenode in standby mode, where DEBUG is enabled for namenode, the namenode
will hit the following code in {{hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java}}:

{code}
 if (LOG.isDebugEnabled()) {
      LOG.debug("edit log length: " + in.length() + ", start txid: "
          + expectedStartingTxId + ", last txid: " + lastTxId);
    }
{code}.

However, if {{in}} has an {{EditLogFileInputStream}} as its {{streams[0]}}, this code is hit
before the {{EditLogFileInputStream}}'s {{advertizedSize}} is initialized (before the HTTP
client connects to the remote edit log server (i.e. the journal node)). This causes the following
precondition to fail in {{EditLogFileInputStream:length()}}:

{code}
      Preconditions.checkState(advertisedSize != -1,
          "must get input stream before length is available");
{code}

which shuts down the namenode with the following log messages and stack trace:

{code}
2012-12-11 10:45:33,319 DEBUG ipc.ProtobufRpcEngine (ProtobufRpcEngine.java:invoke(217)) -
Call: getEditLogManifest took 88ms
2012-12-11 10:45:33,336 DEBUG client.QuorumJournalManager (QuorumJournalManager.java:selectInputStreams(459))
- selectInputStream manifests:
172.16.175.1:8485: [[1,3]]
2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(605)) - Planning
to load image :
FSImageFile(file=/tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000, cpktTxId=0000000000000000000)
2012-12-11 10:45:33,351 DEBUG namenode.FSImage (FSImage.java:loadFSImage(607)) - Planning
to load edit log stream: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(168)) - Loading image
file /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000 using no compression
2012-12-11 10:45:33,355 INFO  namenode.FSImage (FSImageFormat.java:load(171)) - Number of
files = 1
2012-12-11 10:45:33,356 INFO  namenode.FSImage (FSImageFormat.java:loadFilesUnderConstruction(383))
- Number of files under construction = 0
2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImageFormat.java:load(193)) - Image file
of size 119 loaded in 0 seconds.
2012-12-11 10:45:33,357 INFO  namenode.FSImage (FSImage.java:loadFSImage(753)) - Loaded image
for txid 0 from /tmp/hadoop-data/dfs/name/current/fsimage_0000000000000000000
2012-12-11 10:45:33,357 DEBUG namenode.FSImage (FSImage.java:loadEdits(686)) - About to load
edits:
  org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
2012-12-11 10:45:33,359 INFO  namenode.FSImage (FSImage.java:loadEdits(694)) - Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@465098f9
expecting start txid #1
2012-12-11 10:45:33,361 DEBUG ipc.Client (Client.java:stop(1060)) - Stopping client
2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:close(1016)) - IPC Client (1462017562)
connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM:
closed
2012-12-11 10:45:33,363 DEBUG ipc.Client (Client.java:run(848)) - IPC Client (1462017562)
connection to Eugenes-MacBook-Pro.local/172.16.175.1:8485 from hdfs/eugenes-macbook-pro.local@EXAMPLE.COM:
stopped, remaining connections 0
2012-12-11 10:45:33,464 FATAL namenode.NameNode (NameNode.java:main(1224)) - Exception in
namenode join
java.lang.IllegalStateException: must get input stream before length is available
        at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.length(EditLogFileInputStream.java:405)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.length(EditLogFileInputStream.java:258)
        at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
        at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.length(RedundantEditLogInputStream.java:256)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:125)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:88)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:697)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:642)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:259)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:604)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:447)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:409)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:400)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:434)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:606)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:591)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1219)
2012-12-11 10:45:33,470 INFO  util.ExitUtil (ExitUtil.java:terminate(84)) - Exiting with status
1
2012-12-11 10:45:33,471 INFO  namenode.NameNode (StringUtils.java:run(620)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Eugenes-MacBook-Pro.local/172.16.175.1
************************************************************/
{code}




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message