hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3724) Namenode does not start due to exception throw while saving Image
Date Wed, 09 Jul 2008 02:22:31 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3724:
----------------------------------------

      Description: 
Restart of the namenode failed with this stack trace while saving the image during initialization:

{noformat}
2008-07-09 00:20:21,470 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
2008-07-09 00:20:21,493 ERROR org.apache.hadoop.dfs.NameNode: java.io.IOException: saveLeases found path /foo/bar/jambajuice but no matching entry in namespace.
at org.apache.hadoop.dfs.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4376)
at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:874)
at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:892)
at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:81)
at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:273)
at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:252)
at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)
at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)
at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:822)
at org.apache.hadoop.dfs.NameNode.main(NameNode.java:831)
{noformat}

Looks like it was throwing an IOException in saveFilesUnderConstruction.

Before the restart, the NameNode was killed while some jobs were running. The namenode log
from before the shutdown contains many entries like this:

{noformat}
2008-07-09 00:12:55,301 INFO org.apache.hadoop.fs.FSNamesystem: Recovering lease=[Lease. Holder: DFSClient_-510679348, pendingcreates: 1], src=/foo/bar/jambajuice
2008-07-09 00:12:55,301 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on /foo/bar/jambajuice  file does not exist.
{noformat}

These 2 lines are repeated every second, indefinitely, to the point where the namenode log
on a 7 node cluster had grown to nearly 41G.

Could not find any other information about the file, as there were no previous namenode logs.



         Priority: Blocker  (was: Major)
    Fix Version/s: 0.18.0
         Assignee: dhruba borthakur

It looks like there are 2 problems here related to *LEASE RECOVERY*.
# The system falls into an infinite loop trying to recover a lease.
# The namespace image becomes inconsistent, so the name-node cannot restart.

Both of them are critical for the 0.18 release.
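
The two problems can be illustrated with a toy model. This is only a minimal sketch in Java, not the actual FSNamesystem code: the class LeaseRecoverySketch, its fields, and recoverLeases are hypothetical names invented for the illustration; only the method name saveFilesUnderConstruction and the wording of the error message come from the stack trace. It assumes one possible remedy, dropping a lease whose path is missing from the namespace, which may or may not be what the eventual patch does.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Toy model of the lease/namespace mismatch; hypothetical, not Hadoop code. */
final class LeaseRecoverySketch {
    /** Paths that exist in the toy namespace. */
    final Set<String> namespace = new HashSet<>();
    /** Paths that still have an outstanding create lease. */
    final Set<String> leasedPaths = new HashSet<>();

    /**
     * One recovery pass. A lease whose path is absent from the namespace is
     * dropped instead of being retried, so the pass cannot repeat forever
     * on the same path (assumed remedy for problem 1).
     */
    List<String> recoverLeases() {
        List<String> dropped = new ArrayList<>();
        for (Iterator<String> it = leasedPaths.iterator(); it.hasNext();) {
            String path = it.next();
            if (!namespace.contains(path)) {
                it.remove();
                dropped.add(path);
            }
        }
        return dropped;
    }

    /**
     * Mirrors the consistency check seen in the stack trace: saving fails
     * if any leased path has no namespace entry (problem 2).
     */
    void saveFilesUnderConstruction() throws IOException {
        for (String path : leasedPaths) {
            if (!namespace.contains(path)) {
                throw new IOException("saveLeases found path " + path
                        + " but no matching entry in namespace.");
            }
        }
    }
}
```

In this model, a lease on /foo/bar/jambajuice with no namespace entry makes saveFilesUnderConstruction throw until recoverLeases has removed it, after which a second recovery pass finds nothing to do.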

> Namenode does not start due to exception throw while saving Image
> -----------------------------------------------------------------
>
>                 Key: HADOOP-3724
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3724
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Lohit Vijayarenu
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.18.0
>
>
> Restart of the namenode failed with this stack trace while saving the image during initialization:
> {noformat}
> 2008-07-09 00:20:21,470 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
> 2008-07-09 00:20:21,493 ERROR org.apache.hadoop.dfs.NameNode: java.io.IOException: saveLeases found path /foo/bar/jambajuice but no matching entry in namespace.
> at org.apache.hadoop.dfs.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4376)
> at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:874)
> at org.apache.hadoop.dfs.FSImage.saveFSImage(FSImage.java:892)
> at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:81)
> at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:273)
> at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:252)
> at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:148)
> at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:193)
> at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:179)
> at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:822)
> at org.apache.hadoop.dfs.NameNode.main(NameNode.java:831)
> {noformat}
> Looks like it was throwing an IOException in saveFilesUnderConstruction.
> Before the restart, the NameNode was killed while some jobs were running. The namenode log
> from before the shutdown contains many entries like this:
> {noformat}
> 2008-07-09 00:12:55,301 INFO org.apache.hadoop.fs.FSNamesystem: Recovering lease=[Lease. Holder: DFSClient_-510679348, pendingcreates: 1], src=/foo/bar/jambajuice
> 2008-07-09 00:12:55,301 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on /foo/bar/jambajuice  file does not exist.
> {noformat}
> These 2 lines are repeated every second, indefinitely, to the point where the namenode log
> on a 7 node cluster had grown to nearly 41G.
> Could not find any other information about the file, as there were no previous namenode logs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

