Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Date: Tue, 10 May 2011 22:49:47 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <1565496612.1588.1305067787481.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1470180369.33029.1304966286151.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-1904) Secondary Namenode dies when a mkdir on a non-existent parent directory is run
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[
https://issues.apache.org/jira/browse/HDFS-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1904:
------------------------------

Description:
Steps to reproduce:

1. Configure secondary namenode with {{fs.checkpoint.period}} set to a small value (e.g. 3 seconds)
2. Format filesystem and start HDFS
3. hadoop fs -mkdir /foo/bar ; sleep 5 ; echo | hadoop fs -put - /foo/bar/baz

2NN will crash with the following trace on the next checkpoint. The primary NN also crashes on next restart.

was:
Steps to reproduce:

1. I pulled trunk using git. The last git commits were:

For hadoop-common
commit bbd8581a905aa734015efb3a0366b33639f4c16f
Author: Tsz-wo Sze
Date: Fri May 6 22:03:13 2011 +0000

    Remove the empty file accidentally checked it with HADOOP-7249.

    git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1100400 13f79535-47bb-0310-9956-ffa450edef68

For hadoop-hdfs
commit 1ca9d6518fe1341ca4082ef61ea40d2daa215ee7
Author: Todd Lipcon
Date: Sun May 8 20:43:24 2011 +0000

    HDFS-1866. Document dfs.datanode.max.transfer.threads in hdfs-default.xml. Contributed by Harsh J Chouraria.

    git-svn-id: https://svn.apache.org/repos/asf/hadoop/hdfs/trunk@1100811 13f79535-47bb-0310-9956-ffa450edef68

2. Built using ant mvn-install. Set up three directories in dfs.name.dir. Formatted namenode. Started using start-dfs.sh

3.
[ravihadoop@localhost hadoop]$ hdfs dfs -ls / # Initially the HDFS filesystem is empty
[ravihadoop@localhost hadoop]$ hdfs dfs -mkdir /home/ravihadoop # /home here doesn't exist. But mkdir doesn't complain
[ravihadoop@localhost hadoop]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - ravihadoop supergroup 0 2011-05-09 12:24 /home
[ravihadoop@localhost hadoop]$ hdfs dfs -ls /home
Found 1 items
drwxr-xr-x - ravihadoop supergroup 0 2011-05-09 12:24 /home/ravihadoop
[ravihadoop@localhost hadoop]$ hdfs dfs -put ~/test.sh /home/ravihadoop/test.sh
[ravihadoop@localhost hadoop]$

The last command makes the Secondary namenode keel over and die with this exception:

2011-05-09 12:25:03,611 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 32-bit
2011-05-09 12:25:03,611 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 19.26 MB
2011-05-09 12:25:03,611 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^22 = 4194304 entries
2011-05-09 12:25:03,611 INFO org.apache.hadoop.hdfs.util.GSet: recommended=4194304, actual=4194304
2011-05-09 12:25:03,750 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=ravihadoop
2011-05-09 12:25:03,750 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2011-05-09 12:25:03,750 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2011-05-09 12:25:03,750 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=1000
2011-05-09 12:25:03,750 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isBlockTokenEnabled=false blockKeyUpdateInterval=0 min(s), blockTokenLifetime=0 min(s)
2011-05-09 12:25:03,751 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2011-05-09 12:25:03,755 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Throwable Exception in doCheckpoint:
2011-05-09 12:25:03,755 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException: Panic: parent does not exist
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1693)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1707)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1544)
	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:288)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:234)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:116)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:62)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:723)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:720)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$500(SecondaryNameNode.java:610)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:487)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:448)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:312)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:276)
	at java.lang.Thread.run(Thread.java:619)
2011-05-09 12:25:03,756 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down SecondaryNameNode at localhost.localdomain/192.168.1.4
************************************************************/

Affects Version/s: (was: 0.22.0) 0.23.0

> Secondary Namenode dies when a mkdir on a non-existent parent directory is run
> ------------------------------------------------------------------------------
>
> Key: HDFS-1904
> URL: https://issues.apache.org/jira/browse/HDFS-1904
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0
> Environment: Linux
> Reporter: Ravi Prakash
> Priority: Blocker
>
> Steps to reproduce:
> 1. Configure secondary namenode with {{fs.checkpoint.period}} set to a small value (e.g. 3 seconds)
> 2. Format filesystem and start HDFS
> 3. hadoop fs -mkdir /foo/bar ; sleep 5 ; echo | hadoop fs -put - /foo/bar/baz
> 2NN will crash with the following trace on the next checkpoint. The primary NN also crashes on next restart.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
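
Illustrative note: the stack trace in this message shows the failure occurring while edit records are loaded (loadFSEdits -> unprotectedAddFile -> addChild), which would be consistent with both symptoms, since a checkpoint and a NameNode restart each replay the edit log. The toy model below is only a sketch of that failure mode, not HDFS code. The record names (MKDIR, ADD), the replay() function, and the contents of both logs are hypothetical; the message does not say what the real edit log contained or how it came to lack a parent.

```python
class ParentMissingError(Exception):
    """Raised when a replayed record names a parent that is not in the tree."""


def replay(edit_log):
    """Rebuild a toy namespace from (op, path) records.

    Unlike a client-side mkdir, replay never creates missing parents:
    it expects every parent to have been recorded earlier in the log.
    """
    dirs = {"/"}
    files = set()
    for op, path in edit_log:
        # "/foo" -> "/", "/foo/bar" -> "/foo"
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in dirs:
            raise ParentMissingError(
                f"parent does not exist: {parent} (while replaying {op} {path})"
            )
        if op == "MKDIR":
            dirs.add(path)
        elif op == "ADD":
            files.add(path)
        else:
            raise ValueError(f"unknown op: {op}")
    return dirs, files


# A log that records every directory replays cleanly.
complete_log = [("MKDIR", "/foo"), ("MKDIR", "/foo/bar"), ("ADD", "/foo/bar/baz")]

# Hypothetical log in which the directories above the new file were never
# recorded; replay fails on the ADD record, as the trace's unprotectedAddFile
# frame suggests.
incomplete_log = [("ADD", "/foo/bar/baz")]
```

Calling replay(complete_log) returns the rebuilt directory and file sets, while replay(incomplete_log) raises ParentMissingError. In this model the live namespace can look healthy (the reporter's `hdfs dfs -ls` output did) and the problem only surfaces the next time the log is replayed.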