Message-ID: <12365371.1178916735565.JavaMail.jira@brutus>
Date: Fri, 11 May 2007 13:52:15 -0700 (PDT)
From: "Owen O'Malley (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1242) dfs upgrade/downgrade problems
In-Reply-To: <9779640.1176239252561.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HADOOP-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495159 ]

Owen O'Malley commented on HADOOP-1242:
---------------------------------------

> 1.
> By design the storage file is an indicator that the old layout is present.
> The order is important since the recovery is based on that order.

The "storage" file is a bad indicator precisely because the pre-13 versions of Hadoop's datanodes create it without question. The "data" directory is a much, much better indicator, because it is _not_ created automatically. This is far too common a case to allow Hadoop's automatic upgrade to corrupt your data node's directory.

> I think this works as designed.

It may work the way that you intended it, but it is really bad from a usability standpoint. My change isn't perfect, but it handles this case much better, and you haven't provided any use cases where it is worse.

> You made a mistake, the software detected inconsistency and warned you.

Which is fine, except it also corrupted the repository such that I had to make hand edits to each node in the cluster to fix the problem. That is not ok.

> I do not understand what is expected here. I do not understand what is it blocking.

What is expected is that if you try to bring up a version 12 data node on a version 13 data node directory, it will fail. However, when you fix the problem and use version 13 again, it must come up without a problem. Making the administrator log into every single node to delete a file is not ok.

> dfs upgrade/downgrade problems
> ------------------------------
>
>                 Key: HADOOP-1242
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1242
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.13.0
>            Reporter: Owen O'Malley
>         Assigned To: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: clean-upgrade.patch
>
>
> I ran my test cluster on 0.13 and then tried to run it under 0.12. When I downgraded, the namenode would not come up and the message said I needed to format the filesystem.
> I ignored that and tried to restart on 0.13, now the datanode will not come up with:
> 2007-04-10 11:25:37,448 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory /local/owen/hadoop/dfs/data is in an inconsistent state: Old layout block directory /local/owen/hadoop/dfs/data/data is missing
>         at org.apache.hadoop.dfs.DataStorage.isConversionNeeded(DataStorage.java:170)
>         at org.apache.hadoop.dfs.Storage$StorageDirectory.analyzeStorage(Storage.java:264)
>         at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:83)
>         at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:230)
>         at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:199)
>         at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:1175)
>         at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1119)
>         at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:1140)
>         at org.apache.hadoop.dfs.DataNode.main(DataNode.java:1299)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.