Subject: DFS Datanodes are suddenly "not formatted"
Date: Mon, 3 Dec 2007 10:06:17 -0800
From: "Michael Harris"
Reply-To: hadoop-user@lucene.apache.org

I have a problem with the datanodes.
I shut down DFS and Mapred on my cluster on Friday. When I started them up again on Monday, DFS remained in safe mode, listing two of the datanodes with no blocks. When I checked the logs on those datanodes, they said that the data directory was not formatted. The datanodes then proceeded to format it, which I suppose erased all blocks stored there. My replication factor was not high enough to survive both of these nodes going down, so my DFS was ruined. Is this because the datanodes are storing data in the tmp directory? ... If so, how can I change that directory?

2007-12-03 09:24:08,299 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2007-12-03 09:24:08,395 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 1 time(s).
2007-12-03 09:24:09,453 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 2 time(s).
2007-12-03 09:24:10,476 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 3 time(s).
2007-12-03 09:24:11,478 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 4 time(s).
2007-12-03 09:24:12,683 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 5 time(s).
2007-12-03 09:24:13,716 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 6 time(s).
2007-12-03 09:24:14,806 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 7 time(s).
2007-12-03 09:24:15,855 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 8 time(s).
2007-12-03 09:24:16,916 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 9 time(s).
2007-12-03 09:24:18,295 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 10 time(s).
2007-12-03 09:24:19,298 INFO org.apache.hadoop.ipc.RPC: Server at mh0.telespree.com/172.18.1.80:54310 not available yet, Zzzzz...
2007-12-03 09:24:20,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 1 time(s).
2007-12-03 09:24:21,403 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 2 time(s).
2007-12-03 09:24:22,431 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 3 time(s).
2007-12-03 09:24:23,515 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 4 time(s).
2007-12-03 09:24:24,544 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 5 time(s).
2007-12-03 09:24:26,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 6 time(s).
2007-12-03 09:24:27,068 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 7 time(s).
2007-12-03 09:24:28,230 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 8 time(s).
2007-12-03 09:24:29,411 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 9 time(s).
2007-12-03 09:24:30,431 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 10 time(s).
2007-12-03 09:24:31,504 INFO org.apache.hadoop.ipc.RPC: Server at mh0.telespree.com/172.18.1.80:54310 not available yet, Zzzzz...
2007-12-03 09:24:32,508 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 1 time(s).
2007-12-03 09:24:49,604 INFO org.apache.hadoop.dfs.Storage: Storage directory /tmp/hadoop-hadoop/dfs/data is not formatted.
2007-12-03 09:24:49,604 INFO org.apache.hadoop.dfs.Storage: Formatting ...
2007-12-03 09:24:52,741 INFO org.apache.hadoop.dfs.DataNode: Opened server at 50010
2007-12-03 09:24:52,794 INFO org.mortbay.util.Credential: Checking Resource aliases
2007-12-03 09:24:52,827 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2007-12-03 09:24:53,086 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@1f3ce5c
2007-12-03 09:24:53,116 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2007-12-03 09:24:53,117 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2007-12-03 09:24:53,117 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2007-12-03 09:24:53,118 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2007-12-03 09:24:53,118 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@ee22f7
2007-12-03 09:24:53,148 INFO org.apache.hadoop.dfs.DataNode: New storage id DS-1588572895-172.18.2.23-50010-1196702693143 is assigned to data-node 172.18.2.23:50010
2007-12-03 09:24:53,149 INFO org.apache.hadoop.dfs.DataNode: In DataNode.run, data = FSDataset{dirpath='/tmp/hadoop-hadoop/dfs/data/current'}
2007-12-03 09:24:53,149 INFO org.apache.hadoop.dfs.DataNode: using BLOCKREPORT_INTERVAL of 3463518msec
2007-12-03 09:31:23,420 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: mh0.telespree.com/172.18.1.80:54310. Already tried 1 time(s).
2007-12-03 09:31:23,447 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:

Thanks,
Michael