From: jeremy p
Date: Fri, 5 Oct 2012 10:21:02 -0700
Subject: When running Hadoop in pseudo-distributed mode, what directory should I use for hadoop.tmp.dir?
To: user@hadoop.apache.org

By default, Hadoop sets hadoop.tmp.dir to your /tmp folder. This is a problem, because /tmp gets wiped out by Linux when you reboot, leading to this lovely error from the JobTracker:

2012-10-05 07:41:13,618 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
...
2012-10-05 07:41:22,636 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
2012-10-05 07:41:22,643 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)

The only way I've found to fix this is to reformat the name node, which rebuilds the /tmp/hadoop-root folder, which of course gets wiped out again the next time you reboot.

So I went ahead and created a folder called /hadoop_temp and gave all users read/write access to it. I then set this property in my core-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value>file:///hadoop_temp</value>
</property>

When I re-formatted my namenode, Hadoop seemed happy, giving me this message:

12/10/05 07:58:54 INFO common.Storage: Storage directory file:/hadoop_temp/dfs/name has been successfully formatted.
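In case it matters, my full core-site.xml looks more or less like this (the fs.default.name value shown is my best reconstruction of the usual pseudo-distributed setting, hdfs://localhost:8020, matching the port in the logs; that part hasn't changed):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Pre-existing pseudo-distributed setting; port matches the 8020 in the logs. -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <!-- The property I added, exactly as described above. -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:///hadoop_temp</value>
  </property>
</configuration>
```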
However, when I looked at /hadoop_temp, I noticed that the folder was empty. And then when I restarted Hadoop and checked my JobTracker log, I saw this:

2012-10-05 08:02:41,988 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).
...
2012-10-05 08:02:51,010 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 9 time(s).
2012-10-05 08:02:51,011 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
java.net.ConnectException: Call to localhost/127.0.0.1:8020 failed on connection exception: java.net.ConnectException: Connection refused

And when I checked my namenode log, I saw this:

2012-10-05 08:00:31,206 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /opt/hadoop/hadoop-0.20.2/file:/hadoop_temp/dfs/name does not exist.
2012-10-05 08:00:31,212 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /opt/hadoop/hadoop-0.20.2/file:/hadoop_temp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

So, clearly I didn't configure something right. Hadoop still expects to see its files in the /tmp folder even though I set hadoop.tmp.dir to /hadoop_temp in core-site.xml. What did I do wrong? What's the accepted "right" value for hadoop.tmp.dir?

Bonus question: what should I use for hbase.tmp.dir?

System info: Ubuntu 12.04, Apache Hadoop 0.20.2, Apache HBase 0.92.1

Thanks for taking a look!

--Jeremy