Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of xiaotao.cs.nju@gmail.com
 designates 209.85.216.66 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <1387245564923.cca79b6@Nodemailer>
References: 
 <CACUUc6C7sJcxg54frA=tDgYNLYXaiACiHXi41DWaAEKOd8fJ_g@mail.gmail.com>
	<1387245564923.cca79b6@Nodemailer>
Date: Tue, 17 Dec 2013 10:15:04 +0800
Message-ID: 
 <CACUUc6CsjFUpVLM0Evr74cDWDbR6DvZNTSXPMVNVBZ5g1tpyhQ@mail.gmail.com>
Subject: Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?
From: Tao Xiao <xiaotao.cs.nju@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=047d7b6d9f62179f4504edb17f83

--047d7b6d9f62179f4504edb17f83
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: quoted-printable

Yes, hadoop.tmp.dir is used both on the local file system and in HDFS.


2013/12/17 Raviteja Chirala <rtejac@gmail.com>

> If I am not wrong, hadoop.tmp.dir is used both locally and in HDFS.
> Whatever directory you mount locally, create the same path in HDFS.
>
>
> On Mon, Dec 16, 2013 at 5:05 PM, Tao Xiao <xiaotao.cs.nju@gmail.com>
> wrote:
>
>> Thanks very much, I suppose I know what I should do with
>>
>>
>> On Mon, Dec 16, 2013 at 5:27 PM, Vinayakumar B
>> <vinayakumar.b@huawei.com> wrote:
>>
>>>  Hi,
>>>
>>>
>>>
>>> *hadoop.tmp.dir* is not the property you are looking for if you want
>>> to spread disk I/O across multiple disks.
>>>
>>>
>>>
>>> It is the default base directory (a single directory, not a list),
>>> used when you have not configured dedicated directories for processes
>>> such as the NameNode, DataNode and NodeManager.
>>>
>>>
>>>
>>> The properties that accept comma-separated lists of directories are
>>> as follows.
>>>
>>> 1. *dfs.namenode.name.dir* for the NameNode in *hdfs-site.xml*
>>>
>>> 2. *dfs.datanode.data.dir* for the DataNode in *hdfs-site.xml*
>>>
>>> 3. *yarn.nodemanager.local-dirs* for the NodeManager in *yarn-site.xml*
>>>
>>>
>>>
>>> Please note that all of the above properties are for Hadoop 2.x.
>>>
>>>
>>>
>>> If the same disk is used by more than one process, give each process
>>> its own subdirectory on that disk, for example:
>>>
>>>     /hadoop/data1/dfs/data
>>>     /hadoop/data1/yarn/nm-local-dir
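>>>
>>> As a minimal sketch of the properties listed above, assuming three
>>> data disks mounted at /hadoop/data1, /hadoop/data2 and /hadoop/data3
>>> (the data2 and data3 mount points are hypothetical, chosen only for
>>> illustration), the property fragments might look like this:
>>>
>>> ```xml
>>> <!-- In hdfs-site.xml: one DataNode directory per disk (assumed paths) -->
>>> <property>
>>>   <name>dfs.datanode.data.dir</name>
>>>   <value>/hadoop/data1/dfs/data,/hadoop/data2/dfs/data,/hadoop/data3/dfs/data</value>
>>> </property>
>>>
>>> <!-- In yarn-site.xml: one NodeManager local dir per disk (assumed paths) -->
>>> <property>
>>>   <name>yarn.nodemanager.local-dirs</name>
>>>   <value>/hadoop/data1/yarn/nm-local-dir,/hadoop/data2/yarn/nm-local-dir,/hadoop/data3/yarn/nm-local-dir</value>
>>> </property>
>>> ```
>>>
>>> Each fragment goes inside the <configuration> element of the file
>>> named in its comment. dfs.namenode.name.dir is left out of the sketch
>>> because a list there is used for redundancy (the name table is
>>> replicated in every listed directory) rather than for spreading I/O.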
>>>
>>>
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Vinayakumar B
>>>
>>> *From:* Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
>>> *Sent:* 16 December 2013 14:42
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: How to set "hadoop.tmp.dir" if I have multiple disks per
>>> node?
>>>
>>>
>>>
>>> Thanks.
>>>
>>> In order to spread I/O among multiple disks, should I assign a
>>> comma-separated list of directories located on different disks to
>>> "hadoop.tmp.dir"?
>>>
>>> for example,
>>>
>>>  <property>
>>>    <name>hadoop.tmp.dir</name>
>>>    <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
>>>  </property>
>>>
>>>
>>>
>>> 2013/12/16 Shekhar Sharma <shekhar2581@gmail.com>
>>>
>>> hadoop.tmp.dir is a directory created on the local file system.
>>> For example, suppose you have set the hadoop.tmp.dir property to
>>> /home/training/hadoop.
>>>
>>> This directory will be created when you format the NameNode by
>>> running the command
>>> hadoop namenode -format
>>>
>>> When you open this folder, you will see two subfolders, dfs and
>>> mapred.
>>>
>>> The /home/training/hadoop/mapred folder will also exist on HDFS.
>>>
>>> Hope this clears it up.
>>> Regards,
>>> Som Shekhar Sharma
>>> +91-8197243810
>>>
>>>
>>>
>>> On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <drdwitte@gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > Make sure to also set mapred.local.dir to the same set of output
>>> > directories; this is where the intermediate key-value pairs are
>>> > stored!
>>> >
>>> > Regards, Dieter
>>> >
>>> >
>>> > 2013/12/16 Tao Xiao <xiaotao.cs.nju@gmail.com>
>>> >>
>>> >> I have ten disks per node, and I don't know what value I should
>>> >> set for "hadoop.tmp.dir". Some said this property refers to a
>>> >> location on the local disk, while others said it refers to a
>>> >> directory in HDFS. I'm confused; who can explain it?
>>> >>
>>> >> I want to spread I/O since I have ten disks per node, so should I
>>> >> set "hadoop.tmp.dir" to a comma-separated list of directories
>>> >> (which are on different disks)?
>>> >
>>> >
>>>
>>>
>>>
>>
>>
>

--047d7b6d9f62179f4504edb17f83--