From: Tao Xiao
To: user@hadoop.apache.org
Date: Tue, 17 Dec 2013 10:15:04 +0800
Subject: Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

Yes, hadoop.tmp.dir is both local and HDFS.


2013/12/17 Raviteja Chirala <rtejac@gmail.com>
If I am not wrong, hadoop.tmp.dir is both local and HDFS. Whatever mount directory you use, create the same path in HDFS.



On Mon, Dec 16, 2013 at 5:05 PM, Tao Xiao <xiaotao.cs.nju@gmail.com> wrote:

Thanks very much, I suppose I know what I should do with


On Mon, Dec 16, 2013 at 5:27 PM, Vinayakumar B <vinayakumar.b@huawei.com> wrote:

Hi,

 

hadoop.tmp.dir is not the configuration you are looking for to spread disk I/O.

 

This is the default base directory (a single directory, not multiple) that is used when you have not configured your own directories for processes such as the NameNode, DataNode and NodeManager.
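
To illustrate the point above, here is a minimal core-site.xml sketch. The path /mnt/disk1/hadoop_tmp is a hypothetical mount point, not something taken from this thread; the detail that matters is that the value is one directory, not a list.

```xml
<!-- core-site.xml: hadoop.tmp.dir takes a single base directory. -->
<!-- /mnt/disk1/hadoop_tmp is a hypothetical path used for illustration. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/disk1/hadoop_tmp</value>
  </property>
</configuration>
```

In Hadoop 2.x the shipped defaults derive other directories from this value (for example, dfs.datanode.data.dir defaults to file://${hadoop.tmp.dir}/dfs/data), which is why everything ends up on one disk unless those properties are set explicitly.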

 

The configurations that accept comma-separated values are as follows:

1. dfs.namenode.name.dir for the NameNode, in hdfs-site.xml

2. dfs.datanode.data.dir for the DataNode, in hdfs-site.xml

3. yarn.nodemanager.local-dirs for the NodeManager, in yarn-site.xml
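
As a rough sketch of items 1 and 2, an hdfs-site.xml for a node with three disks might look like the following. All /mnt/diskN mount points and subdirectory names are hypothetical. Note that the two lists behave differently: the DataNode spreads its blocks across the listed directories, whereas the NameNode writes a full copy of its metadata to every directory listed, so that list gives redundancy rather than extra throughput.

```xml
<!-- hdfs-site.xml (Hadoop 2.x property names). All paths are hypothetical. -->
<configuration>
  <property>
    <!-- NameNode metadata: each listed directory receives a full copy. -->
    <name>dfs.namenode.name.dir</name>
    <value>/mnt/disk1/dfs/name,/mnt/disk2/dfs/name</value>
  </property>
  <property>
    <!-- DataNode block storage: one directory per physical disk. -->
    <name>dfs.datanode.data.dir</name>
    <value>/mnt/disk1/dfs/data,/mnt/disk2/dfs/data,/mnt/disk3/dfs/data</value>
  </property>
</configuration>
```

The value must stay on one line with no spaces around the commas; extend the list with one entry per additional disk.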

 

Please note that all of the above configurations are for Hadoop 2.x.

 

Configure different subdirectories if you are using the same disk for multiple processes.

Ex: /hadoop/data1/dfs/data

and

/hadoop/data1/yarn/nm-local-dir
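
Following the subdirectory layout above, a yarn-site.xml sketch could reuse the same disks under a sibling yarn/nm-local-dir directory. /hadoop/data1 comes from the example above; /hadoop/data2 and /hadoop/data3 are assumed additional mount points.

```xml
<!-- yarn-site.xml: NodeManager local dirs, one per disk. -->
<!-- /hadoop/data2 and /hadoop/data3 are assumed mount points. -->
<configuration>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/hadoop/data1/yarn/nm-local-dir,/hadoop/data2/yarn/nm-local-dir,/hadoop/data3/yarn/nm-local-dir</value>
  </property>
</configuration>
```

Keeping dfs/data and yarn/nm-local-dir as separate subdirectories on each disk lets the DataNode and NodeManager share the disks without writing into each other's directories.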

 

 

Cheers,

Vinayakumar B

From: Tao Xiao [mailto:xiaotao.cs.nju@gmail.com]
Sent: 16 December 2013 14:42
To: user@hadoop.apache.org
Subject: Re: How to set "hadoop.tmp.dir" if I have multiple disks per node?

 

Thanks.

In order to spread I/O among multiple disks, should I assign a comma-separated list of directories which are located on different disks to "hadoop.tmp.dir"?

For example:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/disk1/hadoop_tmp_dir,/mnt/disk2/hadoop_tmp_dir,/mnt/disk3/hadoop_tmp_dir</value>
</property>

2013/12/16 Shekhar Sharma <shekhar2581@gmail.com>

hadoop.tmp.dir is a directory created on the local file system.
For example, suppose you have set the hadoop.tmp.dir property to /home/training/hadoop.

This directory will be created when you format the namenode by running
the command
hadoop namenode -format

When you open this folder, you will see two subfolders, dfs and mapred.

The /home/training/hadoop/mapred folder will also exist on HDFS.

Hope this clears things up.
Regards,
Som Shekhar Sharma
+91-8197243810



On Mon, Dec 16, 2013 at 1:42 PM, Dieter De Witte <drdwitte@gmail.com> wrote:
> Hi,
>
> Make sure to also set mapred.local.dir to the same set of output
> directories; this is where the intermediate key-value pairs are stored!
>
> Regards, Dieter
>
>
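
For reference, a mapred-site.xml sketch of the setting Dieter mentions, assuming a Hadoop 1.x (MRv1) cluster and hypothetical /mnt/diskN mount points. On Hadoop 2.x the name mapred.local.dir is deprecated in favour of mapreduce.cluster.local.dir.

```xml
<!-- mapred-site.xml (Hadoop 1.x / MRv1). All paths are hypothetical. -->
<configuration>
  <property>
    <!-- Local scratch space for intermediate map output, one dir per disk. -->
    <name>mapred.local.dir</name>
    <value>/mnt/disk1/mapred/local,/mnt/disk2/mapred/local,/mnt/disk3/mapred/local</value>
  </property>
</configuration>
```
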
> 2013/12/16 Tao Xiao <xiaotao.cs.nju@gmail.com>
>>
>> I have ten disks per node, and I don't know what value I should set for
>> "hadoop.tmp.dir". Some said this property refers to a location on the local disk
>> while others said it refers to a directory in HDFS. I'm confused; who
>> can explain it?
>>
>> I want to spread I/O since I have ten disks per node, so should I set a
>> comma-separated list of directories (which are on different disks) for
>> "hadoop.tmp.dir"?
>
>

 



