From: Aitor Perez Cedres <aperez@pragsis.com>
Date: Tue, 13 May 2014 18:04:17 +0200
To: user@hadoop.apache.org
Subject: Re: Data node with multiple disks

If you specify a list of directories in the property "dfs.datanode.data.dir", Hadoop will distribute the data blocks among all those disks; it will not replicate data between them. If you want to use the disks as a single one, you have to build an LVM array (or use some other solution) to present them to the OS as a single volume.
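
For reference, a minimal hdfs-site.xml entry along those lines might look like this (the /volN paths are just the ones from Marcos's mail; adjust them to your actual mount points):

    <!-- hdfs-site.xml: each comma-separated path becomes an independent
         storage volume; the DataNode spreads new blocks across them
         (round-robin by default), it does NOT mirror blocks between
         volumes on the same node. -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/vol1/hadoop/data,/vol2/hadoop/data,/vol3/hadoop/data</value>
    </property>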

However, benchmarks show that specifying a list of disks and letting Hadoop distribute data among them generally gives better performance.
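
If the disks fill unevenly over time, there is also a volume choosing policy you can switch (assuming Hadoop 2.1 or later; the default is round-robin):

    <!-- Optional: pick volumes by available space instead of pure
         round-robin, useful when disks fill unevenly. -->
    <property>
      <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
      <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
    </property>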
On 13/05/14 17:12, Marcos Sousa wrote:
Yes,

I don't want to replicate, just use them as one disk. Isn't it possible to make this work?

Best regards,

Marcos


On Tue, May 13, 2014 at 6:55 AM, Rahul Chaudhari <rahulchaudhari0405@gmail.com> wrote:
Marcos,
    While configuring Hadoop, the "dfs.datanode.data.dir" property in hdfs-default.xml should have this list of disks specified on separate lines. If you specify a comma-separated list, it will replicate on all those disks/partitions.

_Rahul
Sent from my iPad

> On 13-May-2014, at 12:22 am, Marcos Sousa <falecomigo@marcossousa.com> wrote:
>
> Hi,
>
> I have 20 servers, each with 10 400GB SATA HDs. I'd like to use them as my datanodes:
>
> /vol1/hadoop/data
> /vol2/hadoop/data
> /vol3/hadoop/data
> /volN/hadoop/data
>
> How do I use those distinct disks without replicating?
>
> Best regards,
>
> --
> Marcos Sousa



--
Marcos Sousa
www.marcossousa.com Enjoy it!

--
Aitor Pérez
Big Data System Engineer

Telf.: +34 917 680 490
Fax: +34 913 833 301
C/Manuel Tovar, 49-53 - 28034 Madrid - Spain

http://www.bidoop.es
