From: Abdul Navaz <navaz.enc@gmail.com>
Date: Mon, 29 Sep 2014 13:53:57 -0500
Subject: Re: No space when running a hadoop job
To: user@hadoop.apache.org

Dear All,

I am not doing any load balancing here. I am just copying a file, and it is throwing a "No space left on device" error.

hduser@dn1:~$ df -h
Filesystem                                       Size  Used Avail Use% Mounted on
/dev/xvda2                                       5.9G  5.1G  533M  91% /
udev                                              98M  4.0K   98M   1% /dev
tmpfs                                             48M  196K   48M   1% /run
none                                             5.0M     0  5.0M   0% /run/lock
none                                             120M     0  120M   0% /run/shm
172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  116G   67G  64% /groups/ch-geni-net/Hadoop-NET
172.17.253.254:/q/proj/ch-geni-net               198G  116G   67G  64% /proj/ch-geni-net
/dev/xvda4                                       7.9G  147M  7.4G   2% /mnt

hduser@dn1:~$ cp data2.txt data3.txt
cp: writing `data3.txt': No space left on device
cp: failed to extend `data3.txt': No space left on device
hduser@dn1:~$


I guess by default it is copying to the default location. Why am I getting this error? How can I fix it?
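One note on the failure above: the cp writes data3.txt into the local home
directory, which sits on the 91%-full root filesystem (/dev/xvda2, only 533M
free), not into HDFS and not on the roomier /mnt. A quick check, with
illustrative commands that are not from the original message:

# Which filesystem backs the working directory, and how big is the source file?
df -h .              # the home dir resolves to /dev/xvda2 (533M free above)
ls -lh data2.txt     # cp must fail once data3.txt outgrows that free space
# Copying onto the larger mount instead would succeed (assuming write access):
cp data2.txt /mnt/data3.txt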


Thanks & Regards,

Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388


From: Aitor Cedres <acedres@pivotal.io>
Reply-To: <user@hadoop.apache.org>
Date: Monday, September 29, 2014 at 7:53 AM
To: <user@hadoop.apache.org>
Subject: Re: No space when running a hadoop job


I think the way it works is that when HDFS has a list in dfs.datanode.data.dir, it's basically round robin between disks. And yes, it may not be perfectly balanced because of different file sizes.
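(For reference: round robin is the default choice. Hadoop 2.1.0+ also ships
an available-space policy that prefers the emptier disk; a hedged
hdfs-site.xml sketch, in case it helps:)

<!-- Sketch only: switches the DataNode from the default round-robin policy
     to the available-space volume-choosing policy (Hadoop 2.1.0+, HDFS-1804). -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>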


On 29 September 2014 13:15, Susheel Kumar Gadalay <skgadalay@gmail.com> wrote:
Thanks, Aitor.

That is my observation too.

I added a new disk location and manually moved some files.

But if two locations are given at the beginning itself for
dfs.datanode.data.dir, will Hadoop balance the disk usage, even if not
perfectly, since file sizes may differ?
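For concreteness, "two locations given at the beginning" would be an
hdfs-site.xml entry like the sketch below (the /data1 and /data2 mount
points are illustrative, not from the thread):

<property>
  <name>dfs.datanode.data.dir</name>
  <!-- Illustrative paths; each entry should sit on its own disk. -->
  <value>/data1/dfs/data,/data2/dfs/data</value>
</property>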

On 9/29/14, Aitor Cedres <acedres@pivotal.io> wrote:
> Hi Susheel,
>
> Adding a new directory to “dfs.datanode.data.dir” will not balance your
> disks straight away. Eventually, through HDFS activity (deleting/invalidating
> some blocks, writing new ones), the disks will become balanced. If you want
> to balance them right after adding the new disk and changing the
> “dfs.datanode.data.dir”
> value, you have to shut down the DN and manually move (mv) some files in the
> old directory to the new one.
>
> The balancer will try to balance the usage between HDFS nodes, but it won't
> care about "internal" node disk utilization. For your particular case, the
> balancer won't fix your issue.
>
> Hope it helps,
> Aitor
>
> On 29 September 2014 05:53, Susheel Kumar Gadalay <skgadalay@gmail.com>
> wrote:
>
>> You mean if multiple directory locations are given, Hadoop will
>> balance the distribution of files across these different directories.
>>
>> But normally we start with 1 directory location and once it is
>> reaching the maximum, we add a new directory.
>>
>> In this case how can we balance the distribution of files?
>>
>> One way is to list the files and move them.
>>
>> Will running the start-balancer script work?
>>
>> On 9/27/14, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>> > It can read/write in parallel to all drives. More HDDs, more I/O speed.
>> >  On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <skgadalay@gmail.com>
>> > wrote:
>> >
>> >> Correct me if I am wrong.
>> >>
>> >> Adding multiple directories will not balance the file distribution
>> >> across these locations.
>> >>
>> >> Hadoop will exhaust the first directory and then start using the
>> >> next, next ..
>> >>
>> >> How can I tell Hadoop to evenly balance across these directories?
>> >>
>> >> On 9/26/14, Matt Narrell <matt.narrell@gmail.com> wrote:
>> >> > You can add a comma separated list of paths to the
>> >> > “dfs.datanode.data.dir”
>> >> > property in your hdfs-site.xml
>> >> >
>> >> > mn
>> >> >
>> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <navaz.enc@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Hi
>> >> >>
>> >> >> I am facing a space issue when I save a file into HDFS and/or
>> >> >> running
>> >> >> map reduce job.
>> >> >>
>> >> >> root@nn:~# df -h
>> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>> >> >> udev                                              98M  4.0K   98M   1% /dev
>> >> >> tmpfs                                             48M  192K   48M   1% /run
>> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>> >> >> none                                             120M     0  120M   0% /run/shm
>> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>> >> >> root@nn:~#
>> >> >>
>> >> >>
>> >> >> I can see there is no space left on /dev/xvda2.
>> >> >>
>> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I
>> >> >> need to move the file manually from /dev/xvda2 to xvda4?
>> >> >>
>> >> >>
>> >> >>
>> >> >> Thanks & Regards,
>> >> >>
>> >> >> Abdul Navaz
>> >> >> Research Assistant
>> >> >> University of Houston Main Campus, Houston TX
>> >> >> Ph: 281-685-0388
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>>
>
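A rough sketch of the manual move Aitor describes above (stop the DN, mv some
block directories, restart). All paths and the block-pool directory name are
illustrative, and it assumes a Hadoop 2.x DataNode layout with both
directories already listed in dfs.datanode.data.dir and initialized once:

# Stop the DataNode so no blocks are written while files move.
hadoop-daemon.sh stop datanode

# Keep the same relative layout under the new dir; the DN rescans both
# directories on restart. Your real block-pool directory name will differ.
BP=BP-1234567890-10.0.0.1-1400000000000
mkdir -p /data2/dfs/data/current/$BP/current/finalized
mv /data1/dfs/data/current/$BP/current/finalized/subdir0 \
   /data2/dfs/data/current/$BP/current/finalized/

hadoop-daemon.sh start datanode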
